Data Extractor / Web Data Miner

We are building next-generation intelligent web search applications. We are looking for a brilliant software engineer with strong expertise in text mining, information extraction, information retrieval, and natural language processing to help us build a product database that will allow us to dominate the "long tail" of products.

- Responsible for finding practical and innovative ways to extract product information from a variety of unstructured sources.
- Hands-on implementation of data extraction.
- Develop tools and processes for web data mining: from web and off-line sources, identify topics, classify/cluster, extract named entities.
- Find little-known, "off the beaten path" products on little-known, "out of the way" websites.
- Work with our team to deliver the best possible experience for our users.
- Experience in information extraction and integration: harvesting and extracting information from structured and unstructured data.
- Familiarity with statistical methods for data analysis, such as PMI, HMM, and Naive Bayes.
- Experience with machine learning algorithms related to search and personalization, large-scale web clustering, classification, and summarization.
- Experience in large-scale crawling and deep web crawling; knowledge of tools such as Nutch and Hadoop.
- 5+ years of experience in Java development and strong programming skills.
- Experience with large-scale recommendation systems, content-based recommendation, and collaborative filtering.
- Expert knowledge of relational databases and of performance tuning for large-scale databases or large-scale file stores; experience with MySQL is preferred.
- Experience in test-driven and agile software development.
- Comfortable with fast-paced development in a small startup environment.
- Natural language processing skills and experience are a big plus.
Location: Beijing
Salary: negotiable
If interested, please send your resume to carol.ha@usense.com.cn as soon as possible.
Thank you!