Developing a Text Information Retrieval System Phase II

$30-250 USD

In Progress
Posted almost 9 years ago

Paid on delivery
Information retrieval is the process of extracting useful information from data. In the current era, text constitutes an important form of data. This includes web pages, emails, SMS messages and several other types of text documents. Text documents need to be represented in an appropriate format (usually as vectors of numbers) in order to be used for further processing. Once properly represented, text documents can be used for various tasks such as classification (for instance, deciding whether an email is spam) or search (for example, deciding whether two web pages have similar content).

Before documents are represented as numbers, however, they must be preprocessed. Text preprocessing is the task of removing unnecessary information from the text. This is achieved through several steps, which are summarized hereafter (see also Figure 1).

1. Initial preprocessing: The goal of this step is to "clean up" the document and prepare it for the remaining tasks. The tasks conducted in this step are:
   (a) Replace tabs, carriage returns and newlines with a space.
   (b) Remove all non-letter characters: turn punctuation, numbers, etc. into spaces.
   (c) Switch all letters to lowercase.
   (d) Replace multiple consecutive spaces with a single space.
   (e) Remove words that are shorter than 3 characters. For example, remove "an" but keep "him".
2. Stop words removal: Some words such as "a", "the" and "and" are very common in English and should be removed from the text so that only useful words remain. This is done simply by removing any word that appears in a predefined list of stop words.
3. Stemming: The same word can take different forms depending on its role and position in the sentence. For instance, the words "wanted", "wanting" and "wants" are all variations of the word "want". The word "want" is called the stem, and the [...]

The second step in this project is to represent documents as vectors and improve the performance of stop words removal and stemming.
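The steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the project's required implementation: the stop-word list `STOP_WORDS`, the suffix list `SUFFIXES`, the function names, and the choice of a raw term-count vector are all hypothetical, since the posting specifies none of them. The suffix-stripping stemmer in particular is a naive stand-in for an established stemming algorithm.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (the posting does not give one).
STOP_WORDS = {"the", "and", "for", "are", "that", "this", "with"}

# Hypothetical suffixes for a naive stemmer; checked in this order.
SUFFIXES = ("ing", "ed", "s")


def initial_preprocess(text):
    """Steps 1(a)-(e): clean the raw text and return a list of words."""
    text = re.sub(r"[\t\r\n]", " ", text)    # (a) tabs, returns, newlines -> space
    text = re.sub(r"[^A-Za-z ]", " ", text)  # (b) non-letter characters -> space
    text = text.lower()                      # (c) lowercase
    text = re.sub(r" +", " ", text).strip()  # (d) collapse runs of spaces
    # (e) drop words shorter than 3 characters
    return [w for w in text.split(" ") if len(w) >= 3]


def remove_stop_words(words, stop_words=STOP_WORDS):
    """Step 2: drop every word found in the stop-word list."""
    return [w for w in words if w not in stop_words]


def naive_stem(word):
    """Step 3 (assumed approach): strip one known suffix, keeping at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Run the three steps in order and return the list of stems."""
    words = remove_stop_words(initial_preprocess(text))
    return [naive_stem(w) for w in words]


def to_vector(stems, vocabulary):
    """Represent a document as term counts over a fixed, ordered vocabulary."""
    counts = Counter(stems)
    return [counts[term] for term in vocabulary]


if __name__ == "__main__":
    stems = preprocess("He wanted\tthe 2 cats, and wants an apple!")
    vocabulary = sorted(set(stems))
    print(stems)                         # ['want', 'cat', 'want', 'apple']
    print(vocabulary)                    # ['apple', 'cat', 'want']
    print(to_vector(stems, vocabulary))  # [1, 1, 2]
```

A set is used for the stop words so that each membership check takes constant time on average, which is one simple way to keep stop-word removal fast on large documents.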
Project ID: 7545369

About the project

12 proposals
Remote project
Active 9 years ago

Awarded to:
Hi, I am an expert Java programmer. I have been working with Java for the last 2 years. I have a thorough understanding of Java utilities, collections, Swing, JDBC, RPC, etc. I also have a good understanding of text mining and have worked on many text processing projects before. I will do this job for [login to view URL] you give me an opportunity to do this job for you, you will find me within time and budget. Looking forward to your response. Thanks
$144 USD in 4 days
4.9 (4 reviews)
2.4
12 freelancers are bidding on average $185 USD for this job
Hello, I am a Java expert and am interested in this project. I have reviewed the details and am confident I can handle this project perfectly. I also have a lot of experience in helping students with assignments. Please get in touch to discuss further. Regards, Anshu
$150 USD in 2 days
4.7 (297 reviews)
7.2
I have 10+ years of experience and more than 600 projects completed on this platform. I have experience in Natural Language Processing. I can handle both phases of the project. For Phase 1, I need 10 days (750$) and for Phase 2 I need 5 days (350$). I have recently completed another NLP project and I can send a demo to you. Please get back to me and we can discuss further. I am very interested in working on this project. Ready to start ASAP.
$155 USD in 3 days
4.9 (369 reviews)
7.1
A proposal has not yet been provided
$222 USD in 3 days
5.0 (6 reviews)
3.2
I have 7+ years of experience in Java development, with hands-on experience in Collections, data structures, OOP, Generics, etc. I have excellent debugging skills and write code that follows coding standards and design patterns.
$166 USD in 5 days
4.9 (7 reviews)
3.0
Hi, I am a Software Engineer. I have skills and experience in Java, C/C++, C# and other programming languages. I have done recent projects in Java. I can do this work. I deliver my work on time. I will wait for your reply. Regards, Naveed Ahmed
$166 USD in 3 days
5.0 (3 reviews)
2.3
I have experience with IRE (Information Retrieval System). I have done two similar projects in the past: 1. Wikipedia search engine 2. Tweets classification in Java.
$166 USD in 3 days
0.0 (0 reviews)
0.0
Please find below a short summary of my experience.
* Several years of experience developing Data Mining, Text Mining, Information Retrieval and Extraction, NLP, Machine Learning, Knowledge Discovery and Analytics for web crawling, scraping, extraction and aggregation from unstructured big data such as web pages and text corpora.
* Have worked extensively on Data Mining/Machine Learning techniques for automatic processing, classification, prediction, clustering, categorization and text analysis (pre-processing, stemming, stop-word filtering, tf-idf), and on assembling and populating the results into databases, datastores and search indexes (Lucene, Solr) for analysis, search, reporting and dashboards.
* Have independently completed previous projects developing Information Extraction, Web Crawling, Scraping, Data Mining, Analytics, Reporting, Dashboard and Statistical Tools.
* Extensive experience using Perl, PHP, C, Java and .NET with MySQL, Oracle and MS SQL Server.
* Data Mining / Machine Learning / Information Extraction tools: Weka, R, Excel, Perl CPAN packages for extraction.
Estimated budget: 280 USD (6-8 days). Price, milestones and timelines are flexible and negotiable based on exact project specifications and details.
$280 USD in 6 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$155 USD in 3 days
0.0 (0 reviews)
0.0

About the client

Saudi Arabia
0.0
0
Member since Apr 24, 2015
