Developing a Text Information Retrieval System "project for college"

Completed Posted Mar 25, 2015 Paid on delivery

Introduction

Information retrieval is the process of extracting useful information from data. In the
current era, text constitutes an important form of data. This includes web pages, emails,
SMS messages, and several other types of text documents.

Text documents need to be represented in an appropriate format (usually as vectors
of numbers) before they can be processed further. Once properly represented, text
documents can be used for various tasks such as classification (for instance, deciding
whether an email is spam) or search (for example, deciding whether two web pages have
similar content).
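To make the idea of "vectors of numbers" concrete, here is a minimal Java sketch. It assumes a simple word-count representation over a shared vocabulary and uses cosine similarity to compare two documents. Both choices are illustrative assumptions, not something this description specifies, and the class and method names (TermVectors, vocabulary, countVector, cosine) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class TermVectors {

    /** Sorted list of the distinct words across all documents (the shared vocabulary). */
    static List<String> vocabulary(List<List<String>> docs) {
        TreeSet<String> vocab = new TreeSet<>();
        for (List<String> doc : docs) {
            vocab.addAll(doc);
        }
        return new ArrayList<>(vocab);
    }

    /** Counts how often each vocabulary word occurs in the document. */
    static int[] countVector(List<String> doc, List<String> vocab) {
        int[] vector = new int[vocab.size()];
        for (String word : doc) {
            int index = vocab.indexOf(word);
            if (index >= 0) {
                vector[index]++;
            }
        }
        return vector;
    }

    /** Cosine similarity of two equal-length vectors; 0.0 if either is all zeros. */
    static double cosine(int[] a, int[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Assumed input: documents already split into lowercase words.
        List<String> doc1 = Arrays.asList("cheap", "pills", "cheap");
        List<String> doc2 = Arrays.asList("cheap", "flights");
        List<String> vocab = vocabulary(Arrays.asList(doc1, doc2));

        int[] v1 = countVector(doc1, vocab);
        int[] v2 = countVector(doc2, vocab);
        System.out.println(vocab);               // [cheap, flights, pills]
        System.out.println(Arrays.toString(v1)); // [2, 0, 1]
        System.out.println(Arrays.toString(v2)); // [1, 1, 0]
        System.out.println(cosine(v1, v2));
    }
}
```

A similarity close to 1 would suggest the two documents use similar words, while 0 means they share none. Other representations (for example weighted counts) are equally plausible for this project.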

Before documents are represented as numbers, however, they must be preprocessed. Text
preprocessing is the task of removing unnecessary information from the text. This is
achieved through several steps, which are summarized below:

1. Initial preprocessing: The goal of this step is to "clean up" the document and
prepare it for the remaining tasks. The different tasks conducted in this step are:

(a) Replace tab, carriage-return, and newline characters with a space.
(b) Remove all non-letter characters: turn punctuation, numbers, etc. into spaces.
(c) Switch all letters to lowercase.
(d) Replace multiple consecutive spaces with a single space.
(e) Remove words that are shorter than 3 characters. For example, remove "an" but keep "him".

2. Stop words removal: Some words such as "a", "the", "and" are very common in
English and should be removed from the text in order to leave only useful words.
This task is simply done by removing any word that appears in a predefined list of
stop words.

3. Stemming: The same word can take different forms depending on its role and
position in the sentence.
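Steps 1 and 2 above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: "letters" is taken to mean ASCII letters, the stop-word list shown is a tiny hypothetical sample rather than the predefined list the project would supply, and the class and method names are made up for this example. Stemming is left out because its description is cut off here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class TextPreprocessor {

    // Hypothetical sample list; the real predefined list would likely be loaded from a file.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "and", "for", "that", "with", "this"));

    /** Steps 1(a) to 1(e): returns the cleaned words of the document, in order. */
    static List<String> initialPreprocess(String text) {
        // (a) and (b): every character that is not an ASCII letter (tabs, carriage
        // returns, newlines, digits, punctuation) becomes a space.
        String cleaned = text.replaceAll("[^A-Za-z]", " ");
        // (c): switch to lowercase.
        cleaned = cleaned.toLowerCase(Locale.ROOT);
        // (d): collapse runs of spaces, then drop leading and trailing spaces.
        cleaned = cleaned.replaceAll(" +", " ").trim();

        List<String> words = new ArrayList<>();
        if (cleaned.isEmpty()) {
            return words;
        }
        for (String word : cleaned.split(" ")) {
            // (e): keep only words with at least 3 characters.
            if (word.length() >= 3) {
                words.add(word);
            }
        }
        return words;
    }

    /** Step 2: drops every word that appears in the given stop-word list. */
    static List<String> removeStopWords(List<String> words, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String word : words) {
            if (!stopWords.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> words = initialPreprocess("The cat,\tand 2 dogs:\nan old friend of him!");
        System.out.println(removeStopWords(words, STOP_WORDS));
        // prints [cat, dogs, old, friend, him]
    }
}
```

Note that "a" from the stop-word examples is already removed by step 1(e), so a stop-word list only matters for words of 3 or more characters.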

Java, Software Architecture, Software Testing

Project ID: #7372961

About the project

6 proposals Remote project Active on Mar 28, 2015

Awarded to:

dobreiiita

Hello, I am a Java expert and am interested in this project. I have reviewed your requirements and am confident I can handle this project perfectly. Please get in touch to discuss further. Regards, Anshu

$54 USD in 1 day
(319 Reviews)
7.2

6 freelancers are bidding an average of $71 for this job

wanly3643vw

I have done similar projects before so I think I could help.

$60 USD in 1 day
(48 Reviews)
4.5
Nawaz091993

Greetings! I passed the Information Retrieval course with a 4.0 and have already completed a similar task as an assignment using the Java Lucene library. If you are allowed to use a Java library, I can complete this task

$55 USD in 3 days
(4 Reviews)
2.6
feezaK

3 years of hands-on experience in Java. Timely delivery and efficiency are guaranteed. Let me do this for you.

$35 USD in 3 days
(0 Reviews)
0.0
dForDevelopment

Hello. This is not really a complicated task. I can do it just because I need to improve my freelancer reputation. So if you are interested in a quick solution, let me know.

$155 USD in 3 days
(0 Reviews)
0.0