Data Mining Project

  • Status Closed
  • Budget $10 - $30 USD
  • Total Bids 7

Project Description

There are two sets of Wikipedia articles. The first set consists of Wikipedia featured articles of a certain type and becomes the class Featured. The second set consists of non-featured Wikipedia articles of a similar type to the featured articles and becomes the class Non-Featured. This is a binary classification problem.

To create attributes, extract all tokens from the entire dataset after stemming and stop-word removal. Create 1-grams, 2-grams, and 3-grams from these tokens. Use these n-grams as the attributes for the ARFF files.
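The attribute-creation step can be sketched roughly as follows. This is only an illustration under stated assumptions: the stop-word list is a tiny placeholder, `naive_stem` is a crude stand-in for a real stemmer such as Porter's, and the names `preprocess`, `ngrams`, `to_arff`, and the `article_class` attribute are hypothetical, not part of the brief. Each n-gram becomes a binary presence attribute.

```python
import re

# Placeholder stop-word list (assumption); a real run would use a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "are", "in", "to", "from"}

def naive_stem(token):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Lowercase, keep alphabetic words, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]

def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def quote(value):
    """Single-quote a name or value so spaces and hyphens are safe in ARFF."""
    return "'" + value + "'"

def to_arff(relation, docs, n):
    """Build ARFF text from docs, a list of (token_list, class_label) pairs."""
    vocab = sorted({g for toks, _ in docs for g in ngrams(toks, n)})
    lines = ["@relation " + relation, ""]
    for gram in vocab:
        lines.append("@attribute " + quote(gram) + " {0,1}")
    # The underscore keeps the class attribute from colliding with any n-gram.
    lines.append("@attribute article_class {'Featured','Non-Featured'}")
    lines += ["", "@data"]
    for toks, label in docs:
        present = set(ngrams(toks, n))
        row = ["1" if gram in present else "0" for gram in vocab]
        lines.append(",".join(row + [quote(label)]))
    return "\n".join(lines) + "\n"

tokens = preprocess("The featured articles are reviewed by editors")
arff_text = to_arff("wiki_2gram", [(tokens, "Featured")], 2)
```

Calling `to_arff` once per value of n (1, 2, 3) would give one ARFF file per n-gram size, which matches the per-size attribute selection described next.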

Perform attribute selection on each of the 1-gram, 2-gram, and 3-gram attribute sets using information gain and gain ratio. Perform classification using a decision tree and naïve Bayes.
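To make the two selection measures concrete, here is a minimal sketch of how information gain and gain ratio could be computed for one binary n-gram attribute. It uses the textbook definitions and is not WEKA's implementation; the function names and the toy data are assumptions for illustration.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())

def info_gain_and_ratio(values, labels):
    """Return (information gain, gain ratio) of an attribute w.r.t. the class."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Class entropy remaining after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    # Gain ratio normalizes by the entropy of the attribute itself.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio

# Toy data (assumption): four articles, two per class.
labels = ["Featured", "Featured", "Non-Featured", "Non-Featured"]
useful = [1, 1, 0, 0]   # n-gram present only in Featured articles
noise = [1, 0, 1, 0]    # n-gram unrelated to the class

useful_gain, useful_ratio = info_gain_and_ratio(useful, labels)
noise_gain, noise_ratio = info_gain_and_ratio(noise, labels)
```

An attribute that separates the classes perfectly scores the maximum on both measures, while one that is independent of the class scores zero, so ranking n-grams by either score and keeping the top ones is the selection step.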

Write a wiki report on your findings, including the statistical evaluation measures that WEKA reports for each classifier.
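As a reminder of what the common evaluation measures mean, the sketch below computes accuracy, precision, recall, and F-measure for the Featured class from actual and predicted labels. It uses the standard definitions; the function name and the sample predictions are made up for illustration and are not results from any classifier.

```python
def binary_metrics(actual, predicted, positive="Featured"):
    """Compute standard measures, treating `positive` as the target class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }

# Made-up labels purely for illustration.
actual = ["Featured"] * 3 + ["Non-Featured"] * 3
predicted = ["Featured", "Featured", "Non-Featured",
             "Non-Featured", "Non-Featured", "Featured"]
metrics = binary_metrics(actual, predicted)
```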
