to work with a professional Realtor and find your perfect
home.
Best Category: misc.forsale
For the martinLuther text, we get the following output:
Text: Luther taught that salvation and subsequently
eternity in heaven is not earned by good deeds but is
received only as a free gift of God's grace through faith
in Jesus Christ as redeemer from sin and subsequently
eternity in Hell.
Best Category: soc.religion.christian
Both texts were classified correctly.
Sentiment analysis using LingPipe
Sentiment analysis is performed in much the same way as general text classification.
One difference is that it uses only two categories: positive and negative.
We need to use data files to train our model. We will use a simplified version of the
sentiment analysis performed at
http://alias-i.com/lingpipe/demos/tutorial/sentiment/read-me.html using sentiment data
developed for movies
(http://www.cs.cornell.edu/people/pabo/movie-review-data/review_polarity.tar.gz).
This data was developed from 1,000 positive and 1,000 negative reviews of movies found in
IMDb's movie archives.
These reviews need to be downloaded and extracted. A txt_sentoken directory will
be extracted along with its two subdirectories: neg and pos. Both of these subdirectories
contain movie reviews. Although some of these files can be held in reserve to evaluate the
model created, we will use all of them to simplify the explanation.
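As a minimal sketch of how the extracted reviews could be gathered before training, the
following class reads every file under the neg and pos subdirectories and groups the text
by category. It uses only the standard Java library and is not part of LingPipe; the
ReviewLoader class, the loadReviews method, and the ISO-8859-1 decoding are assumptions
made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of LingPipe.
final class ReviewLoader {

    // Reads every regular file in root/<category> and groups the texts by category.
    static Map<String, List<String>> loadReviews(Path root, String... categories)
            throws IOException {
        Map<String, List<String>> reviews = new LinkedHashMap<>();
        for (String category : categories) {
            Path dir = root.resolve(category);
            List<Path> files;
            // Files.list returns a stream that must be closed after use.
            try (Stream<Path> entries = Files.list(dir)) {
                files = entries.filter(Files::isRegularFile)
                               .sorted()
                               .collect(Collectors.toList());
            }
            List<String> texts = new ArrayList<>();
            for (Path file : files) {
                // Assumption: ISO-8859-1 decoding, which never fails on arbitrary bytes.
                texts.add(new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1));
            }
            reviews.put(category, texts);
        }
        return reviews;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the archive was extracted into the working directory.
        Path root = Paths.get(args.length > 0 ? args[0] : "txt_sentoken");
        Map<String, List<String>> reviews = loadReviews(root, "neg", "pos");
        for (Map.Entry<String, List<String>> entry : reviews.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue().size() + " reviews");
        }
    }
}
```

Each category name doubles as the subdirectory name, so the same strings can later be used
as the classifier's categories.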
We will start by re-initializing the variables declared in the Using LingPipe to classify
text section. The categories array is set to a two-element array to hold the two
categories. The classifier variable is assigned a new DynamicLMClassifier instance
using the new category array and an nGramSize of 8:
categories = new String[2];
categories[0] = "neg";
categories[1] = "pos";
nGramSize = 8;