Classifying Texts and Documents - Natural Language Processing with Java


Summary

In this chapter, we discussed the issues surrounding the classification of text and examined several approaches to performing it. Text classification is useful for many tasks, such as detecting e-mail spam, determining who the author of a document may be, and performing gender and language identification.

We also demonstrated how sentiment analysis is performed. This analysis is concerned with determining whether a piece of text is positive or negative in nature. It is also possible to assess other sentiment attributes.

Most of the approaches we used required us to first create a model from training data. Normally, this model then needs to be validated against a separate set of test data. Once the model has been created, it is usually easy to use.
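The train, validate, and use cycle can be illustrated with a minimal sketch. It assumes a hand-rolled naive Bayes classifier with add-one smoothing, not the method or any library API used in the chapter. The class name TinyTextClassifier, its methods, and the tiny data set are all hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: a tiny naive Bayes text classifier.
public class TinyTextClassifier {
    // Word counts per label, total words per label, and documents per label.
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> totalWords = new HashMap<>();
    private final Map<String, Integer> docCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Lowercase the text and split it on whitespace.
    private static String[] tokenize(String text) {
        return text.toLowerCase().trim().split("\\s+");
    }

    /** Training step: update the counts with one labeled example. */
    public void train(String label, String text) {
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String word : tokenize(text)) {
            counts.merge(word, 1, Integer::sum);
            totalWords.merge(label, 1, Integer::sum);
            vocabulary.add(word);
        }
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
    }

    /** Use step: return the label with the highest smoothed log-probability. */
    public String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docCounts.keySet()) {
            // Start from the log prior of the label.
            double score = Math.log((double) docCounts.get(label) / totalDocs);
            Map<String, Integer> counts = wordCounts.get(label);
            int total = totalWords.getOrDefault(label, 0);
            for (String word : tokenize(text)) {
                int count = counts.getOrDefault(word, 0);
                // Add-one smoothing keeps unseen words from zeroing the score.
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    /** Validation step: fraction of held-out examples labeled correctly. */
    public double accuracy(String[][] labeledTestData) {
        int correct = 0;
        for (String[] example : labeledTestData) {
            if (example[0].equals(classify(example[1]))) {
                correct++;
            }
        }
        return (double) correct / labeledTestData.length;
    }

    public static void main(String[] args) {
        TinyTextClassifier model = new TinyTextClassifier();
        // Made-up training data: {label, text}.
        model.train("positive", "great movie loved it");
        model.train("positive", "wonderful acting great story");
        model.train("negative", "terrible movie hated it");
        model.train("negative", "boring story awful acting");

        // Made-up test data, kept separate from the training examples.
        String[][] testData = {
            {"positive", "loved the wonderful story"},
            {"negative", "awful boring movie"}
        };
        System.out.println("Accuracy: " + model.accuracy(testData));
        System.out.println("Label: " + model.classify("great acting"));
    }
}
```

With only four training sentences the accuracy figure says little; the point is the separation between the data used to build the model and the data used to check it.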

In the next chapter, we will investigate the parsing process and how it contributes to extracting relationships from text.
