Introduction to NLP - Natural Language Processing with Java

Building and training the model

Training a model is the process of executing an algorithm against a set of data, formulating
the model, and then verifying it. We may encounter situations where the text that needs to
be processed is significantly different from what we have seen and used before. For example,
models trained on journalistic text might not work well when processing tweets. When
existing models perform poorly on the new data, we will need to train a new model.

To train a model, we will often use data that has been "marked up" in such a way that we
know the correct answer. For example, if we are dealing with part-of-speech (POS) tagging,
then the data will have POS elements (such as nouns and verbs) marked in it. When the model
is being trained, the training algorithm uses these markings to create the model. This
marked-up dataset is called a corpus.
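
The train-then-verify cycle described above can be sketched with a toy example. The code below is only an illustration under stated assumptions, not the approach of any particular NLP library: the class name `BaselineTagger`, the `word_TAG` markup format, the Penn Treebank style tags, and the fallback tag `NN` for unseen words are all choices made here for demonstration. The "algorithm" is a simple most-frequent-tag baseline: training counts which tag each word carries in the marked-up corpus, and verification measures accuracy on marked-up sentences that were held out of training.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: a most-frequent-tag baseline tagger.
// Assumes each corpus sentence is marked up as space-separated word_TAG tokens.
class BaselineTagger {

    // Training: count how often each tag appears with each word,
    // then keep the most frequent tag per word as the "model".
    static Map<String, String> train(List<String> taggedSentences) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (String sentence : taggedSentences) {
            for (String token : sentence.trim().split("\\s+")) {
                int sep = token.lastIndexOf('_');
                if (sep <= 0 || sep == token.length() - 1) {
                    continue; // skip tokens that are not in word_TAG form
                }
                String word = token.substring(0, sep).toLowerCase();
                String tag = token.substring(sep + 1);
                counts.computeIfAbsent(word, k -> new HashMap<>())
                      .merge(tag, 1, Integer::sum);
            }
        }
        Map<String, String> model = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> entry : counts.entrySet()) {
            String bestTag = null;
            int bestCount = -1;
            for (Map.Entry<String, Integer> tagCount : entry.getValue().entrySet()) {
                if (tagCount.getValue() > bestCount) {
                    bestTag = tagCount.getKey();
                    bestCount = tagCount.getValue();
                }
            }
            model.put(entry.getKey(), bestTag);
        }
        return model;
    }

    // Tagging: look the word up in the model; unseen words fall back to NN (an assumption).
    static String tag(Map<String, String> model, String word) {
        return model.getOrDefault(word.toLowerCase(), "NN");
    }

    // Verification: fraction of held-out tokens whose predicted tag matches the markup.
    static double accuracy(Map<String, String> model, List<String> taggedSentences) {
        int correct = 0;
        int total = 0;
        for (String sentence : taggedSentences) {
            for (String token : sentence.trim().split("\\s+")) {
                int sep = token.lastIndexOf('_');
                if (sep <= 0 || sep == token.length() - 1) {
                    continue;
                }
                String word = token.substring(0, sep);
                String expected = token.substring(sep + 1);
                if (tag(model, word).equals(expected)) {
                    correct++;
                }
                total++;
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    public static void main(String[] args) {
        // A tiny made-up corpus; real corpora contain many thousands of sentences.
        List<String> training = List.of(
                "The_DT dog_NN barks_VBZ",
                "A_DT cat_NN sleeps_VBZ",
                "The_DT cat_NN runs_VBZ");
        List<String> heldOut = List.of("The_DT dog_NN sleeps_VBZ");

        Map<String, String> model = train(training);
        System.out.println("tag(dog) = " + tag(model, "dog"));
        System.out.println("held-out accuracy = " + accuracy(model, heldOut));
    }
}
```

Running the same `accuracy` check on marked-up text from a different domain, such as tweets, is one way to see whether an existing model still works well or whether a new model needs to be trained.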
