Training the OpenNLP POSModel
Training an OpenNLP POSModel is similar to the previous training examples. A training
file is needed and should be large enough to provide a good sample set. Each sentence of
the training file must be on a line by itself. Each line consists of whitespace-separated
entries, where each entry is a token followed by the underscore character and then its tag.
The following training data was created using the first five sentences of Chapter 5, At a
Venture, of Twenty Thousand Leagues Under the Sea. Although this is not a large sample
set, it is easy to create and adequate for illustration purposes.
It is saved in a file named sample.train. The sentences are wrapped here to fit the page;
in the file, each sentence occupies a single line:
The_DT voyage_NN of_IN the_DT Abraham_NNP Lincoln_NNP
was_VBD for_IN a_DT long_JJ time_NN marked_VBN by_IN no_DT
special_JJ incident._NN
But_CC one_CD circumstance_NN happened_VBD which_WDT
showed_VBD the_DT wonderful_JJ dexterity_NN of_IN Ned_NNP
Land,_NNP and_CC proved_VBD what_WP confidence_NN we_PRP
might_MD place_VB in_IN him._PRP$
The_DT 30th_JJ of_IN June,_NNP the_DT frigate_NN spoke_VBD
some_DT American_NNP whalers,_, from_IN whom_WP we_PRP
learned_VBD that_IN they_PRP knew_VBD nothing_NN about_IN
the_DT narwhal._NN
But_CC one_CD of_IN them,_PRP$ the_DT captain_NN of_IN
the_DT Monroe,_NNP knowing_VBG that_IN Ned_NNP Land_NNP
had_VBD shipped_VBN on_IN board_NN the_DT Abraham_NNP
Lincoln,_NNP begged_VBD for_IN his_PRP$ help_NN in_IN
chasing_VBG a_DT whale_NN they_PRP had_VBD in_IN sight._NN
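Before training, it can help to confirm that every entry in the file really follows the token_tag format. The following is a minimal sketch using only the Java standard library, not part of OpenNLP; the class name WordTagLineParser and the method parseLine are hypothetical names chosen for illustration. It splits each entry at the last underscore, on the assumption that a tag never contains an underscore, so that entries such as whalers,_, are handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; it is not an OpenNLP class.
public class WordTagLineParser {

    // Parses one training line into {token, tag} pairs.
    // Assumption: entries are separated by whitespace and the tag
    // follows the last underscore of each entry.
    public static List<String[]> parseLine(String line) {
        List<String[]> pairs = new ArrayList<>();
        for (String entry : line.trim().split("\\s+")) {
            if (entry.isEmpty()) {
                continue; // blank line: nothing to parse
            }
            int sep = entry.lastIndexOf('_');
            // Reject entries with no underscore, an empty token, or an empty tag.
            if (sep <= 0 || sep == entry.length() - 1) {
                throw new IllegalArgumentException(
                        "Expected token_tag but found: " + entry);
            }
            pairs.add(new String[] {
                entry.substring(0, sep), entry.substring(sep + 1)
            });
        }
        return pairs;
    }

    public static void main(String[] args) {
        String sentence = "The_DT voyage_NN of_IN the_DT Abraham_NNP Lincoln_NNP";
        for (String[] pair : parseLine(sentence)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Running parseLine over each line of sample.train before training would flag a malformed entry, such as a token with no tag, by throwing an IllegalArgumentException that names the offending entry.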
We will demonstrate the creation of the model using the POSTaggerME class's train method
and how the model can be saved to a file. We start with the declaration of the POSModel
instance variable:
POSModel model = null;
A try-with-resources block opens the sample file: