Now the train method can be used like this:
SentenceModel model = SentenceDetectorME.train("en",
    sampleStream, true, null, TrainingParameters.defaultParams());
The output of the method is a trained model. The parameters of this method are detailed in
the following table:
Parameter                            Meaning
"en"                                 Specifies that the language of the text is English
sampleStream                         The training text stream
true                                 Specifies whether end tokens should be used
null                                 A dictionary for abbreviations (none is supplied here)
TrainingParameters.defaultParams()   Specifies that the default training parameters should be used
In the following sequence, an OutputStream is created and used to save the model in
the modelFile file. This allows the model to be reused for other applications:
OutputStream modelStream = new BufferedOutputStream(
    new FileOutputStream("modelFile"));
model.serialize(modelStream);
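The preceding snippet leaves the stream open. One common way to make sure it is closed, even if serialization fails, is a try-with-resources block. The following is a minimal sketch of that pattern using only the JDK. The ModelSaveSketch class and the SerializableModel interface are hypothetical stand-ins for the trained model, introduced here so that the example runs without the OpenNLP library:

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ModelSaveSketch {

    /** Hypothetical stand-in for a trained model; only the serialize call matters here. */
    interface SerializableModel {
        void serialize(OutputStream out) throws IOException;
    }

    /** Writes the model to the given file; leaving the try block closes and flushes the stream. */
    static void save(SerializableModel model, Path modelFile) throws IOException {
        try (OutputStream modelStream = new BufferedOutputStream(
                new FileOutputStream(modelFile.toFile()))) {
            model.serialize(modelStream);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes only; a real model would write its own format.
        SerializableModel placeholder =
            out -> out.write("placeholder model bytes".getBytes(StandardCharsets.UTF_8));
        Path modelFile = Files.createTempFile("modelFile", ".bin");
        save(placeholder, modelFile);
        System.out.println("Wrote " + Files.size(modelFile) + " bytes to " + modelFile);
    }
}
```

With the real library, the body of the try block would presumably be the same model.serialize(modelStream) call shown earlier.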
The training process produces the following output. Not all of the iterations are shown here, to save space.
By default, the cutoff for indexing events is 5 and the number of iterations is 100:
Indexing events using cutoff of 5
Computing event counts... done. 93 events
Indexing... done.
Sorting and merging events... done. Reduced 93 events to 63.
Done indexing.
Incorporating indexed data for training...
done.
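The first lines of this log refer to a cutoff and to merging events. As a rough illustration of those two ideas only, and not OpenNLP's actual indexing code, the following sketch assumes that the cutoff is the minimum number of times a feature must occur to be kept, and that identical events are merged into a single entry with a count. The CutoffSketch class, the Event type, and the sample features are all hypothetical:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class CutoffSketch {

    /** Hypothetical training event: an outcome label plus its context features. */
    static final class Event {
        final String outcome;
        final List<String> features;

        Event(String outcome, String... features) {
            this.outcome = outcome;
            this.features = Arrays.asList(features);
        }
    }

    /** Counts how often each feature occurs across all events. */
    static Map<String, Integer> countFeatures(List<Event> events) {
        Map<String, Integer> counts = new HashMap<>();
        for (Event e : events) {
            for (String f : e.features) {
                counts.merge(f, 1, Integer::sum);
            }
        }
        return counts;
    }

    /**
     * Drops features seen fewer than cutoff times, skips events left with no
     * features, and merges identical events into one entry with a count.
     */
    static Map<String, Integer> index(List<Event> events, int cutoff) {
        Map<String, Integer> counts = countFeatures(events);
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (Event e : events) {
            TreeSet<String> kept = new TreeSet<>(); // sorted, so feature order does not matter
            for (String f : e.features) {
                if (counts.get(f) >= cutoff) {
                    kept.add(f);
                }
            }
            if (kept.isEmpty()) {
                continue;
            }
            merged.merge(e.outcome + " " + kept, 1, Integer::sum);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("s", "eos=.", "next=Cap"),
            new Event("s", "next=Cap", "eos=."),
            new Event("n", "eos=.", "prev=Mr"),
            new Event("n", "rare"));
        Map<String, Integer> merged = index(events, 2);
        System.out.println("Reduced " + events.size() + " events to " + merged.size() + ".");
        merged.forEach((event, count) -> System.out.println(count + " x " + event));
    }
}
```

In this toy data, a cutoff of 2 removes the features that occur only once, and the two events that end up with the same outcome and features collapse into one entry. This is similar in spirit to the reduction reported in the log.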