try (InputStream dataIn = new FileInputStream("sample.train")) {
    // The sample stream is created and the model is trained here
} catch (IOException e) {
    // Handle exceptions
}
An instance of the PlainTextByLineStream class is created and used with the
WordTagSampleStream class to create an ObjectStream<POSSample> instance.
This puts the sample data into the format required by the train method:
ObjectStream<String> lineStream =
    new PlainTextByLineStream(dataIn, "UTF-8");
ObjectStream<POSSample> sampleStream =
    new WordTagSampleStream(lineStream);
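To make the sample format more concrete, the following is a minimal sketch in plain Java, without OpenNLP, of how one training line might be split into word and tag pairs. It assumes the format that WordTagSampleStream is documented to read: whitespace-separated tokens of the form word_TAG. The class name WordTagLineDemo, the method parseLine, and the sample sentence are hypothetical and are not taken from sample.train or from the library itself.

```java
import java.util.ArrayList;
import java.util.List;

class WordTagLineDemo {

    // Splits one training line into {word, tag} pairs.
    // Assumption: tokens are separated by whitespace and written as word_TAG.
    public static List<String[]> parseLine(String line) {
        List<String[]> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            // Split on the last underscore so a word containing '_' stays intact.
            int split = token.lastIndexOf('_');
            if (split == -1) {
                throw new IllegalArgumentException("No '_' in token: " + token);
            }
            pairs.add(new String[] {
                token.substring(0, split),   // the word
                token.substring(split + 1)   // its part-of-speech tag
            });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical sample line for illustration only.
        String line = "The_DT dog_NN barked_VBD ._.";
        for (String[] pair : parseLine(line)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

A line that contains a token without an underscore is rejected in this sketch, so it is worth checking the training file for such tokens before handing it to the trainer.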
The train method takes parameters that specify the language, the sample stream, the
training parameters, and any dictionaries needed (none in this case), as shown here:
model = POSTaggerME.train("en", sampleStream,
    TrainingParameters.defaultParams(), null, null);
The output of this process is lengthy. The following output has been shortened to conserve
space:
Indexing events using cutoff of 5
Computing event counts... done. 90 events
Indexing... done.
Sorting and merging events... done. Reduced 90 events to 82.
Done indexing.
Incorporating indexed data for training...
done.
Number of Event Tokens: 82
Number of Outcomes: 17
Number of Predicates: 45
...done.
Computing model parameters ...
Performing 100 iterations.