Finding People and Things - Natural Language Processing with Java - page 163

can be trained. This conversion is performed by the NameSampleDataStream class as shown here. A NameSample object holds the names of the entities found in the text:

ObjectStream<NameSample> sampleStream =
    new NameSampleDataStream(lineStream);
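To make the conversion more concrete, the following is a minimal sketch of what turning an annotated line into tokens and names might look like. It is not the NameSampleDataStream implementation. It assumes whitespace-tokenized lines marked up in the <START:person> ... <END> style described in the OpenNLP documentation, and the AnnotatedLineSketch and Sample names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in only; OpenNLP's NameSampleDataStream is not used here.
class AnnotatedLineSketch {

    /** Tokens of one sentence plus the entity names marked inside it. */
    static final class Sample {
        final List<String> tokens = new ArrayList<>();
        final List<String> names = new ArrayList<>();
    }

    /** Parses one whitespace-tokenized line with <START:type> ... <END> markup. */
    static Sample parse(String line) {
        Sample sample = new Sample();
        StringBuilder current = null; // non-null while inside an annotated span
        for (String tok : line.trim().split("\\s+")) {
            if (tok.isEmpty()) {
                continue; // an empty input line yields no tokens
            }
            if (tok.startsWith("<START") && tok.endsWith(">")) {
                current = new StringBuilder(); // a name span opens
            } else if (tok.equals("<END>")) {
                if (current != null) {
                    sample.names.add(current.toString()); // the span closes
                }
                current = null;
            } else {
                sample.tokens.add(tok); // markup tags are never kept as tokens
                if (current != null) {
                    if (current.length() > 0) {
                        current.append(' ');
                    }
                    current.append(tok);
                }
            }
        }
        return sample;
    }

    public static void main(String[] args) {
        Sample s = parse(
            "<START:person> Joe Smith <END> met <START:person> Ann <END> in Boston .");
        System.out.println("Tokens: " + s.tokens);
        System.out.println("Names:  " + s.names);
    }
}
```

The real class produces NameSample objects that record each name as a span over the token array rather than as a string. The sketch keeps plain strings only to stay short.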

The train method can now be executed as follows:

TokenNameFinderModel model = NameFinderME.train(
    "en", "person", sampleStream,
    Collections.<String, Object>emptyMap(), 100, 5);

The arguments of the method are as detailed in the following table:

Parameter                                 Meaning
"en"                                      Language code
"person"                                  Entity type
sampleStream                              Sample data
Collections.<String, Object>emptyMap()    Resources (an empty map, so none are supplied)
100                                       The number of iterations
5                                         The cutoff
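The cutoff argument is easier to picture with a small example. The following sketch is not OpenNLP code. It assumes that the cutoff is the minimum number of times a feature must be seen in the training data in order to be kept, and the CutoffSketch class and the feature strings are made up for illustration:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: shows the idea of a frequency cutoff, not OpenNLP's indexer.
class CutoffSketch {

    /** Counts each feature and keeps those observed at least cutoff times. */
    static Map<String, Integer> applyCutoff(List<String> features, int cutoff) {
        Map<String, Integer> counts = new HashMap<>();
        for (String feature : features) {
            counts.merge(feature, 1, Integer::sum);
        }
        Map<String, Integer> kept = new TreeMap<>(); // sorted for stable printing
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= cutoff) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical feature strings: one frequent, one rare.
        List<String> observed = Arrays.asList(
            "w=Joe", "w=Joe", "w=Joe", "w=Joe", "w=Joe", "w=rare", "w=rare");
        System.out.println(applyCutoff(observed, 5));
    }
}
```

Under this assumption, a higher cutoff discards more rare features. This tends to shrink the model, at the risk of dropping evidence that a small training set cannot afford to lose.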

The model is then serialized to an output file:

model.serialize(modelOutputStream);
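The text does not show how modelOutputStream is created or closed. One common pattern is to open the stream in a try-with-resources block so that it is flushed and closed even if serialization fails. The sketch below illustrates this with a hypothetical SerializableModel interface that only mirrors the serialize(OutputStream) call shown above. It does not use the OpenNLP classes, and the file name is an assumption:

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class SerializeSketch {

    /** Hypothetical stand-in for any model exposing serialize(OutputStream). */
    interface SerializableModel {
        void serialize(OutputStream out) throws IOException;
    }

    /** Writes the model to target, closing the stream in all cases. */
    static void save(SerializableModel model, Path target) throws IOException {
        try (OutputStream out =
                 new BufferedOutputStream(Files.newOutputStream(target))) {
            model.serialize(out);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the real model file.
        Path target = Files.createTempFile("en-ner-person", ".bin");
        // The lambda plays the role of a trained model writing its bytes.
        save(out -> out.write(new byte[] {1, 2, 3}), target);
        System.out.println(Files.size(target) + " bytes written to " + target);
    }
}
```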

The output of this sequence is as follows. It has been shortened to conserve space. It begins with basic information about the model creation:

Indexing events using cutoff of 5
Computing event counts... done. 53 events
Indexing... done.
