Finding People and Things - Natural Language Processing with Java

TokenNameFinderModel entityModel =
    new TokenNameFinderModel(modelStream);
NameFinderME nameFinder = new NameFinderME(entityModel);

We can now use the tokenize method to split the text into tokens and the find method to identify the person in the text. The find method takes the tokenized String array as input and returns an array of Span objects, as shown:

String tokens[] = tokenizer.tokenize(sentence);
Span nameSpans[] = nameFinder.find(tokens);

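We discussed the Span class in Chapter 3, Finding Sentences. As you may remember, this class holds positional information about the entities found. Here, the start and end values of each Span are indexes into the tokens array, and the end index is exclusive. The actual string entities are still in the tokens array.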

The following for statement displays the person found in the sentence. The positional information and the person are displayed on separate lines. Note that only the token at the start index of each span is printed:

for (int i = 0; i < nameSpans.length; i++) {
    System.out.println("Span: " + nameSpans[i].toString());
    System.out.println("Entity: "
        + tokens[nameSpans[i].getStart()]);
}

The output is as follows:

Span: [7..9) person
Entity: Fred
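
The loop above prints only the token at the start index of each span, so an entity that covers several tokens would be shown incompletely. The following is a minimal sketch of how the full text could be recovered from a half-open token range. It does not use OpenNLP itself: the class name SpanTextSketch, the coveredText method, and the sample tokens are all hypothetical and stand in for the values returned by getStart and getEnd on a Span.

```java
import java.util.Arrays;

public class SpanTextSketch {

    // Joins tokens[start] through tokens[end - 1] with single spaces.
    // start and end play the role of Span.getStart() and Span.getEnd(),
    // where the end index is exclusive. (Hypothetical helper, not OpenNLP API.)
    static String coveredText(String[] tokens, int start, int end) {
        return String.join(" ", Arrays.copyOfRange(tokens, start, end));
    }

    public static void main(String[] args) {
        // Made-up tokens for illustration only, not the book's sample sentence.
        String[] tokens = {"Yesterday", "Mary", "Ann", "Jones", "arrived", "."};

        // A span of [1..4) covers the tokens at indexes 1, 2, and 3.
        System.out.println("Entity: " + coveredText(tokens, 1, 4));
    }
}
```

With a real Span, the call would likely look like coveredText(tokens, nameSpans[i].getStart(), nameSpans[i].getEnd()). OpenNLP's Span class also appears to provide a static spansToStrings method for the same purpose; check the Javadoc for the version in use before relying on it.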

We will often work with multiple sentences. To demonstrate this, we will use the previously defined sentences string array. The previous for statement is replaced with the following sequence. The tokenize method is invoked against each sentence, and then the entity information is displayed as before:

for (String sentence : sentences) {
    String tokens[] = tokenizer.tokenize(sentence);
    Span nameSpans[] = nameFinder.find(tokens);
    for (int i = 0; i < nameSpans.length; i++) {
        System.out.println("Span: " +
            nameSpans[i].toString());
