Finding People and Things - Natural Language Processing with Java

Using OpenNLP for NER

We will demonstrate the use of the TokenNameFinderModel class to perform NER using
the OpenNLP API. Additionally, we will demonstrate how to determine the probability
that an identified entity is correct.

The general approach is to convert the text into a series of tokenized sentences, create an
instance of the TokenNameFinderModel class using an appropriate model, pass that model
to a NameFinderME instance, and then use its find method to identify the entities in the text.
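
The data flow described above can be sketched without the library. The following is a minimal illustration, not OpenNLP itself: ToySpan, tokenize, and findNames are hypothetical stand-ins, and a fixed set of known names takes the place of the trained model. It only shows the shape of the result, where each entity is reported as a half-open range of token indexes that is then mapped back to text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Toy stand-in for the tokenize -> find -> map-back flow. This is NOT OpenNLP:
// ToySpan, tokenize, and findNames are hypothetical, and the "model" is a name set.
class NerFlowSketch {

    // Half-open token range [start, end) plus an entity type label.
    static final class ToySpan {
        final int start;
        final int end;
        final String type;

        ToySpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Naive tokenizer: pad sentence punctuation with spaces, then split on whitespace.
    static String[] tokenize(String sentence) {
        return sentence.replaceAll("([.,!?])", " $1 ").trim().split("\\s+");
    }

    // Marks every token found in knownNames as a one-token "person" span.
    static List<ToySpan> findNames(String[] tokens, Set<String> knownNames) {
        List<ToySpan> spans = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (knownNames.contains(tokens[i])) {
                spans.add(new ToySpan(i, i + 1, "person"));
            }
        }
        return spans;
    }

    // Joins the tokens covered by a span back into the entity text.
    static String textOf(ToySpan span, String[] tokens) {
        return String.join(" ", Arrays.copyOfRange(tokens, span.start, span.end));
    }

    public static void main(String[] args) {
        String[] tokens = tokenize("He was the last person to see Fred.");
        for (ToySpan span : findNames(tokens, Collections.singleton("Fred"))) {
            System.out.println(span.type + ": " + textOf(span, tokens));
        }
    }
}
```

For the sample sentence, main prints person: Fred. A real name finder replaces the set lookup with a statistical model, but the token-index bookkeeping is the same.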

The following example demonstrates the use of the TokenNameFinderModel class. We
will start with a single sentence and then move on to multiple sentences. The sentence
is defined here:

String sentence = "He was the last person to see Fred.";

We will use the models found in the en-token.bin and en-ner-person.bin files
for the tokenizer and name finder models, respectively. An InputStream for each of
these files is opened using a try-with-resources block, as shown here:

try (InputStream tokenStream = new FileInputStream(
         new File(getModelDir(), "en-token.bin"));
     InputStream modelStream = new FileInputStream(
         new File(getModelDir(), "en-ner-person.bin"))) {
    ...
} catch (Exception ex) {
    // Handle exceptions
}
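
As a sketch of the stream handling on its own, the following uses only the standard library. It assumes a hypothetical getModelDir() that reads a model.dir system property (the original helper is not shown in the text), and it catches IOException rather than the broader Exception. Resources declared in the try header are closed automatically, in reverse order of declaration, whether or not the body throws.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class ModelStreamsSketch {

    // Hypothetical helper: the text calls getModelDir() but does not show it.
    // This version reads the "model.dir" system property, defaulting to "models".
    static File getModelDir() {
        return new File(System.getProperty("model.dir", "models"));
    }

    // Opens both files in a single try-with-resources header and returns their
    // combined size in bytes. Both streams are closed when the block exits.
    static long totalBytes(File dir, String first, String second) throws IOException {
        try (InputStream a = new FileInputStream(new File(dir, first));
             InputStream b = new FileInputStream(new File(dir, second))) {
            return count(a) + count(b);
        }
    }

    // Reads a stream to the end and returns the number of bytes seen.
    static long count(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
        }
        return total;
    }

    public static void main(String[] args) {
        try {
            long bytes = totalBytes(getModelDir(), "en-token.bin", "en-ner-person.bin");
            System.out.println("Read " + bytes + " bytes of model data");
        } catch (IOException ex) {
            // A missing model file surfaces here as a FileNotFoundException.
            System.err.println("Could not read model files: " + ex.getMessage());
        }
    }
}
```

Catching IOException keeps unrelated programming errors from being silently swallowed. In the NER example, the body of the try block would build the models from the two streams instead of counting bytes.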

Within the try block, the TokenizerModel and Tokenizer objects are created:

TokenizerModel tokenModel = new TokenizerModel(tokenStream);
Tokenizer tokenizer = new TokenizerME(tokenModel);

Next, an instance of the NameFinderME class is created using the person model:
