Using OpenNLP
OpenNLP uses models to perform SBD. An instance of the SentenceDetectorME
class is created from a model file. Sentences are returned by the sentDetect
method, and position information is returned by the sentPosDetect method.
Using the SentenceDetectorME class
A model is loaded from a file using the SentenceModel class. An instance of the
SentenceDetectorME class is then created using the model, and the sentDetect
method is invoked to perform SBD. The method returns an array of strings, with
each element holding a sentence.
This process is demonstrated in the following example. A try-with-resources block
opens the en-sent.bin file, which contains a model. The paragraph string is then
processed, and a for-each statement displays the sentences. Any I/O exceptions
raised while opening or reading the file are handled in the catch blocks:
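import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;

// getModelDir() and paragraph are assumed to be defined elsewhere.
try (InputStream is = new FileInputStream(
        new File(getModelDir(), "en-sent.bin"))) {
    // Load the sentence model and build a detector from it
    SentenceModel model = new SentenceModel(is);
    SentenceDetectorME detector = new SentenceDetectorME(model);
    // Each array element holds one detected sentence
    String[] sentences = detector.sentDetect(paragraph);
    for (String sentence : sentences) {
        System.out.println(sentence);
    }
} catch (FileNotFoundException ex) {
    // Handle exception
} catch (IOException ex) {
    // Handle exception
}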
On execution, we get the following output:
When determining the end of sentences we need to consider
several factors.
Sentences may end with exclamation marks!
Or possibly questions marks?
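The position information mentioned earlier can be pictured as a pair of character offsets for each sentence. In OpenNLP, sentPosDetect returns Span objects that carry a start and an end offset. The following is a minimal sketch of that idea using only the JDK, so it runs without the OpenNLP library or a model file. The SentenceOffsets class and its locate helper are hypothetical and are not part of OpenNLP, and the paragraph string is an assumption reconstructed from the output shown above:

```java
import java.util.ArrayList;
import java.util.List;

public class SentenceOffsets {

    // Hypothetical helper (not part of OpenNLP): finds each sentence in the
    // text, searching left to right, and returns {start, end} offset pairs.
    // The end offset is exclusive.
    static List<int[]> locate(String text, String[] sentences) {
        List<int[]> offsets = new ArrayList<>();
        int from = 0;
        for (String sentence : sentences) {
            int start = text.indexOf(sentence, from);
            if (start < 0) {
                throw new IllegalArgumentException(
                        "Sentence not found: " + sentence);
            }
            int end = start + sentence.length();
            offsets.add(new int[] {start, end});
            from = end; // continue searching after this sentence
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Assumed input, reconstructed from the output shown above.
        String paragraph =
                "When determining the end of sentences we need to consider "
                + "several factors. Sentences may end with exclamation marks! "
                + "Or possibly questions marks?";
        // Stand-in for the array that sentDetect would return.
        String[] sentences = {
            "When determining the end of sentences we need to consider "
                    + "several factors.",
            "Sentences may end with exclamation marks!",
            "Or possibly questions marks?"
        };
        for (int[] span : locate(paragraph, sentences)) {
            System.out.println("[" + span[0] + ".." + span[1] + ") "
                    + paragraph.substring(span[0], span[1]));
        }
    }
}
```

With OpenNLP itself, the offsets would come directly from the detector rather than from a search, and the getStart and getEnd methods of each Span would supply the two values.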