Finding Parts of Text - Natural Language Processing with Java

String paragraph = "Similar to stemming is Lemmatization. "
    + "This is the process of finding its lemma, its form "
    + "as found in a dictionary.";
Annotation document = new Annotation(paragraph);
pipeline.annotate(document);

We now need to iterate over the sentences and over the tokens of each sentence. The get methods of the Annotation and CoreMap classes return a value of the type associated with the key class passed in. If no value is stored for that key, get returns null. We will use these classes to obtain a list of lemmas.
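To see why a class-keyed get method can return null, consider the following minimal sketch. It is not CoreNLP's implementation: TypedLookupSketch and TypedMap are hypothetical names, and for simplicity the value's own class serves as the key, whereas CoreNLP uses dedicated key classes such as LemmaAnnotation.

```java
import java.util.HashMap;
import java.util.Map;

public class TypedLookupSketch {

    /** Hypothetical stand-in for an annotation container (assumption, not CoreNLP code). */
    public static final class TypedMap {
        private final Map<Class<?>, Object> values = new HashMap<>();

        /** Stores a value under the class that describes its type. */
        public <T> void set(Class<T> key, T value) {
            values.put(key, value);
        }

        /** Returns the value stored under the key, or null if none was stored. */
        public <T> T get(Class<T> key) {
            // Class.cast passes null through unchanged, so a missing key yields null.
            return key.cast(values.get(key));
        }
    }

    public static void main(String[] args) {
        TypedMap token = new TypedMap();
        token.set(String.class, "finding");

        String text = token.get(String.class);       // a value was stored
        Integer missing = token.get(Integer.class);  // nothing stored, so null

        System.out.println(text);
        System.out.println(missing == null ? "no value" : missing.toString());
    }
}
```

Because the key class determines the return type, no cast is needed at the call site. The caller still has to expect null whenever the requested annotation was never produced, for example when the pipeline was built without the annotator that supplies it.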

First, the list of sentences is retrieved; then each token of each sentence is processed to obtain its lemma. The sentence list and the lemma list are declared here:

List<CoreMap> sentences = document.get(SentencesAnnotation.class);
List<String> lemmas = new LinkedList<>();

Two nested for-each statements iterate over the sentences and their tokens to populate the lemmas list. Once this is completed, the list is displayed:

// Collect the lemma of every token in every sentence
for (CoreMap sentence : sentences) {
    for (CoreLabel word : sentence.get(TokensAnnotation.class)) {
        lemmas.add(word.get(LemmaAnnotation.class));
    }
}

// Display the lemmas, separated by spaces, inside brackets
System.out.print("[");
for (String element : lemmas) {
    System.out.print(element + " ");
}
System.out.println("]");

The output of this sequence is as follows:

[similar to stem be lemmatization . this be the process of
find its lemma , its form as find in a dictionary . ]
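The space before the closing bracket appears because the display loop appends a space after every element, including the last. If that is unwanted, String.join is one alternative. The following is a minimal sketch: LemmaDisplaySketch and format are hypothetical names, and a hard-coded list stands in for the lemmas produced by the pipeline.

```java
import java.util.Arrays;
import java.util.List;

public class LemmaDisplaySketch {

    /** Joins the lemmas with single spaces and wraps them in brackets. */
    public static String format(List<String> lemmas) {
        return "[" + String.join(" ", lemmas) + "]";
    }

    public static void main(String[] args) {
        // Stand-in data (assumption); in the text, this list is filled from the tokens.
        List<String> lemmas =
            Arrays.asList("similar", "to", "stem", "be", "lemmatization", ".");
        System.out.println(format(lemmas));
    }
}
```

Note that String.join renders a null element as the text "null", so a list that may contain missing lemmas should be filtered first.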
