Detecting Part of Speech - Natural Language Processing with Java

These annotators tokenize the text, split it into sentences, and then find the POS tags:

// List the annotators in the order they should run
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
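
To make the role of the annotators property concrete, the following is a minimal sketch of how a comma-separated value such as "tokenize, ssplit, pos" can be read back as an ordered list of stage names. It uses only the standard library and is not CoreNLP's own code: the class AnnotatorListSketch and the method parseAnnotators are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical illustration only; not part of Stanford CoreNLP.
public class AnnotatorListSketch {

    // Splits the "annotators" property on commas and trims each name,
    // preserving the order in which the stages were listed.
    public static List<String> parseAnnotators(Properties props) {
        List<String> names = new ArrayList<>();
        String value = props.getProperty("annotators", "");
        for (String name : value.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos");
        // Prints the stage names in the order they were listed
        System.out.println(parseAnnotators(props));
    }
}
```

The order matters because each stage builds on the previous one: the text is tokenized first, the tokens are grouped into sentences, and only then are POS tags assigned.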

To process the text, we pass the theSentence variable to the Annotation constructor. The pipeline's annotate method is then invoked as shown here:

Annotation document = new Annotation(theSentence);
pipeline.annotate(document);

Since the pipeline can perform different types of processing, a list of CoreMap objects is used to access the words and tags. The Annotation class's get method returns the list of sentences, as shown here:

List<CoreMap> sentences =
    document.get(SentencesAnnotation.class);
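
The get call above takes a class object as its key, and the key's type determines the type of the value returned. The following minimal sketch shows how such class-keyed lookup can be built with the standard library. It is an illustration of the general pattern, not CoreNLP's CoreMap: the names TypedMapSketch, Key, TextKey, and TokensKey, as well as the sample sentence, are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of class-keyed lookup; not CoreNLP's CoreMap.
public class TypedMapSketch {

    // A key is a class whose type parameter names the type of its value.
    public interface Key<V> { }

    // Hypothetical keys: one for a sentence's text, one for its tokens.
    public static final class TextKey implements Key<String> { }
    public static final class TokensKey implements Key<List<String>> { }

    private final Map<Class<?>, Object> values = new HashMap<>();

    // Stores a value under the given key class.
    public <V> void set(Class<? extends Key<V>> key, V value) {
        values.put(key, value);
    }

    // Returns the value stored under the key class, typed by the key,
    // or null if nothing was stored.
    @SuppressWarnings("unchecked")
    public <V> V get(Class<? extends Key<V>> key) {
        return (V) values.get(key);
    }

    public static void main(String[] args) {
        TypedMapSketch sentence = new TypedMapSketch();
        sentence.set(TextKey.class, "The voyage began.");
        sentence.set(TokensKey.class, List.of("The", "voyage", "began", "."));

        // No casts are needed at the call site: the key fixes the type.
        String text = sentence.get(TextKey.class);
        List<String> tokens = sentence.get(TokensKey.class);
        System.out.println(text + " -> " + tokens.size() + " tokens");
    }
}
```

With this design, one container can hold values of many different types while callers still receive each value with its proper static type.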

The contents of a CoreMap object can be accessed using its get method. The method's argument is the class for the information needed. As shown in the following code example, the tokens of a sentence are retrieved using the TokensAnnotation class, each token's text using the TextAnnotation class, and its POS tag using the PartOfSpeechAnnotation class. Each word of each sentence is displayed with its tag:

for (CoreMap sentence : sentences) {
    for (CoreLabel token : sentence.get(TokensAnnotation.class)) {
        String word = token.get(TextAnnotation.class);
        String pos = token.get(PartOfSpeechAnnotation.class);
        System.out.print(word + "/" + pos + " ");
    }
    System.out.println();
}

The output will be as follows:
