    List<HasWord> sentence = it.next();
    for (HasWord token : sentence) {
        System.out.println(token);
    }
}
When executed, we get the following output:
Let
's
pause
,
and
then
reflect
.
Using a pipeline
Here, we will use the StanfordCoreNLP class as demonstrated in Chapter 1, Introduction
to NLP. However, we use a simpler annotator string to tokenize the paragraph. As
shown next, a Properties object is created and its annotators property is set to
tokenize and ssplit.
The tokenize annotator specifies that tokenization will occur, and the ssplit
annotator splits the resulting tokens into sentences:
Properties properties = new Properties();
properties.put("annotators", "tokenize, ssplit");
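The annotators value is a single comma-separated string. As a minimal sketch using only java.util, the following shows how such a string can be read back from a Properties object and split into individual annotator names in the order given. The class AnnotatorListSketch and the helper annotatorNames are hypothetical names introduced for illustration. They are not part of Stanford CoreNLP, and this is not how the library itself parses the property.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class AnnotatorListSketch {

    // Hypothetical helper (not part of Stanford CoreNLP): splits the
    // comma-separated "annotators" value into trimmed names, keeping order.
    public static List<String> annotatorNames(Properties properties) {
        List<String> names = new ArrayList<>();
        String value = properties.getProperty("annotators", "");
        for (String name : value.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        // setProperty is the String-typed counterpart of put
        properties.setProperty("annotators", "tokenize, ssplit");
        System.out.println(annotatorNames(properties));
    }
}
```

Whether the value is stored with put or with setProperty, getProperty returns the same string as long as both key and value are String objects.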
Instances of the StanfordCoreNLP and Annotation classes are created next:
StanfordCoreNLP pipeline = new StanfordCoreNLP(properties);
Annotation annotation = new Annotation(paragraph);
The annotate method is then executed to tokenize the text, and the prettyPrint
method displays the tokens:
pipeline.annotate(annotation);
pipeline.prettyPrint(annotation, System.out);