[is]
[done]
...
[who]
[knows]
[.]
Using the StanfordCoreNLP class
The StanfordCoreNLP class supports sentence detection through its ssplit annotator.
The following example uses the tokenize and ssplit annotators. A pipeline object is
created, the paragraph is wrapped in an Annotation object, and the pipeline's annotate
method is applied to that annotation:
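import java.util.Properties;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
Properties properties = new Properties();
properties.put("annotators", "tokenize, ssplit");
StanfordCoreNLP pipeline = new StanfordCoreNLP(properties);
Annotation annotation = new Annotation(paragraph);
pipeline.annotate(annotation);
// Print the annotated sentences and their tokens to standard output
pipeline.prettyPrint(annotation, System.out);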
The output contains a lot of information. Only the output for the first sentence is shown here:
Sentence #1 (13 tokens):
When determining the end of sentences we need to consider several factors.
[Text=When CharacterOffsetBegin=0 CharacterOffsetEnd=4]
[Text=determining CharacterOffsetBegin=5 CharacterOffsetEnd=16]
[Text=the CharacterOffsetBegin=17 CharacterOffsetEnd=20]
[Text=end CharacterOffsetBegin=21 CharacterOffsetEnd=24]
[Text=of CharacterOffsetBegin=25 CharacterOffsetEnd=27]
[Text=sentences CharacterOffsetBegin=28 CharacterOffsetEnd=37]
[Text=we CharacterOffsetBegin=38 CharacterOffsetEnd=40]
[Text=need CharacterOffsetBegin=41 CharacterOffsetEnd=45]
[Text=to CharacterOffsetBegin=46 CharacterOffsetEnd=48]
[Text=consider CharacterOffsetBegin=49 CharacterOffsetEnd=57]
[Text=several CharacterOffsetBegin=58 CharacterOffsetEnd=65]
[Text=factors CharacterOffsetBegin=66
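The CharacterOffsetBegin and CharacterOffsetEnd values in the output above can be read as zero-based positions in the original text, with the end offset pointing one past the token's last character. That lines up with the arguments of Java's String.substring. The following is a minimal sketch using only the standard library, not part of the CoreNLP API: the class name OffsetDemo and the method tokenAt are hypothetical, and the text is assumed to be just the first sentence of the paragraph.

```java
public class OffsetDemo {

    // Returns the text covered by a begin (inclusive) / end (exclusive) offset pair.
    public static String tokenAt(String text, int begin, int end) {
        return text.substring(begin, end);
    }

    public static void main(String[] args) {
        // Assumption: only the first sentence of the paragraph is used here.
        String sentence = "When determining the end of sentences "
                + "we need to consider several factors.";

        // Offset pairs copied from the printed output for four of the tokens.
        int[][] offsets = { {0, 4}, {5, 16}, {17, 20}, {58, 65} };

        for (int[] pair : offsets) {
            System.out.println(pair[0] + "-" + pair[1] + ": "
                    + tokenAt(sentence, pair[0], pair[1]));
        }
    }
}
```

Running the sketch prints When, determining, the, and several next to their offset pairs, which matches the Text values reported for those tokens.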