Java Reference
In-Depth Information
The/DT voyage/NN of/IN the/DT Abraham/NNP Lincoln/NNP was/VBD
for/IN a/DT long/JJ time/NN marked/VBN by/IN no/DT
special/JJ incident/NN ./.
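Output in this word/TAG form is easy to split back into pairs when the tags are needed programmatically. The following is a minimal sketch using only the JDK; the class name TaggedTokenParser and its nested TaggedToken type are hypothetical names chosen for illustration and are not part of the Stanford API:

```java
import java.util.ArrayList;
import java.util.List;

final class TaggedTokenParser {

    /** A word paired with its part-of-speech tag (hypothetical helper type). */
    static final class TaggedToken {
        final String word;
        final String tag;

        TaggedToken(String word, String tag) {
            this.word = word;
            this.tag = tag;
        }
    }

    /** Splits whitespace-separated word/TAG items into word and tag pairs. */
    static List<TaggedToken> parse(String tagged) {
        List<TaggedToken> tokens = new ArrayList<>();
        for (String item : tagged.trim().split("\\s+")) {
            // Use the last slash so that a word containing '/' keeps its slash.
            int slash = item.lastIndexOf('/');
            if (slash <= 0) {
                continue; // skip items that have no word before a slash
            }
            tokens.add(new TaggedToken(item.substring(0, slash),
                                       item.substring(slash + 1)));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String tagged = "The/DT voyage/NN of/IN the/DT Abraham/NNP Lincoln/NNP was/VBD";
        for (TaggedToken t : parse(tagged)) {
            System.out.println(t.word + " -> " + t.tag);
        }
    }
}
```

Splitting on the last slash rather than the first is a design choice here: it keeps the tag intact even when the word itself contains a slash.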
The pipeline accepts additional properties that control how the tagger works. For example, the
english-left3words-distsim.tagger model is used by default. There is a pos.maxlen
property that sets the maximum sentence length the tagger will process, and we can
specify a different model using the pos.model property, as shown here:
props.put("pos.model",
    "C:/.../Models/english-caseless-left3words-distsim.tagger");
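As a rough sketch of how both tagger properties might be set together, the following builds a Properties object with only the JDK. The class name PosTaggerConfig, the build helper, and the value 100 are hypothetical, and the annotators value "tokenize, ssplit, pos" is an assumption about how the pipeline was configured earlier. The model path is passed in as an argument because the path above is elided:

```java
import java.util.Properties;

final class PosTaggerConfig {

    /**
     * Builds pipeline properties for POS tagging (hypothetical helper).
     * modelPath is whatever tagger model file is available locally.
     */
    static Properties build(String modelPath, int maxSentenceLength) {
        Properties props = new Properties();
        // Assumed annotator list: tokenize, split sentences, then tag.
        props.setProperty("annotators", "tokenize, ssplit, pos");
        props.setProperty("pos.model", modelPath);
        // Property values are strings, so the limit is converted explicitly.
        props.setProperty("pos.maxlen", Integer.toString(maxSentenceLength));
        return props;
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: PosTaggerConfig <path-to-tagger-model>");
            return;
        }
        // 100 is an arbitrary illustrative limit, not a recommended value.
        Properties props = build(args[0], 100);
        props.list(System.out);
    }
}
```

The resulting object would then be handed to the StanfordCoreNLP constructor in place of the props used above.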
Sometimes it is useful to have a tagged document in XML format. The StanfordCoreNLP
class's xmlPrint method writes out such a document. The method's first argument is the
Annotation object to be displayed, and its second argument is the OutputStream object
to write to. In the following code sequence, the previous tagging results are written to
standard output. The call is enclosed in a try-catch block to handle IO exceptions:
try {
    pipeline.xmlPrint(document, System.out);
} catch (IOException ex) {
    // Handle exceptions
}
A partial listing of the results is as follows. Only the first two words and the last word are
displayed. Each token element contains the word, its position, and its POS tag:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="CoreNLP-to-HTML.xsl" type="text/xsl"?>
<root>
  <document>
    <sentences>
      <sentence id="1">
        <tokens>
          <token id="1">
            <word>The</word>
            <CharacterOffsetBegin>0</CharacterOffsetBegin>
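XML in this shape can be read back with the JDK's DOM parser. The sketch below is illustrative only: the class name TokenXmlReader is hypothetical, the sample string is hand-written to mirror the listing rather than copied from real output, and the POS element name is an assumption about how the tag is stored inside each token:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class TokenXmlReader {

    /** Returns one "word/TAG" string per token element found in the XML. */
    static List<String> readTokens(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList tokens = doc.getElementsByTagName("token");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < tokens.getLength(); i++) {
                Element token = (Element) tokens.item(i);
                // "POS" is an assumed element name; adjust it to the real output.
                result.add(childText(token, "word") + "/" + childText(token, "POS"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Text of the first descendant with the given name, or "" if absent. */
    private static String childText(Element parent, String name) {
        NodeList nodes = parent.getElementsByTagName(name);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) {
        // Hand-written sample modeled on the listing, not actual tool output.
        String sample =
            "<root><document><sentences><sentence id=\"1\"><tokens>"
            + "<token id=\"1\"><word>The</word>"
            + "<CharacterOffsetBegin>0</CharacterOffsetBegin><POS>DT</POS></token>"
            + "<token id=\"2\"><word>voyage</word>"
            + "<CharacterOffsetBegin>4</CharacterOffsetBegin><POS>NN</POS></token>"
            + "</tokens></sentence></sentences></document></root>";
        for (String token : readTokens(sample)) {
            System.out.println(token);
        }
    }
}
```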