First Part
NP    Noun chunk
VP    Verb chunk
Multiple words are grouped together into a single chunk, such as "The voyage" and "the Abraham Lincoln".
[The] B-NP
[voyage] I-NP
[of] B-PP
[the] B-NP
[Abraham] I-NP
[Lincoln] I-NP
[was] B-VP
[for] B-PP
[a] B-NP
[long] I-NP
[time] I-NP
[marked] B-VP
[by] B-PP
[no] B-NP
[special] I-NP
[incident.] I-NP
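The grouping implied by these tags can be reproduced by hand. The following is a minimal sketch, not OpenNLP code: the BioGrouping class and its groupChunks method are hypothetical names introduced for illustration. It assumes a B- tag opens a chunk, an I- tag extends the chunk that is currently open, and any other tag closes it. For simplicity, it does not check that the type of an I- tag matches the type of the open chunk.

```java
import java.util.ArrayList;
import java.util.List;

class BioGrouping {

    // Hypothetical helper: turns parallel token and tag arrays into
    // strings of the form "TYPE: token token ...".
    static List<String> groupChunks(String[] tokens, String[] tags) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = null;
        for (int i = 0; i < tokens.length; i++) {
            String tag = tags[i];
            // An I- tag extends the open chunk, if there is one.
            if (tag.startsWith("I-") && current != null) {
                current.append(' ').append(tokens[i]);
                continue;
            }
            // Any other tag closes the open chunk.
            if (current != null) {
                chunks.add(current.toString());
                current = null;
            }
            // A B- tag opens a new chunk labeled with the tag's second part.
            if (tag.startsWith("B-")) {
                current = new StringBuilder(tag.substring(2))
                        .append(": ").append(tokens[i]);
            }
        }
        if (current != null) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        String[] tokens = {"The", "voyage", "of", "the", "Abraham", "Lincoln", "was"};
        String[] tags = {"B-NP", "I-NP", "B-PP", "B-NP", "I-NP", "I-NP", "B-VP"};
        groupChunks(tokens, tags).forEach(System.out::println);
    }
}
```

For the first seven tokens of the sentence, this prints one line per chunk: "NP: The voyage", "PP: of", "NP: the Abraham Lincoln", and "VP: was".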
If we are interested in getting more detailed information about the chunks, we can use the ChunkerME class's chunkAsSpans method. This method returns an array of Span objects. Each object represents one span found in the text.
The Span class provides several methods. Here, we will illustrate the use of the getType, getStart, and getEnd methods. The getType method returns the second part of the chunk tag, and the getStart and getEnd methods return the beginning and ending index of the tokens, respectively, in the original sentence array. The ending index is exclusive: it is one past the last token of the span. The length method returns the length of the span in number of tokens.
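To make the index convention concrete, here is a small sketch that does not depend on OpenNLP. The TokenSpan class and the coveredText method are hypothetical stand-ins written for this illustration, not part of the library. They assume a span covers the half-open token range from start up to, but not including, end, so the length is simply end minus start.

```java
import java.util.Arrays;

class SpanIndexDemo {

    // Hypothetical stand-in for a chunk span: a type plus the
    // half-open token range [start, end).
    static final class TokenSpan {
        final String type;
        final int start;
        final int end;

        TokenSpan(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }

        // Number of tokens covered by the span.
        int length() {
            return end - start;
        }
    }

    // Returns the tokens covered by the span, joined with single spaces.
    static String coveredText(String[] sentence, TokenSpan span) {
        return String.join(" ", Arrays.copyOfRange(sentence, span.start, span.end));
    }

    public static void main(String[] args) {
        String[] sentence = {"The", "voyage", "of", "the", "Abraham", "Lincoln"};
        TokenSpan span = new TokenSpan("NP", 3, 6);
        System.out.println(span.type + " [" + span.start + ".." + span.end + ") length "
                + span.length() + ": " + coveredText(sentence, span));
    }
}
```

With a span of type NP starting at index 3 and ending at index 6, the covered tokens are "the Abraham Lincoln" and the length is 3.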
In the following sequence, the chunkAsSpans method is executed using the sentence and tags arrays. The spans
array is then displayed. The outer for loop pro-