Detecting Part of Speech - Natural Language Processing with Java

The outer for loop processes one Span object at a time, displaying the basic span information. The inner for loop displays the spanned text enclosed within brackets:

Span[] spans = chunkerME.chunkAsSpans(sentence, tags);
for (Span span : spans) {
    System.out.print("Type: " + span.getType() + " - "
        + " Begin: " + span.getStart()
        + " End:" + span.getEnd()
        + " Length: " + span.length() + " [");
    for (int j = span.getStart(); j < span.getEnd(); j++) {
        System.out.print(sentence[j] + " ");
    }
    System.out.println("]");
}

The following output clearly shows the span type, its position in the sentence array, its

length, and then the actual spanned text:

Type: NP - Begin: 0 End:2 Length: 2 [The voyage ]
Type: PP - Begin: 2 End:3 Length: 1 [of ]
Type: NP - Begin: 3 End:6 Length: 3 [the Abraham Lincoln ]
Type: VP - Begin: 6 End:7 Length: 1 [was ]
Type: PP - Begin: 7 End:8 Length: 1 [for ]
Type: NP - Begin: 8 End:11 Length: 3 [a long time ]
Type: VP - Begin: 11 End:12 Length: 1 [marked ]
Type: PP - Begin: 12 End:13 Length: 1 [by ]
Type: NP - Begin: 13 End:16 Length: 3 [no special incident. ]
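The output suggests that a span's begin index is inclusive and its end index is exclusive, so the length is simply end minus begin. The following is a minimal sketch of that indexing using only the standard library; the class SpanIndexSketch and the method coveredText are hypothetical names chosen for illustration and are not part of OpenNLP:

```java
import java.util.Arrays;

class SpanIndexSketch {

    // Hypothetical helper: joins the tokens in the half-open range [start, end).
    static String coveredText(String[] tokens, int start, int end) {
        return String.join(" ", Arrays.copyOfRange(tokens, start, end));
    }

    public static void main(String[] args) {
        // The tokenized sentence reconstructed from the output above
        String[] sentence = {"The", "voyage", "of", "the", "Abraham", "Lincoln",
            "was", "for", "a", "long", "time", "marked", "by",
            "no", "special", "incident."};

        // The third span reported above: Begin 3, End 6
        int start = 3;
        int end = 6;

        // The end index is exclusive, so the length is end - start
        System.out.println("Length: " + (end - start)
            + " [" + coveredText(sentence, start, end) + "]");
        // prints: Length: 3 [the Abraham Lincoln]
    }
}
```

Because the end index is exclusive, the end of one span equals the beginning of the next when the spans are adjacent, as with the NP ending at 2 and the PP beginning at 2.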

Using the POSDictionary class

A tag dictionary specifies the valid tags for a word. This prevents a tag from being applied inappropriately to a word. In addition, some search algorithms execute faster because they do not have to consider other, less probable tags.
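To make the idea concrete, here is a minimal, self-contained sketch of how a tag dictionary can restrict the candidate tags considered for a word. It does not use the OpenNLP classes; the class TagDictionarySketch, its methods, and the scores are all hypothetical and chosen only for illustration:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class TagDictionarySketch {

    // Hypothetical in-memory dictionary: word -> tags permitted for that word
    private final Map<String, String[]> allowed = new HashMap<>();

    void put(String word, String... tags) {
        allowed.put(word, tags);
    }

    // Returns the permitted tags, or null when the word has no entry
    String[] getTags(String word) {
        return allowed.get(word);
    }

    // Picks the highest-scoring tag, skipping tags the dictionary does not permit.
    // A word without an entry is treated as unrestricted.
    String bestTag(String word, Map<String, Double> scores) {
        String[] permitted = getTags(word);
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            if (permitted != null
                    && !Arrays.asList(permitted).contains(entry.getKey())) {
                continue; // tag not valid for this word, so it is never considered
            }
            if (entry.getValue() > bestScore) {
                best = entry.getKey();
                bestScore = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        TagDictionarySketch dictionary = new TagDictionarySketch();
        dictionary.put("the", "DT");

        // Made-up scores standing in for a model's tag probabilities
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("NN", 0.6);
        scores.put("DT", 0.4);

        System.out.println(dictionary.bestTag("the", scores));    // prints: DT
        System.out.println(dictionary.bestTag("voyage", scores)); // prints: NN
    }
}
```

In this sketch, the entry for "the" overrides the higher-scoring NN tag, while "voyage" has no entry and falls back to the best-scoring tag. Skipping disallowed tags early is also what lets a search consider fewer candidates.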

In this section, we will demonstrate how to:

• Obtain the tag dictionary for a tagger

• Determine what tags a word has

• Change the tags for a word
