Using Parser to Extract Relationships - Natural Language Processing with Java


Using the Stanford API

There are several approaches to parsing available in the Stanford NLP API. First, we will

demonstrate a general-purpose parser, the LexicalizedParser class. Then, we will

illustrate how the result of the parser can be displayed using the TreePrint class. This

will be followed by a demonstration of how to determine word dependencies using the

GrammaticalStructure class.

Using the LexicalizedParser class

The LexicalizedParser class is a lexicalized PCFG parser. It can use various models

to perform the parsing process. The apply method takes a List of CoreLabel

objects and returns a parse tree.
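
To make the term "PCFG" concrete, here is a minimal sketch of how a probabilistic context-free grammar scores a derivation: each rule has a probability, and a derivation's probability is the product of the probabilities of the rules it uses. The ToyPcfg class, its rules, and its probabilities are all made up for illustration and are not part of the Stanford API. A lexicalized PCFG additionally attaches head words to nonterminals, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of PCFG scoring; the rules and probabilities are invented.
class ToyPcfg {
    private final Map<String, Double> ruleProbability = new LinkedHashMap<>();

    void addRule(String rule, double probability) {
        ruleProbability.put(rule, probability);
    }

    // A derivation's probability is the product of its rules' probabilities.
    double score(List<String> derivation) {
        double p = 1.0;
        for (String rule : derivation) {
            Double q = ruleProbability.get(rule);
            if (q == null) {
                return 0.0; // a rule outside the grammar makes the derivation impossible
            }
            p *= q;
        }
        return p;
    }

    public static void main(String[] args) {
        ToyPcfg grammar = new ToyPcfg();
        grammar.addRule("S -> NP VP", 1.0);
        grammar.addRule("NP -> DT NN", 0.5);
        grammar.addRule("VP -> VBD", 0.25);
        List<String> derivation =
            Arrays.asList("S -> NP VP", "NP -> DT NN", "VP -> VBD");
        System.out.println(grammar.score(derivation));
    }
}
```

A real parser searches over many candidate derivations and returns the most probable one; the sketch only shows how a single candidate would be scored.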

In the following code sequence, the parser is instantiated using the

englishPCFG.ser.gz model:

String parserModel = ".../models/lexparser/englishPCFG.ser.gz";

LexicalizedParser lexicalizedParser =
    LexicalizedParser.loadModel(parserModel);

The List instance of CoreLabel objects is created using the Sentence class's

toCoreLabelList method. Each CoreLabel object holds a word along with other

information, but at this point the words carry no tags or labels. Because the words are

supplied as separate array elements, they have effectively already been tokenized.

String[] sentenceArray = {"The", "cow", "jumped", "over",
    "the", "moon", "."};

List<CoreLabel> words =
    Sentence.toCoreLabelList(sentenceArray);

The apply method can now be invoked:

Tree parseTree = lexicalizedParser.apply(words);

One simple way to display the result of the parse is the pennPrint method,

which prints the parse tree in the bracketed format used by the Penn Treebank.
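
For readers unfamiliar with the Penn Treebank notation, the following sketch shows the general idea of a bracketed tree: each node is written as an opening parenthesis, its label, its children, and a closing parenthesis. ToyTree is a hypothetical stand-in written for this illustration; it is not the Stanford Tree class, and it does not reproduce the exact indentation that pennPrint produces. The small tree is built by hand rather than taken from parser output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse tree node; NOT the Stanford Tree class.
class ToyTree {
    final String label;
    final List<ToyTree> children = new ArrayList<>();

    ToyTree(String label, ToyTree... kids) {
        this.label = label;
        for (ToyTree kid : kids) {
            children.add(kid);
        }
    }

    boolean isLeaf() {
        return children.isEmpty();
    }

    // Renders the tree on one line in bracketed form: (LABEL child child ...)
    String toBracketed() {
        if (isLeaf()) {
            return label; // leaves are the words themselves
        }
        StringBuilder sb = new StringBuilder("(").append(label);
        for (ToyTree child : children) {
            sb.append(' ').append(child.toBracketed());
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // Hand-built noun phrase using the Penn Treebank tags DT and NN.
        ToyTree np = new ToyTree("NP",
            new ToyTree("DT", new ToyTree("The")),
            new ToyTree("NN", new ToyTree("cow")));
        System.out.println(np.toBracketed()); // prints (NP (DT The) (NN cow))
    }
}
```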
