Searching for the answer
Once we know the type of question, we can use the relations found in the text to answer the
question. To illustrate this process, we will develop the processWhoQuestion method.
This method uses the TypedDependency list to gather the information needed to answer
a "who" question about presidents. Specifically, we need to know which president the
question asks about, identified by the president's ordinal rank.
We will also need a list of presidents to search for relevant information. The
createPresidentList method was developed to perform this task. It reads a file,
PresidentList, containing each president's name, inauguration year, and last year in office.
The file uses the following format and can be downloaded from www.packtpub.com:
George Washington (1789-1797)
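To make the line format concrete, here is a minimal sketch that splits one such line into a name, a start year, and an end year. It uses a regular expression from java.util.regex instead of the tokenizer-based approach the text goes on to describe, and it assumes every line follows the pattern shown above exactly. The class name PresidentLineSketch, the parse method, and the pattern are illustrative assumptions and are not part of the book's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the book: parses "Name (YYYY-YYYY)" lines.
final class PresidentLineSketch {

    // Assumed format: a name, then "(startYear-endYear)" with four-digit years.
    private static final Pattern LINE =
            Pattern.compile("^\\s*(.+?)\\s*\\((\\d{4})-(\\d{4})\\)\\s*$");

    // Returns {name, startYear, endYear}, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] fields = parse("George Washington (1789-1797)");
        // prints "George Washington | 1789 | 1797"
        System.out.println(fields[0] + " | " + fields[1] + " | " + fields[2]);
    }
}
```

Returning null for a non-matching line keeps the sketch short. A real loader would more likely skip or report such lines.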
The following createPresidentList method demonstrates the use of OpenNLP's
SimpleTokenizer class to tokenize each line. A president's name consists of a variable
number of tokens, so the method collects tokens until it reaches the opening parenthesis.
Once the name is determined, the dates are easily extracted:
public List<President> createPresidentList() {
    ArrayList<President> list = new ArrayList<>();
    String line = null;
    try (FileReader reader = new FileReader("PresidentList");
            BufferedReader br = new BufferedReader(reader)) {
        while ((line = br.readLine()) != null) {
            SimpleTokenizer simpleTokenizer = SimpleTokenizer.INSTANCE;
            String[] tokens = simpleTokenizer.tokenize(line);
            String name = "";
            String start = "";
            String end = "";
            int i = 0;
            // Name tokens run up to the opening parenthesis
            while (!"(".equals(tokens[i])) {
                name += tokens[i] + " ";
                i++;
            }
            // Tokens after "(" are: start year, "-", end year
            start = tokens[i + 1];
            end = tokens[i + 3];