Creating a dictionary from a file
If we need to create a new dictionary, one approach is to create an XML file containing
all of the words and their tags, and then create the dictionary from that file. OpenNLP
supports this approach with the POSDictionary class's static create method.
The XML file consists of the dictionary root element, which contains a series of entry
elements. Each entry element uses its tags attribute to list the tags for the word,
separated by spaces. The word itself is contained within the entry element as a token
element. A simple example using two words, stored in the file dictionary.txt, is as follows:
<dictionary case_sensitive="false">
    <entry tags="JJ VB">
        <token>strong</token>
    </entry>
    <entry tags="NN VBP VB">
        <token>force</token>
    </entry>
</dictionary>
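When the word list already lives in a program, a file in this layout can be generated rather than typed by hand. The following is a minimal sketch using only the JDK; the class DictionaryXmlSketch and its escape and toXml methods are hypothetical helpers introduced here for illustration and are not part of OpenNLP. It assumes each word maps to a single space-separated tag string, as in the example above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not an OpenNLP class): renders a word-to-tags map
// in the dictionary/entry/token layout described above.
class DictionaryXmlSketch {

    // Escapes the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Emits one entry element per word; the map value is the space-separated tag list.
    static String toXml(Map<String, String> wordToTags, boolean caseSensitive) {
        StringBuilder xml = new StringBuilder();
        xml.append("<dictionary case_sensitive=\"").append(caseSensitive).append("\">\n");
        for (Map.Entry<String, String> e : wordToTags.entrySet()) {
            xml.append("    <entry tags=\"").append(escape(e.getValue())).append("\">\n");
            xml.append("        <token>").append(escape(e.getKey())).append("</token>\n");
            xml.append("    </entry>\n");
        }
        xml.append("</dictionary>\n");
        return xml.toString();
    }

    public static void main(String[] args) throws IOException {
        // LinkedHashMap keeps the entries in insertion order.
        Map<String, String> words = new LinkedHashMap<>();
        words.put("strong", "JJ VB");
        words.put("force", "NN VBP VB");

        String xml = toXml(words, false);
        // Write the file under the name used in the text.
        Files.write(Paths.get("dictionary.txt"), xml.getBytes(StandardCharsets.UTF_8));
        System.out.print(xml);
    }
}
```

Running main writes dictionary.txt to the working directory with the same two entries as the hand-written file. Building the text with a StringBuilder keeps the sketch short; for larger or untrusted word lists, a real XML writer such as the JDK's XMLStreamWriter would be a safer choice.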
To create the dictionary, we open an input stream on the file and pass it to the create
method, as shown here:
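import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import opennlp.tools.postag.POSDictionary;

// try-with-resources closes the stream automatically
try (InputStream dictionaryIn =
        new FileInputStream(new File("dictionary.txt"))) {
    POSDictionary dictionary =
        POSDictionary.create(dictionaryIn);
} catch (IOException e) {
    // Handle exceptions
}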
The POSDictionary class has an iterator method that returns an iterator object. Its
next method returns a string for each word in the dictionary. We can use these methods
to display the contents of the dictionary, as shown here:
Iterator<String> iterator = dictionary.iterator();
while (iterator.hasNext()) {
    String entry = iterator.next();
    // Display each word held in the dictionary
    System.out.println(entry);
}