class Positions {
    int sentence;
    int position;

    Positions(int sentence, int position) {
        this.sentence = sentence;
        this.position = position;
    }
}
To use these classes, we create a HashMap instance to hold position information about each word in the file:
HashMap<String, Word> wordMap = new HashMap<>();
The creation of the Word entries in the map is shown next. Each sentence is tokenized, and then each token is checked to see if it exists in the map. The word is used as the key to the hash map.
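The indexing step described here can be sketched as a small self-contained program. This is an illustration under stated assumptions, not the book's listing: the Word class below is a guessed minimal shape built around the addWord call, String.split stands in for the tokenizer, and WordIndexSketch and buildIndex are hypothetical names. It also uses computeIfAbsent (Java 8 or later) as one possible alternative to checking the map with containsKey before updating an entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Holds one occurrence: which sentence, and which token within it.
class Positions {
    int sentence;
    int position;

    Positions(int sentence, int position) {
        this.sentence = sentence;
        this.position = position;
    }
}

// Assumed minimal shape of Word; the actual class may hold more state.
class Word {
    String word;
    final List<Positions> positions = new ArrayList<>();

    void addWord(String word, int sentence, int position) {
        this.word = word;
        positions.add(new Positions(sentence, position));
    }
}

// Hypothetical wrapper class, introduced only for this sketch.
class WordIndexSketch {
    // Builds a map from each token to every place it occurs.
    static Map<String, Word> buildIndex(String[] sentences) {
        Map<String, Word> wordMap = new HashMap<>();
        for (int sentenceIndex = 0; sentenceIndex < sentences.length; sentenceIndex++) {
            // Stand-in for a whitespace tokenizer: split on runs of whitespace.
            String[] words = sentences[sentenceIndex].trim().split("\\s+");
            for (int wordIndex = 0; wordIndex < words.length; wordIndex++) {
                String newWord = words[wordIndex];
                if (newWord.isEmpty()) {
                    continue; // a blank sentence yields one empty token
                }
                // Fetch the existing entry or create one, then record the position.
                wordMap.computeIfAbsent(newWord, k -> new Word())
                       .addWord(newWord, sentenceIndex, wordIndex);
            }
        }
        return wordMap;
    }
}
```

Because the Word instance stored in the map is updated in place, the sketch needs no separate put call once the entry exists. A call such as WordIndexSketch.buildIndex(new String[] {"the cat sat", "the dog"}) produces four entries, with "the" recorded at position 0 of both sentences.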
The containsKey method determines whether the word has already been added. If it has, the existing Word instance is removed from the map. If the word has not been added before, a new Word instance is created. In either case, the new position information is added to the Word instance, which is then put into the map:
for (int sentenceIndex = 0;
        sentenceIndex < sentences.length; sentenceIndex++) {
    String[] words = WhitespaceTokenizer.INSTANCE.tokenize(
        sentences[sentenceIndex]);
    Word word;
    for (int wordIndex = 0;
            wordIndex < words.length; wordIndex++) {
        String newWord = words[wordIndex];
        if (wordMap.containsKey(newWord)) {
            word = wordMap.remove(newWord);
        } else {
            word = new Word();
        }
        word.addWord(newWord, sentenceIndex, wordIndex);
        wordMap.put(newWord, word);