StopWords stopWords = new StopWords("stop-words_english_2_en.txt");
for (int i = 0; i < sentences.length; i++) {
    sentences[i] = stopWords.removeStopWords(sentences[i]);
}
The text has now been processed. The next step is to create an index-like data structure
based on the processed text. This structure uses the Word and Positions classes.
The Word class consists of fields for the word and an ArrayList of Positions objects.
Since a word may appear more than once in a document, the list records each of its
positions within the document. This class is defined as shown here:
import java.util.ArrayList;

public class Word {
    private String word;
    // One entry per occurrence of the word in the document
    private final ArrayList<Positions> positions;

    public Word() {
        this.positions = new ArrayList<>();
    }

    // Sets the word and records one more occurrence of it
    public void addWord(String word, int sentence, int position) {
        this.word = word;
        Positions counts = new Positions(sentence, position);
        positions.add(counts);
    }

    public ArrayList<Positions> getPositions() {
        return positions;
    }

    public String getWord() {
        return word;
    }
}
The Positions class contains a field for the sentence number, sentence, and for the
position of the word within the sentence, position. The class definition is as follows:
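The listing itself is not reproduced in this excerpt. As a minimal sketch consistent with the description above, Positions might look like the class below. The field names and the two-argument constructor come from the text; the getter names, the IndexSketch class, its buildIndex helper, the whitespace tokenization, and the zero-based numbering are assumptions made only for illustration. Word is repeated in condensed form so that the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of Positions: field names follow the text; the getters are assumed.
class Positions {
    private final int sentence;   // sentence number (zero-based here, an assumption)
    private final int position;   // position of the word within that sentence

    Positions(int sentence, int position) {
        this.sentence = sentence;
        this.position = position;
    }

    int getSentence() { return sentence; }
    int getPosition() { return position; }
}

// Condensed copy of the Word class shown earlier, so this sketch is self-contained.
class Word {
    private String word;
    private final ArrayList<Positions> positions = new ArrayList<>();

    void addWord(String word, int sentence, int position) {
        this.word = word;
        positions.add(new Positions(sentence, position));
    }

    ArrayList<Positions> getPositions() { return positions; }
    String getWord() { return word; }
}

public class IndexSketch {
    // Hypothetical helper: maps each distinct word to a Word holding all of its positions.
    static Map<String, Word> buildIndex(String[] sentences) {
        Map<String, Word> index = new LinkedHashMap<>();
        for (int s = 0; s < sentences.length; s++) {
            // Assumes words are separated by whitespace after stop-word removal.
            String[] tokens = sentences[s].trim().split("\\s+");
            for (int p = 0; p < tokens.length; p++) {
                if (tokens[p].isEmpty()) {
                    continue; // an empty sentence yields a single empty token
                }
                index.computeIfAbsent(tokens[p], k -> new Word())
                     .addWord(tokens[p], s, p);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        String[] sentences = { "cat sat mat", "dog chased cat" };
        Map<String, Word> index = buildIndex(sentences);
        for (Positions pos : index.get("cat").getPositions()) {
            System.out.println("cat: sentence " + pos.getSentence()
                    + ", position " + pos.getPosition());
        }
    }
}
```

Keying the map by the word string keeps one Word object per distinct word, so each further occurrence only appends another Positions entry to that word's list.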