    private static HashSet<String> stopWords = new HashSet<String>();
    ...
}
Two constructors of the class follow, both of which populate the HashSet:
public StopWords() {
    stopWords.addAll(Arrays.asList(defaultStopWords));
}
public StopWords(String fileName) {
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader bufferedreader =
            new BufferedReader(new FileReader(fileName))) {
        String line;
        // readLine returns null at the end of the file
        while ((line = bufferedreader.readLine()) != null) {
            stopWords.add(line);
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }
}
The convenience method addStopWord allows additional words to be added:
public void addStopWord(String word) {
    stopWords.add(word);
}
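Because stopWords is declared static, every instance of the class shares one set, so words added through one instance are visible through all others. The following is a minimal sketch of that behavior, not the book's class: the name StopWordsSketch, the contents of DEFAULT_STOP_WORDS, and the isStopWord helper are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class StopWordsSketch {
    // Hypothetical default list; the contents of the book's
    // defaultStopWords array are not shown in this section.
    private static final String[] DEFAULT_STOP_WORDS = {"a", "and", "the"};

    // Static, as in the text: one set shared by every instance.
    private static final Set<String> stopWords = new HashSet<>();

    public StopWordsSketch() {
        stopWords.addAll(Arrays.asList(DEFAULT_STOP_WORDS));
    }

    public void addStopWord(String word) {
        stopWords.add(word);
    }

    // Hypothetical helper, added only to inspect the shared set.
    public boolean isStopWord(String word) {
        return stopWords.contains(word);
    }

    public static void main(String[] args) {
        StopWordsSketch first = new StopWordsSketch();
        first.addStopWord("of");

        // A second instance sees the word added through the first one.
        StopWordsSketch second = new StopWordsSketch();
        System.out.println(second.isStopWord("of")); // prints true
    }
}
```

If independent stop word lists are needed, one option is to make the field an instance field instead of a static one.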
The removeStopWords method removes the stopwords from an array of words. It creates an ArrayList to hold the original words passed to the method, and a for loop then removes the stopwords from this list. Within the loop, the contains method determines whether a word is a stopword; if it is, the word is removed from the list. Finally, the ArrayList is converted to an array of strings and returned. This is shown as follows:
public String[] removeStopWords(String[] words) {
    ArrayList<String> tokens =
        new ArrayList<String>(Arrays.asList(words));
    for (int i = 0; i < tokens.size(); i++) {