expression isn't quite right, though, because there might be many such characters
in a row. For example, there might be several spaces, dashes, or other punctuation
characters separating two words. We can indicate that a sequence of illegal characters
should also be ignored by putting a plus after the square brackets to indicate “Any
sequence of one or more of these characters”:
[^a-zA-Z']+
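To see what the plus changes, here is a minimal sketch that splits the same text with and without it. It assumes the character class is negated with a caret (any character that is not a letter or an apostrophe), and it uses String.split rather than a Scanner only to keep the comparison short. The class name, method names, and sample text are made up for illustration.

```java
import java.util.Arrays;

public class PlusQuantifierDemo {
    // Splits on each single character that is not a letter or apostrophe.
    public static String[] splitWithoutPlus(String text) {
        return text.split("[^a-zA-Z']");
    }

    // Splits on each run of one or more such characters.
    public static String[] splitWithPlus(String text) {
        return text.split("[^a-zA-Z']+");
    }

    public static void main(String[] args) {
        String text = "well -- don't stop";

        // Without the plus, the run " -- " counts as four separate
        // delimiters, which leaves empty strings between them.
        System.out.println(Arrays.toString(splitWithoutPlus(text)));

        // With the plus, the whole run is treated as one delimiter.
        System.out.println(Arrays.toString(splitWithPlus(text)));
    }
}
```

Without the plus, the result has six elements, three of them empty strings; with the plus, it has only the three words.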
We pass this regular expression as a String to a call on useDelimiter. We can
add this at the beginning of the getWords method:
public static ArrayList<String> getWords(Scanner input) {
    input.useDelimiter("[^a-zA-Z']+");
    ...
}
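The effect of the delimiter on tokenizing can be tried on a small string before running the full program. The sketch below is not the getWords method itself: readTokens is a hypothetical stand-in that only collects tokens in order, and the Scanner reads from a String instead of a file. It assumes the delimiter is the negated character class, so that anything other than a letter or an apostrophe separates words.

```java
import java.util.ArrayList;
import java.util.Scanner;

public class DelimiterDemo {
    // Hypothetical helper: sets the delimiter, then collects every
    // token the Scanner produces, in the order it appears.
    public static ArrayList<String> readTokens(Scanner input) {
        // Any run of characters other than letters and apostrophes
        // is treated as a single separator between words.
        input.useDelimiter("[^a-zA-Z']+");
        ArrayList<String> tokens = new ArrayList<>();
        while (input.hasNext()) {
            tokens.add(input.next());
        }
        return tokens;
    }

    public static void main(String[] args) {
        Scanner input = new Scanner("Hello, world -- it's here!");
        System.out.println(readTokens(input));
    }
}
```

For this input the tokens are Hello, world, it's, and here: the comma, the dashes, and the exclamation mark are all consumed as delimiters, while the apostrophe stays inside it's.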
The following is a complete program that incorporates all of these changes and
includes more extensive commenting:
// This program reads two text files and compares the
// vocabulary used in each.

import java.util.*;
import java.io.*;

public class Vocabulary3 {
    public static void main(String[] args)
            throws FileNotFoundException {
        Scanner console = new Scanner(System.in);
        giveIntro();

        System.out.print("file #1 name? ");
        Scanner in1 = new Scanner(new File(console.nextLine()));
        System.out.print("file #2 name? ");
        Scanner in2 = new Scanner(new File(console.nextLine()));
        System.out.println();

        ArrayList<String> list1 = getWords(in1);
        ArrayList<String> list2 = getWords(in2);
        ArrayList<String> common = getOverlap(list1, list2);

        reportResults(list1, list2, common);
    }