Version 1: Compute Vocabulary
The program we are writing will only be interesting when we compare large input files,
but while we are developing the program it will be easier to use short input files so we
can easily check whether we are getting the right answer. Using short input files also
means that we don't have to worry about execution time. When a program takes a
long time to run on a large input file, it is difficult to know whether it will
ever finish executing. If we develop the program with short input files, we'll
know that it should never take a long time to execute. So if we accidentally
introduce an infinite loop into our program, we'll know right away that the
problem lies in our code, not in the amount of data we have to process.
We'll use the first two stanzas of a popular children's song as our input files. We'll
create a file called test1.txt that contains the following text:
The wheels on the bus go Round and round
Round and round
Round and round.
The wheels on the bus go Round and round
All through the town.
We'll also use a file called test2.txt that contains the following text:
The wipers on the bus go Swish, swish, swish,
Swish, swish, swish,
Swish, swish, swish.
The wipers on the bus go Swish, swish, swish,
All through the town.
We need to open each of these files with a Scanner, so our main method will
begin with the following lines of code:
Scanner in1 = new Scanner(new File("test1.txt"));
Scanner in2 = new Scanner(new File("test2.txt"));
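For these two lines to compile, the program needs to import File and Scanner, and
because constructing a Scanner from a File can throw the checked exception
FileNotFoundException, main has to declare or handle it. The following is a
minimal sketch of that setup. The class name OpenFilesSketch and the helper
method open are hypothetical names introduced here for illustration, not part of
the program in the text.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class OpenFilesSketch {
    // Hypothetical helper (not from the text): wraps the named file in a Scanner.
    // new Scanner(File) throws FileNotFoundException if the file cannot be opened.
    public static Scanner open(String fileName) throws FileNotFoundException {
        return new Scanner(new File(fileName));
    }

    // main declares the checked exception instead of catching it.
    public static void main(String[] args) throws FileNotFoundException {
        Scanner in1 = open("test1.txt");
        Scanner in2 = open("test2.txt");

        // Quick check that both files opened and contain at least one token.
        System.out.println("test1.txt has input: " + in1.hasNext());
        System.out.println("test2.txt has input: " + in2.hasNext());

        in1.close();
        in2.close();
    }
}
```

Running this sketch requires test1.txt and test2.txt to be in the directory from
which the program is started, since the file names are relative paths.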
Then we want to compute the unique vocabulary contained in each file. We can
store this list of words in an ArrayList<String>. The operation will be the same
for each file, so it makes sense to write a single method that we call twice. The
method should take the Scanner as a parameter and return an ArrayList<String>
that contains the vocabulary read from it. So, after opening the files, we can
execute the following code:
ArrayList<String> list1 = getWords(in1);
ArrayList<String> list2 = getWords(in2);
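The text has not yet said how getWords works internally, so the following is only
one possible sketch of such a method. It assumes that a "word" is any
whitespace-separated token, that tokens are compared exactly (so "Round" and
"round" count as different words and punctuation stays attached), and that words
are kept in order of first appearance. The class name GetWordsSketch is
hypothetical.

```java
import java.util.ArrayList;
import java.util.Scanner;

public class GetWordsSketch {
    // Reads every whitespace-separated token from the Scanner and returns
    // the distinct tokens in order of first appearance.
    // Assumption: tokens are compared exactly, so case and attached
    // punctuation (for example "town.") are significant.
    public static ArrayList<String> getWords(Scanner input) {
        ArrayList<String> words = new ArrayList<>();
        while (input.hasNext()) {
            String next = input.next();
            if (!words.contains(next)) {  // skip tokens we have already stored
                words.add(next);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // A Scanner can read from a String, which is handy for small tests.
        Scanner sample = new Scanner("All through the town. All through");
        System.out.println(getWords(sample));  // prints [All, through, the, town.]
    }
}
```

Calling contains on an ArrayList scans the whole list each time, which is fine for
short test files like these but would be slow for large inputs.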
This initial version is meant to be fairly simple, so after we have computed the
vocabulary for each file, we can simply report it:
System.out.println("list1 = " + list1);
System.out.println("list2 = " + list2);
 