Once a complete sentence has been accumulated, that is, once a period has been read, the
sentence is checked for a birth year. Splitting on periods is not the most accurate way to
break up sentences, but it works well enough for this recipe. If a few sentences run on, or
are cut short, the final output of the program is not affected. The program finds many
candidate birth years and then uses a “majority rules” approach to determine the correct one,
so a few years lost in the noise do no harm.
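The “majority rules” idea can be sketched as a simple tally. The recipe's code calls a method named increaseYear, but its body is not shown in this section, so the class below is only an assumption about how such a tally might look. The names YearTally and mostCommonYear are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not taken from the recipe: counts how often each year is seen.
class YearTally {
    private final Map<Integer, Integer> counts = new HashMap<>();

    // Record one more sighting of the given year.
    void increaseYear(int year) {
        counts.merge(year, 1, Integer::sum);
    }

    // Return the year seen most often, or -1 if no year was recorded.
    // If two years tie, which one wins is unspecified in this sketch.
    int mostCommonYear() {
        int bestYear = -1;
        int bestCount = 0;
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                bestCount = entry.getValue();
                bestYear = entry.getKey();
            }
        }
        return bestYear;
    }

    public static void main(String[] args) {
        YearTally tally = new YearTally();
        // Made-up sample data: three pages agree, two are noise.
        for (int year : new int[] {1732, 1732, 1799, 1732, 1731}) {
            tally.increaseYear(year);
        }
        System.out.println("Most common year: " + tally.mostCommonYear());
    }
}
```

Because only the most frequent year is reported, a handful of wrong or missed years changes nothing as long as the correct year still appears most often.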
If a valid birth year is found, it is recorded and the program continues.
      String str = sentence.toString();
      int year = extractBirth(str);
      if ((year > 1) && (year < 3000))
      {
        System.out.println("URL supports year: " + year);
        increaseYear(year);
      }
      sentence.setLength(0);
    } else
      sentence.append((char) ch);
  }
} while (ch != -1);
This process continues until the end of the HTML document is reached.
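The read loop above is shown only in part. As a rough, self-contained sketch of the same idea, the class below reads characters from a Reader and cuts a sentence at every period. It assumes the input is plain text with any HTML tags already removed. The names SentenceSplitter and split are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of period-based sentence splitting; not the recipe's own class.
class SentenceSplitter {
    // Read until end of input, ending a "sentence" each time a period is seen.
    // Text after the last period is discarded in this sketch.
    static List<String> split(Reader in) throws IOException {
        List<String> sentences = new ArrayList<>();
        StringBuilder sentence = new StringBuilder();
        int ch;
        do {
            ch = in.read();
            if (ch == '.') {
                sentences.add(sentence.toString().trim());
                sentence.setLength(0); // start the next sentence
            } else if (ch != -1) {
                sentence.append((char) ch);
            }
        } while (ch != -1);
        return sentences;
    }

    public static void main(String[] args) throws IOException {
        String text = "He was born in 1732. He died in 1799. trailing words";
        for (String s : split(new StringReader(text))) {
            System.out.println("[" + s + "]");
        }
    }
}
```

As in the recipe, an abbreviation or a decimal point ends a sentence early, which is acceptable because the later tally tolerates some noise.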
Extracting a Birth Year
Each “sentence” that is found must be scanned for a birth year. To do this, the sentence
is broken up into “words”, which are groups of characters separated by whitespace.
boolean foundBorn = false;
int result = -1;
StringTokenizer tok = new StringTokenizer(sentence);
while (tok.hasMoreTokens())
{
Each word is first checked to see whether it is a number. If it is, the number is recorded
and parsing of the sentence continues. If a sentence contains more than one number, only
the last one is used.
  String word = tok.nextToken();
  try
  {
    result = Integer.parseInt(word);
  } catch (NumberFormatException e)
  {
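The word scan can be sketched as a complete method. This is a minimal illustration, not the recipe's full extractBirth. The body of the catch block is not shown in this section, so the check for the word “born”, and the rule that a number is returned only when that word is present, are assumptions suggested by the foundBorn flag. The class name BirthYearSketch is hypothetical.

```java
import java.util.StringTokenizer;

// Hypothetical sketch; the handling of "born" is an assumption, not from the excerpt.
class BirthYearSketch {
    // Return the last number in the sentence if the sentence mentions "born", else -1.
    static int extractBirth(String sentence) {
        boolean foundBorn = false;
        int result = -1;
        StringTokenizer tok = new StringTokenizer(sentence);
        while (tok.hasMoreTokens()) {
            String word = tok.nextToken();
            try {
                // A later number overwrites an earlier one, so the last number wins.
                result = Integer.parseInt(word);
            } catch (NumberFormatException e) {
                // Assumption: a non-numeric word only matters if it is "born".
                if (word.equalsIgnoreCase("born")) {
                    foundBorn = true;
                }
            }
        }
        return foundBorn ? result : -1;
    }

    public static void main(String[] args) {
        System.out.println(extractBirth("He was born on February 22 1732")); // prints 1732
        System.out.println(extractBirth("He died in 1799"));                 // prints -1
    }
}
```

A word with attached punctuation, such as "1732,", does not parse as a number in this sketch and is skipped.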