Next, we extract the state's capital, name, flag, and official site.
String capital = extractNoCase(str,
"Capital:<b></td><td>", "</td>", 0);
String name = extractNoCase(str, "<h1>", "</h1>", 0);
String flag = extractNoCase(str,
"<img src=\"", "\" border=\"1\">", 2);
String site = extractNoCase(str,
"Official Site:<b></td><td><a href=\"", "\"", 0);
The flag is a URL that may be relative to the page, so we use the URL class to resolve it
against the page's URL, u, and obtain a fully qualified URL to the state flag.
URL flagURL = new URL(u, flag);
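To see what the two-argument URL constructor does, the following minimal sketch resolves a few references against a page address. The addresses and the class name ResolveSketch are hypothetical and are not taken from the recipe. Recent JDK releases deprecate the URL constructors, so the compiler may print a deprecation note for this code.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical sketch: shows how new URL(context, spec) resolves a reference.
class ResolveSketch {
    // Resolves ref against pageUrl and returns the fully qualified result.
    static String resolve(String pageUrl, String ref) {
        try {
            URL base = new URL(pageUrl);
            return new URL(base, ref).toString();
        } catch (MalformedURLException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String page = "http://www.example.com/states/al.php"; // hypothetical page

        // A relative reference is resolved against the page's directory.
        System.out.println(resolve(page, "flags/al.png"));
        // A reference starting with "/" is resolved against the host.
        System.out.println(resolve(page, "/img/al.png"));
        // An absolute reference ignores the page address.
        System.out.println(resolve(page, "http://other.example.com/al.png"));
    }
}
```

Whichever form the src attribute takes, the resulting URL can be stored or downloaded without knowing which page it came from.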
Next, store the state's information in a StringBuilder as a comma-delimited
line.
StringBuilder buffer = new StringBuilder();
buffer.append("\"");
buffer.append(code);
buffer.append("\",\"");
buffer.append(name);
buffer.append("\",\"");
buffer.append(capital);
buffer.append("\",\"");
buffer.append(flagURL.toString());
buffer.append("\",\"");
buffer.append(site);
buffer.append("\"");
System.out.println(buffer.toString());
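The appends above write each field between double quotes exactly as extracted. If a field could itself contain a double quote, the line would become ambiguous. The sketch below is one possible variation, not part of the recipe: the helper names quote and toLine are hypothetical, doubling embedded quotes follows the common CSV convention, and writing a null field as an empty string is an assumption.

```java
// Hypothetical sketch: builds one quoted, comma-delimited line.
class CsvLineSketch {
    // Wraps a field in double quotes, doubling any quotes inside it.
    // Assumption: a null field is written as an empty string.
    static String quote(String field) {
        String value = (field == null) ? "" : field;
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Joins the quoted fields with commas.
    static String toLine(String... fields) {
        StringBuilder buffer = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                buffer.append(',');
            }
            buffer.append(quote(fields[i]));
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Sample values are placeholders, not data from the recipe's site.
        System.out.println(toLine("AL", "Alabama", "Montgomery",
            "http://www.example.com/flags/al.png", "http://www.example.com/"));
    }
}
```

For fields without embedded quotes, this produces the same layout as the line built by hand above.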
This method will be called for every sub-page on the system.
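The extractNoCase helper used throughout this method is not shown in this section. The following is a minimal sketch of how such a helper might work, not the recipe's actual implementation. It assumes that the last argument selects which occurrence of the first token to start from (with 0 or 1 meaning the first), and that null is returned when a token is not found. The class name and the sample HTML are hypothetical.

```java
// Hypothetical sketch; the recipe's own extractNoCase may differ.
class ExtractSketch {
    // Finds token in s at or after from, ignoring case; returns -1 if absent.
    static int indexOfNoCase(String s, String token, int from) {
        for (int i = Math.max(from, 0); i + token.length() <= s.length(); i++) {
            if (s.regionMatches(true, i, token, 0, token.length())) {
                return i;
            }
        }
        return -1;
    }

    // Returns the text between token1 and token2, ignoring case.
    // Assumption: count picks the occurrence of token1 (0 or 1 = first).
    static String extractNoCase(String str, String token1, String token2, int count) {
        int start = -1;
        int from = 0;
        int wanted = Math.max(count, 1);
        for (int i = 0; i < wanted; i++) {
            start = indexOfNoCase(str, token1, from);
            if (start == -1) {
                return null; // fewer occurrences than requested
            }
            from = start + 1;
        }
        start += token1.length();
        int end = indexOfNoCase(str, token2, start);
        if (end == -1) {
            return null; // closing token missing
        }
        return str.substring(start, end);
    }

    public static void main(String[] args) {
        // Made-up HTML in the shape the calls above expect.
        String html = "<H1>Alabama</H1><tr><td><b>Capital:<b></td><td>Montgomery</td></tr>";
        System.out.println(extractNoCase(html, "<h1>", "</h1>", 0));
        System.out.println(extractNoCase(html, "Capital:<b></td><td>", "</td>", 0));
    }
}
```

Matching without regard to case means the tokens still match if the page writes its tags in upper case, as the sample H1 tag does.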
Recipe #6.7: Extracting from Partial-Pages
Many web sites make use of partial pages. A partial page presents a list of data, but does
not show all of the data at once. Instead, you are given options to move forward and
backward through the larger list. Search engine results are a perfect example of this.
You can see such a page here:
http://www.httprecipes.com/1/6/partial.php
You can see this list in Figure 6.8.
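Before looking at the real page, the general idea of reading a partial-page list can be sketched with in-memory data. The sketch below is only an illustration under stated assumptions: the Page class, the page keys, and the items are all hypothetical, and a map lookup stands in for downloading and parsing a page. It does not reflect the structure of the page at the address above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: walks "next" links until the list is exhausted.
class PagingSketch {
    // One partial page: some items plus the key of the next page (null at the end).
    static class Page {
        final List<String> items;
        final String next;

        Page(String next, String... items) {
            this.next = next;
            this.items = Arrays.asList(items);
        }
    }

    // Made-up three-page list standing in for a web site.
    static Map<String, Page> demoSite() {
        Map<String, Page> site = new HashMap<>();
        site.put("page1", new Page("page2", "item 1", "item 2"));
        site.put("page2", new Page("page3", "item 3", "item 4"));
        site.put("page3", new Page(null, "item 5"));
        return site;
    }

    // Collects the items from every page, starting at first.
    static List<String> collectAll(Map<String, Page> site, String first) {
        List<String> all = new ArrayList<>();
        Set<String> seen = new HashSet<>(); // guards against link loops
        String key = first;
        while (key != null && seen.add(key)) {
            Page page = site.get(key); // a real bot would download and parse here
            if (page == null) {
                break;
            }
            all.addAll(page.items);
            key = page.next;
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println(collectAll(demoSite(), "page1"));
    }
}
```

The set of visited keys keeps the loop from running forever if a page ever links back to one already read.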