When a </table> tag is found, the table has ended and parsing is complete.
} else if (tag.getName().equalsIgnoreCase("/table"))
{
break;
}
If we found a regular character rather than an HTML tag, append it to the buffer, but
only if we are currently capturing characters.
} else
{
if (capture)
buffer.append((char) ch);
}
The loop will continue until all cells of the table have been processed.
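The shape of this loop can be illustrated with a small, self-contained sketch. It does not use the recipe's own parser class. Instead it scans a string directly, and the class name TableScanSketch, the scan method, and the handling of the tr and td tags are assumptions made for illustration only. The parts that mirror the text above are the break on </table> and the rule that regular characters are appended to the buffer only while capture is true.

```java
import java.util.ArrayList;
import java.util.List;

class TableScanSketch {
    // Hypothetical helper: returns the cell text of each row, stopping at </table>.
    static List<List<String>> scan(String html) {
        List<List<String>> rows = new ArrayList<>();
        List<String> row = null;
        StringBuilder buffer = new StringBuilder();
        boolean capture = false;
        int i = 0;
        while (i < html.length()) {
            char ch = html.charAt(i);
            int close = (ch == '<') ? html.indexOf('>', i) : -1;
            if (close < 0) {
                // Regular character: keep it only while inside a cell.
                if (capture)
                    buffer.append(ch);
                i++;
                continue;
            }
            // Tag name is the text after '<' up to the first whitespace.
            String name = html.substring(i + 1, close).trim().split("\\s+")[0];
            i = close + 1;
            if (name.equalsIgnoreCase("tr")) {
                row = new ArrayList<>();
            } else if (name.equalsIgnoreCase("td")) {
                buffer.setLength(0);
                capture = true;
            } else if (name.equalsIgnoreCase("/td")) {
                if (row != null)
                    row.add(buffer.toString());
                capture = false;
            } else if (name.equalsIgnoreCase("/tr")) {
                if (row != null)
                    rows.add(row);
                row = null;
            } else if (name.equalsIgnoreCase("/table")) {
                break; // the table has ended, so stop scanning
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String html = "<table><tr><td>a</td><td>b</td></tr></table><td>ignored</td>";
        System.out.println(scan(html)); // prints [[a, b]]
    }
}
```

Because the loop breaks at </table>, the trailing cell in the sample input is never read.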
Parsing a Table Row
The processRow method is called for each row of data that is recorded. This
method simply prints the data in comma-delimited format. The first thing the
processRow method does is create a StringBuilder and begin iterating over
the columns passed to it in the list variable.
StringBuilder result = new StringBuilder();
for (String item : list)
{
Append each recorded column to the StringBuilder, separating columns with commas
and enclosing each column in quotes.
if (result.length() > 0)
result.append(",");
result.append('\"');
result.append(item);
result.append('\"');
}
Finally, display the complete row.
System.out.println(result.toString());
This method is called for all rows in the table.
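Put together, the fragments above amount to something like the following sketch. The class name RowFormatter and the split into a formatRow helper that returns the string are assumptions added here so the result can be checked; the recipe's processRow simply prints the row. Note that, as in the recipe, quotes inside a column value are not escaped.

```java
import java.util.Arrays;
import java.util.List;

class RowFormatter {
    // Builds one comma-delimited line with every column wrapped in double quotes.
    static String formatRow(List<String> list) {
        StringBuilder result = new StringBuilder();
        for (String item : list) {
            // Separate columns with a comma, but not before the first one.
            if (result.length() > 0)
                result.append(",");
            result.append('"');
            result.append(item);
            result.append('"');
        }
        return result.toString();
    }

    // Mirrors the recipe's processRow: display the complete row.
    static void processRow(List<String> list) {
        System.out.println(formatRow(list));
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        processRow(Arrays.asList("Alabama", "Montgomery", "1819"));
        // prints "Alabama","Montgomery","1819"
    }
}
```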
Recipe #6.4: Extracting Data from Hyperlinks
Hyperlinks are very common on web sites; they are what hold the web together.
This recipe will extract the hyperlinks from the following URL: