This method begins by opening an InputStream to the URL that contains the table.
A ParseHTML object is created to parse this InputStream. The method then loops
over all of the text and tags in the HTML file.
InputStream is = url.openStream();
ParseHTML parse = new ParseHTML(is);
boolean first = true;
int ch;
while ((ch = parse.read()) != -1)
{
  // A return value of 0 indicates that a tag, rather than a text character, was read.
  if (ch == 0)
  {
    HTMLTag tag = parse.getTag();
    if (tag.getName().equalsIgnoreCase("a"))
    {
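When an <a> tag is encountered, the text buffer is cleared, the URL that the link points to (its href attribute) is resolved against the page URL and recorded, and any previously recorded image source is reset.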
      buffer.setLength(0);
      value = tag.getAttributeValue("href");
      URL u = new URL(url, value.toString());
      value = u.toString();
      src = null;
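The two-argument URL constructor in the listing resolves the href against the address of the page, so relative links become absolute. The sketch below shows the same kind of resolution in isolation. It swaps in java.net.URI.resolve in place of the URL constructor, which may differ in edge cases, and the class name ResolveLink and the example.com addresses are hypothetical, chosen only for illustration.

```java
import java.net.URI;

public class ResolveLink {
    // Resolve an href against the address of the page it appeared on.
    // Similar in effect to new URL(url, value) in the listing.
    static String resolve(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Hypothetical page address used only for this illustration.
        String page = "http://www.example.com/states/page1.html";
        System.out.println(resolve(page, "page2.html"));        // relative to the page's directory
        System.out.println(resolve(page, "/images/flag.png"));  // relative to the site root
        System.out.println(resolve(page, "http://www.example.org/other.html")); // already absolute
    }
}
```

Resolving every href as soon as it is read means later code can compare or fetch the link without needing to know which page it came from.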
If an <img> tag is encountered, the src attribute is recorded.
    } else if (tag.getName().equalsIgnoreCase("img"))
    {
      src = tag.getAttributeValue("src");
When an ending </a> tag is found, we need to check the text of the link. If the text of
the link is “[Next 5]”, then we have found the link to the next page.
    } else if (tag.getName().equalsIgnoreCase("/a"))
    {
      if (buffer.toString().equalsIgnoreCase("[Next 5]"))
      {
If the link to the next page has been found, record it so we can return it when this method
is done.
        result = new URL(url, value);
      } else if (src != null)
      {
If this is not the first link on the page, display the link and the flag URL that were found. We do not
process the first link on the page because it is not related to a state. It is the link to the home page.
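The tag handling walked through above can be sketched as a small, self-contained state machine. This is only an illustration under stated assumptions: the Token class is a hypothetical stand-in for the output of ParseHTML and HTMLTag, the file names are invented, and two details that the listing does not show up to this point are assumed, namely that link text is accumulated in the buffer and that the first flag is cleared once the first image link has been skipped.

```java
import java.util.ArrayList;
import java.util.List;

public class LinkScanSketch {
    /** Hypothetical stand-in for parser output: a tag with one attribute, or a run of text. */
    static final class Token {
        final String tag;   // "a", "img", "/a", ...; null when the token is text
        final String value; // href for <a>, src for <img>, or the text itself
        private Token(String tag, String value) { this.tag = tag; this.value = value; }
        static Token ofTag(String name, String attribute) { return new Token(name, attribute); }
        static Token ofText(String s) { return new Token(null, s); }
    }

    /** What one pass over a page yields. */
    static final class Result {
        final List<String> found = new ArrayList<>(); // "href,src" pairs for image links
        String nextPage;                              // href of the "[Next 5]" link, or null
    }

    static Result scan(List<Token> tokens) {
        Result result = new Result();
        StringBuilder buffer = new StringBuilder();
        String href = null;
        String src = null;
        boolean first = true;
        for (Token t : tokens) {
            if (t.tag == null) {
                buffer.append(t.value);   // assumption: link text is collected in the buffer
            } else if (t.tag.equalsIgnoreCase("a")) {
                buffer.setLength(0);      // start collecting the text of a new link
                href = t.value;
                src = null;
            } else if (t.tag.equalsIgnoreCase("img")) {
                src = t.value;
            } else if (t.tag.equalsIgnoreCase("/a")) {
                if (buffer.toString().equalsIgnoreCase("[Next 5]")) {
                    result.nextPage = href;
                } else if (src != null) {
                    if (!first) {
                        result.found.add(href + "," + src);
                    }
                    first = false;        // assumption: only the first image link is skipped
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Token> page = new ArrayList<>();
        page.add(Token.ofTag("a", "index.html"));      // home page link, skipped
        page.add(Token.ofTag("img", "logo.png"));
        page.add(Token.ofTag("/a", null));
        page.add(Token.ofTag("a", "alabama.html"));    // a state link with its flag image
        page.add(Token.ofTag("img", "al.png"));
        page.add(Token.ofTag("/a", null));
        page.add(Token.ofTag("a", "page2.html"));      // link to the next page
        page.add(Token.ofText("[Next 5]"));
        page.add(Token.ofTag("/a", null));

        Result r = scan(page);
        System.out.println("Found: " + r.found);
        System.out.println("Next page: " + r.nextPage);
    }
}
```

Keeping href, src, and the text buffer as running state lets the decision be deferred until the closing </a> tag, when everything inside the link has been seen.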