HTMLTag tag = parse.getTag();
if (tag.getName().equalsIgnoreCase("a"))
{
  value = tag.getAttributeValue("href");
  URL u = new URL(url, value.toString());
  value = u.toString();
  buffer.setLength(0);
When the closing </a> tag is found, the link text collected in the buffer and the href value are both displayed.
} else if (tag.getName().equalsIgnoreCase("/a"))
{
  processOption(buffer.toString(), value);
}
If we find a regular character rather than an HTML tag, we add it to the buffer.
} else
{
  buffer.append((char) ch);
}
This loop continues until all links in the file have been processed.
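The loop described above can be approximated in a self-contained form. The following is only a minimal sketch and does not use the HTMLTag or parse classes from this recipe: the class name LinkSketch, its extract and attribute methods, and the www.example.com address in the usage note are all hypothetical. It assumes attribute values are double-quoted and that every href is a syntactically valid URI. The same three cases appear: an opening <a> tag resets the buffer and records the absolute href, a closing </a> tag reports the pair, and any other character is appended to the buffer.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of the link loop; not the recipe's own parser classes. */
final class LinkSketch {

    /** Returns the value of a double-quoted attribute inside a tag, or null. */
    static String attribute(String tag, String name) {
        Matcher m = Pattern.compile(
                "\\s" + Pattern.quote(name) + "\\s*=\\s*\"([^\"]*)\"",
                Pattern.CASE_INSENSITIVE).matcher(tag);
        return m.find() ? m.group(1) : null;
    }

    /** Returns one {text, absoluteHref} pair per anchor found in the HTML. */
    static List<String[]> extract(String html, URI base) {
        List<String[]> links = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        String href = null;
        int i = 0;
        while (i < html.length()) {
            char ch = html.charAt(i);
            int close = (ch == '<') ? html.indexOf('>', i) : -1;
            if (close > i) {
                // A complete tag: take its name (the first token).
                String tag = html.substring(i + 1, close).trim();
                String name = tag.split("\\s+", 2)[0];
                if (name.equalsIgnoreCase("a")) {
                    // Opening anchor: resolve the href and start new text.
                    String value = attribute(tag, "href");
                    href = (value == null) ? null : base.resolve(value).toString();
                    buffer.setLength(0);
                } else if (name.equalsIgnoreCase("/a")) {
                    // Closing anchor: report the collected text and href.
                    if (href != null) {
                        links.add(new String[] {buffer.toString().trim(), href});
                    }
                    href = null;
                }
                i = close + 1;
            } else {
                // A regular character, not a tag: add it to the buffer.
                buffer.append(ch);
                i++;
            }
        }
        return links;
    }
}
```

As a usage example, calling LinkSketch.extract on the fragment <a href="page2.html">Next page</a> with the base http://www.example.com/dir/index.html yields the text "Next page" paired with http://www.example.com/dir/page2.html. Resolving against the base is what turns relative links into absolute ones, which is the job new URL(url, value) performs in the recipe.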
Recipe #6.5: Extracting Images from HTML
Images are very common on web sites. We have already seen how an image can be downloaded
as a binary file. We can also create a bot that examines the <img> tags on a site and
then downloads the images that it finds. This recipe will extract all of the images from the
following URL.
http://www.httprecipes.com/1/6/image.php
You can see this image list in Figure 6.5.
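Before turning to the recipe itself, the general idea can be sketched without the recipe's parser classes. The code below is an illustration under stated assumptions, not the recipe's implementation: the class name ImageSketch and its imageUrls, fileName and download methods are hypothetical, src values are assumed to be double-quoted and valid URIs, and a regular expression stands in for a real HTML parser. It finds each <img> tag, resolves its src against the page address, and derives a local file name from the last path segment.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of image extraction; not the recipe's own code. */
final class ImageSketch {

    // Matches the double-quoted src attribute of an <img> tag.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    /** Returns the absolute address of every image referenced in the HTML. */
    static List<URI> imageUrls(String html, URI base) {
        List<URI> result = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            result.add(base.resolve(m.group(1)));
        }
        return result;
    }

    /** Uses the last path segment of the address as the local file name. */
    static String fileName(URI image) {
        String path = image.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Downloads one image as a binary file into the given directory. */
    static void download(URI image, Path directory) throws IOException {
        try (InputStream in = image.toURL().openStream()) {
            Files.copy(in, directory.resolve(fileName(image)),
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A bot built this way would fetch the page text, pass it to imageUrls together with the page's own address, and call download for each result. The image names used to exercise this sketch are made up and are not taken from the page listed above.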