// Parse from the URL.
advance(parse, listType, 0);
Next, the HTML is parsed. The loop reads one character at a time. When read() returns 0, an
HTML tag has been located, so retrieve the tag and examine it to determine what it is.
int ch;
while ((ch = parse.read()) != -1)
{
    if (ch == 0)
    {
        HTMLTag tag = parse.getTag();
If the tag is an <li> tag, then we have found one of the result items. Any data already in
the buffer belongs to the previous item, so process it as a valid state or capital, then
clear the buffer and begin capturing text.
        if (tag.getName().equalsIgnoreCase("li"))
        {
            if (buffer.length() > 0)
                processItem(buffer.toString());
            buffer.setLength(0);
            capture = true;
Many web sites omit the closing </li> tag. However, if one is present, then stop
capturing text and process any text already captured as a valid state or capital.
        } else if (tag.getName().equalsIgnoreCase("/li"))
        {
            processItem(buffer.toString());
            buffer.setLength(0);
            capture = false;
If the end of the list has been found, then process any remaining text and stop looking for
states and capitals.
        } else if (tag.getName().equalsIgnoreCase(listTypeEnd))
        {
            processItem(buffer.toString());
            break;
        }
If the character found was a regular character, and not a tag, then append it to the
StringBuilder, but only while text is being captured.
    } else
    {
        if (capture)
            buffer.append((char) ch);
    }
}
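The capture logic above can also be tried out without the parser class used in this recipe. The following is only a minimal sketch under stated assumptions, not the recipe's own code: ListItemSketch, extractItems, and addItem are hypothetical names, the input is a String already held in memory rather than a stream, and tags are assumed to be plain <...> pairs with no comments or scripts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; it does not use the recipe's parser class.
class ListItemSketch {

    // Collects the text of each <li> item until the closing list tag
    // (for example "/ul" or "/ol") is reached.
    static List<String> extractItems(String html, String listTypeEnd) {
        List<String> items = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        boolean capture = false;
        int pos = 0;
        while (pos < html.length()) {
            char ch = html.charAt(pos++);
            if (ch == '<') {
                int close = html.indexOf('>', pos);
                if (close < 0) {
                    break; // unterminated tag: stop scanning
                }
                // Tag name only, without attributes.
                String name = html.substring(pos, close).trim().split("\\s+", 2)[0];
                pos = close + 1;
                if (name.equalsIgnoreCase("li")) {
                    addItem(items, buffer); // previous item had no </li>
                    capture = true;
                } else if (name.equalsIgnoreCase("/li")) {
                    addItem(items, buffer);
                    capture = false;
                } else if (name.equalsIgnoreCase(listTypeEnd)) {
                    break; // end of the list
                }
            } else if (capture) {
                buffer.append(ch);
            }
        }
        addItem(items, buffer); // flush text captured before the loop ended
        return items;
    }

    // Stores the trimmed buffer as an item if it is not empty, then clears it.
    private static void addItem(List<String> items, StringBuilder buffer) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            items.add(text);
        }
        buffer.setLength(0);
    }
}
```

Unlike the recipe, this sketch skips empty items in one shared helper instead of checking the buffer length at each call site. That is a design choice made for the sketch, not something taken from the original code.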
This recipe shows how to access data through an HTTP GET. Many web sites make use
of HTTP GET. In fact, most search engines submit queries from their main page with HTTP GET.
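To illustrate what an HTTP GET request looks like from Java, here is a minimal sketch using only standard JDK classes. HttpGetSketch, buildGetUrl, and fetch are hypothetical names, the example.com address in the comment is a placeholder rather than a site used by this recipe, and URLEncoder.encode with a Charset argument assumes Java 10 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper; not part of the recipe's source code.
class HttpGetSketch {

    // Appends URL-encoded query parameters to a base URL, since a GET
    // request carries its data in the URL itself.
    // Example result: http://www.example.com/search?q=new+york
    static String buildGetUrl(String base, Map<String, String> params) {
        StringBuilder url = new StringBuilder(base);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8))
               .append('=')
               .append(URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return url.toString();
    }

    // Issues the GET request and returns the response body as text.
    // The body is assumed to be UTF-8 encoded.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }
}
```

The text returned by fetch could then be handed to whatever parsing loop is in use. Passing the parameters in a LinkedHashMap keeps them in insertion order in the generated URL.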