The output from the FormUtility object is posted to the form. A ParseHTML
object is then set up to parse the search results.
// Perform the post.
os.write(bos.toByteArray());
// Read the results.
InputStream is = http.getInputStream();
ParseHTML parse = new ParseHTML(is);
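To make the posting step more concrete, the following is a minimal sketch of how a form body could be buffered as bytes before being written to the connection. It does not reproduce FormUtility; the class FormPostSketch, the method encodeForm, and the field names in the usage note are hypothetical and use only the standard library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical stand-in for the buffering that FormUtility performs.
class FormPostSketch {

    /**
     * Encodes name/value pairs in application/x-www-form-urlencoded form
     * and returns the bytes that would be written with os.write(...).
     */
    static byte[] encodeForm(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&'); // separator between pairs
                }
                body.append(URLEncoder.encode(field.getKey(), "UTF-8"));
                body.append('=');
                body.append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not occur.
            throw new IllegalStateException(e);
        }
        // The encoded text is plain ASCII, so any ASCII-compatible charset works here.
        return body.toString().getBytes(StandardCharsets.US_ASCII);
    }
}
```

With a LinkedHashMap holding search=New York and type=s&c (illustrative field names), encodeForm yields the bytes of search=New+York&type=s%26c, which can then be passed to os.write() as in the fragment above.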
Next, the HTML is parsed. The loop reads the input one character at a time. When
read() returns 0, an HTML tag has been located, so the tag is retrieved and examined
to see what it is.
advance(parse, listType, 0);
int ch;
while ((ch = parse.read()) != -1)
{
  if (ch == 0)
  {
    HTMLTag tag = parse.getTag();
If the tag is an <li> tag, then we have found one of the result items. If there was already
data in the buffer, then process it as a valid state or capital.
    if (tag.getName().equalsIgnoreCase("li"))
    {
      if (buffer.length() > 0)
        result.add(buffer.toString());
      buffer.setLength(0);
      capture = true;
Many web sites omit the closing </li> tags; however, if one is present, then
stop capturing text. Process any already captured text as a valid state or capital.
    } else if (tag.getName().equalsIgnoreCase("/li"))
    {
      result.add(buffer.toString());
      buffer.setLength(0);
      capture = false;
If we have reached the end of the list, then there is no more data to parse.
    } else if (tag.getName().equalsIgnoreCase(listTypeEnd))
    {
      result.add(buffer.toString());
      break;
    }
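The tag handling described above can be tried out without the ParseHTML class. The following is a rough, self-contained sketch of the same capture logic. The class ListCaptureSketch, its tiny tag scanner, and the rule that text is appended to the buffer only while capture is true are assumptions, since that part of the loop is not shown in the fragments above. Unlike the original fragments, this sketch trims each item and skips empty buffers in every branch, so a list that does include closing </li> tags does not produce blank entries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the ParseHTML-based loop; not the library's API.
class ListCaptureSketch {

    /** Returns the text of each <li> item in the first list of the given type ("ul" or "ol"). */
    static List<String> extractItems(String html, String listType) {
        List<String> result = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        String listTypeEnd = "/" + listType;
        boolean inList = false;   // plays the role of advance(): nothing counts before the list opens
        boolean capture = false;  // true while inside an item
        int i = 0;
        while (i < html.length()) {
            char ch = html.charAt(i);
            if (ch == '<') {
                int close = html.indexOf('>', i);
                if (close < 0) {
                    break; // malformed trailing tag: stop parsing
                }
                String name = tagName(html.substring(i + 1, close));
                i = close + 1;
                if (!inList) {
                    inList = name.equalsIgnoreCase(listType);
                } else if (name.equalsIgnoreCase("li")) {
                    flush(buffer, result); // a new item also ends the previous one
                    capture = true;
                } else if (name.equalsIgnoreCase("/li")) {
                    flush(buffer, result);
                    capture = false;
                } else if (name.equalsIgnoreCase(listTypeEnd)) {
                    flush(buffer, result);
                    break; // end of the list: no more data to parse
                }
            } else {
                if (capture) {
                    buffer.append(ch);
                }
                i++;
            }
        }
        return result;
    }

    // Adds the trimmed buffer to the results if it is non-empty, then clears it.
    private static void flush(StringBuilder buffer, List<String> result) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            result.add(text);
        }
        buffer.setLength(0);
    }

    // Extracts the tag name, ignoring any attributes that follow it.
    private static String tagName(String inside) {
        String t = inside.trim();
        int end = 0;
        while (end < t.length() && !Character.isWhitespace(t.charAt(end))) {
            end++;
        }
        return t.substring(0, end);
    }
}
```

Calling extractItems("<ul><li>Alabama<li>Alaska</ul>", "ul") returns the two items Alabama and Alaska even though the closing </li> tags are missing, because each new <li> tag and the closing list tag both flush the buffer.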