Parsing the Choice List
We want to extract both the state abbreviation and the state name from each entry in the
list. The process method does this work. It begins by defining the variables needed to
parse the choice list: an InputStream is opened to the URL being parsed, a new ParseHTML
object is constructed around that stream, and a StringBuilder is created to collect text.
String value = "";
InputStream is = url.openStream();
ParseHTML parse = new ParseHTML(is);
StringBuilder buffer = new StringBuilder();
There may be more than one choice list on the page that we are parsing. Each choice
list will be surrounded by a beginning <select> tag, and an ending </select> tag.
If there is more than one <select> list, then we must advance to the correct one. This is
what the advance function does.
The advance function takes three parameters. The first is the parse object that is being
used to parse the HTML. This object will be advanced to the correct location. The second
parameter is the name of the tag that we are advancing to. In this case we are advancing to
a “select” tag. Finally, the third parameter tells the advance function which instance of
that tag to look for. Zero specifies the first instance, one specifies the second
instance, and so on.
advance(parse, "select", optionList);
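The idea of skipping ahead to the n-th instance of a tag can be sketched without the ParseHTML class. The example below is only an illustration under stated assumptions: it works on a plain String rather than a parse object, it uses a regular expression to find tags, and the class name AdvanceSketch and its return convention (the index just past the matching tag, or -1) are hypothetical and do not come from the text above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AdvanceSketch {
    // Captures the name of any opening or closing tag, e.g. "select" or "/select".
    private static final Pattern TAG =
        Pattern.compile("<\\s*(/?[A-Za-z][A-Za-z0-9]*)[^>]*>");

    /**
     * Hypothetical stand-in for advance: returns the index just past the
     * requested instance of tagName (0 = first, 1 = second, ...),
     * or -1 if the document holds fewer instances than that.
     */
    static int advance(String html, String tagName, int instance) {
        Matcher m = TAG.matcher(html);
        int seen = 0; // matching tags passed so far
        while (m.find()) {
            if (m.group(1).equalsIgnoreCase(tagName)) {
                if (seen == instance) {
                    return m.end();
                }
                seen++;
            }
        }
        return -1;
    }
}
```

With two choice lists on a page, passing 1 as the last argument would position the caller just inside the second <select> tag, ready to look for <option> tags.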
Once we have advanced to the correct location, it is time to begin parsing for <option>
tags. We use a while loop that reads data from the parse object until the end of the
stream. Whenever the read function returns zero, we know that we have found an HTML tag.
int ch;
while ((ch = parse.read()) != -1)
{
  if (ch == 0)
  {
    HTMLTag tag = parse.getTag();
First, we check to see if it is an opening <option> tag. If it is, then we read the
value attribute. This attribute will hold the abbreviation for that state.
    if (tag.getName().equalsIgnoreCase("option"))
    {
      value = tag.getAttributeValue("value");
      buffer.setLength(0);
Next, we check to see if the tag encountered is an ending </option> tag. If it is, then
we have found one state. The processOption method is called to display that state as
part of the comma-separated list, which is the output from this recipe.
    } else if (tag.getName().equalsIgnoreCase("/option"))
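The overall pattern described in this section (reset a buffer at each opening <option> tag, collect the text that follows, and emit one entry at the closing </option> tag) can be sketched in a self-contained form. This is not the recipe's code: it scans a plain String instead of using ParseHTML, and the class OptionListSketch, the helper attribute, and the "name,value" output format are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionListSketch {
    /** Returns one "name,value" entry per <option> element (format is an assumption). */
    static List<String> extract(String html) {
        List<String> rows = new ArrayList<>();
        StringBuilder buffer = new StringBuilder(); // text seen since the last <option>
        String value = "";
        int i = 0;
        while (i < html.length()) {
            char ch = html.charAt(i);
            if (ch == '<') {
                int close = html.indexOf('>', i);
                if (close < 0) {
                    break; // unterminated tag: stop scanning
                }
                String tag = html.substring(i + 1, close).trim();
                String name = tag.split("\\s+", 2)[0];
                if (name.equalsIgnoreCase("option")) {
                    // Opening tag: remember the abbreviation and start a fresh buffer.
                    value = attribute(tag, "value");
                    buffer.setLength(0);
                } else if (name.equalsIgnoreCase("/option")) {
                    // Closing tag: the buffer now holds the state name.
                    rows.add(buffer.toString().trim() + "," + value);
                }
                i = close + 1;
            } else {
                buffer.append(ch);
                i++;
            }
        }
        return rows;
    }

    /** Hypothetical helper: reads a double-quoted attribute, or "" if absent. */
    static String attribute(String tag, String attr) {
        Matcher m = Pattern.compile(
            "\\b" + Pattern.quote(attr) + "\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE).matcher(tag);
        return m.find() ? m.group(1) : "";
    }
}
```

The sketch only handles double-quoted attribute values and does not decode HTML entities. A real HTML parser would take care of both.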