advance(parse, "table", tableNum);
Next, we begin reading the HTML tags. We continue until the end of the file is reached, which read signals by returning -1.
int ch;
while ((ch = parse.read()) != -1)
{
if (ch == 0)
{
HTMLTag tag = parse.getTag();
When a <tr> tag is located, a new table row has begun. This means that we must clear out the last table row.
if (tag.getName().equalsIgnoreCase("tr"))
{
list.clear();
capture = false;
buffer.setLength(0);
When a </tr> tag is located, a table row has ended. If any columns have been recorded, then call processTableRow to process the row that has just ended.
} else if (tag.getName().equalsIgnoreCase("/tr"))
{
if (list.size() > 0)
{
processTableRow(list);
list.clear();
}
When a <td> tag is located, a table column is about to begin. If there was any data already being captured for a column, then record it to the list. Set the variable named capture to true so that the text following the <td> tag will be captured.
} else if (tag.getName().equalsIgnoreCase("td"))
{
if (buffer.length() > 0)
list.add(buffer.toString());
buffer.setLength(0);
capture = true;
When a </td> tag is located, a column has just ended. This column should be recorded to the variable list, and capturing should stop.
} else if (tag.getName().equalsIgnoreCase("/td"))
{
list.add(buffer.toString());
buffer.setLength(0);
capture = false;
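The four tag cases covered so far form a small state machine, and it can help to see them run in isolation. The following is only a minimal sketch, not the recipe itself: the class RowCaptureSketch, its Token type, and the extractRows method are hypothetical stand-ins for the parser, and the sketch assumes that text is appended to the buffer only while capture is true. Instead of calling processTableRow, it collects each finished row into a list so that the result can be inspected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowCaptureSketch {
    // Hypothetical stand-in for parser output: a token is either a tag
    // name such as "tr" or "/td", or a run of plain text.
    static final class Token {
        final boolean isTag;
        final String value;

        Token(boolean isTag, String value) {
            this.isTag = isTag;
            this.value = value;
        }
    }

    static Token tag(String name) { return new Token(true, name); }

    static Token text(String value) { return new Token(false, value); }

    // Applies the same tr, /tr, td, /td rules described in the text and
    // returns every non-empty row that was completed by a /tr tag.
    static List<List<String>> extractRows(List<Token> tokens) {
        List<List<String>> rows = new ArrayList<>();
        List<String> list = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        boolean capture = false;

        for (Token token : tokens) {
            if (token.isTag) {
                String name = token.value;
                if (name.equalsIgnoreCase("tr")) {
                    // A new row begins: discard anything left over.
                    list.clear();
                    capture = false;
                    buffer.setLength(0);
                } else if (name.equalsIgnoreCase("/tr")) {
                    // A row ends: keep it only if it has columns.
                    if (list.size() > 0) {
                        rows.add(new ArrayList<>(list));
                        list.clear();
                    }
                } else if (name.equalsIgnoreCase("td")) {
                    // A column begins: flush an unclosed previous column.
                    if (buffer.length() > 0)
                        list.add(buffer.toString());
                    buffer.setLength(0);
                    capture = true;
                } else if (name.equalsIgnoreCase("/td")) {
                    // A column ends: record it and stop capturing.
                    list.add(buffer.toString());
                    buffer.setLength(0);
                    capture = false;
                }
            } else if (capture) {
                // Assumption: text is recorded only inside a column.
                buffer.append(token.value);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Token> tokens = Arrays.asList(
            tag("tr"), tag("td"), text("Alabama"), tag("/td"),
            tag("td"), text("AL"), tag("/td"), tag("/tr"));
        System.out.println(extractRows(tokens));
    }
}
```

Note that the check in the td branch also handles a column that was never closed with a </td> tag: its text is flushed to the list when the next <td> tag arrives.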