Advanced Text Capabilities - The Definitive Guide to Java Swing

Note: Only HTML tag constants that are not flagged as block tags (those whose isBlock() method returns false) work with the HTMLDocument.Iterator. In the standard JDK implementation, getIterator() returns null when given a block tag. For instance, A and STRONG are not block tags and can be iterated, while H1 is a block tag and cannot.
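To see how a particular tag is classified before asking for an iterator, you can query isBlock() directly. The following is a minimal sketch; the class name BlockTagCheck and the describe() helper are illustrative names, not part of the Swing API.

```java
import javax.swing.text.html.HTML;

public class BlockTagCheck {

    // Hypothetical helper: reports the tag name and its block flag.
    static String describe(HTML.Tag tag) {
        return tag + ": block=" + tag.isBlock();
    }

    public static void main(String[] args) {
        HTML.Tag[] tags = { HTML.Tag.A, HTML.Tag.STRONG, HTML.Tag.H1 };
        for (HTML.Tag tag : tags) {
            System.out.println(describe(tag));
        }
    }
}
```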

After you have the iterator, you can examine the attributes and content of each instance of the tag through the properties shown in Table 16-8.

Table 16-8. HTMLDocument.Iterator Properties

Property Name    Data Type       Access
attributes       AttributeSet    Read-only
endOffset        int             Read-only
startOffset      int             Read-only
tag              HTML.Tag        Read-only
valid            boolean         Read-only

The other piece of the iteration process is the next() method, which advances the iterator to the next instance of the tag in the document. The basic structure for using this iterator is as follows:

    // Get the iterator
    HTMLDocument.Iterator iterator = htmlDoc.getIterator(HTML.Tag.A);
    // For each valid one
    while (iterator.isValid()) {
      // Process element
      // Get the next one
      iterator.next();
    }

This can also be expressed in a basic for loop construct:

    for (HTMLDocument.Iterator iterator = htmlDoc.getIterator(HTML.Tag.A);
         iterator.isValid();
         iterator.next()) {
      // Process element
    }
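As a rough illustration of how the loop and the iterator properties fit together, the sketch below parses an in-memory HTML string and collects the HREF value and offsets of each anchor. The class name AnchorLister, the describeAnchors() method, and the sample markup are assumptions made for this example; it reads from a string rather than a URL to stay self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class AnchorLister {

    // Hypothetical helper: returns one "href @ start-end" string per <A> tag.
    static List<String> describeAnchors(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        try {
            // Reading through the kit parses the text before returning
            kit.read(new StringReader(html), doc, 0);
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("Unable to parse HTML", e);
        }

        List<String> results = new ArrayList<>();
        HTMLDocument.Iterator iterator = doc.getIterator(HTML.Tag.A);
        // getIterator() may return null for tags it does not support
        if (iterator == null) {
            return results;
        }
        for (; iterator.isValid(); iterator.next()) {
            AttributeSet attributes = iterator.getAttributes();
            Object href = attributes.getAttribute(HTML.Attribute.HREF);
            results.add(href + " @ " + iterator.getStartOffset()
                + "-" + iterator.getEndOffset());
        }
        return results;
    }

    public static void main(String[] args) {
        String html = "<html><body><a href=\"a.html\">First</a> and "
            + "<a href=\"b.html\">Second</a></body></html>";
        for (String line : describeAnchors(html)) {
            System.out.println(line);
        }
    }
}
```

The offsets are positions within the document model, so they can be passed to the document's getText() method to recover the text that each anchor encloses.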

Listing 16-6 demonstrates the use of HTMLDocument.Iterator. This program prompts you for a URL from the command line, loads the file synchronously, looks for all the <A> tags, and then displays all the anchors listed as HREF attributes. Think of this as a simple “spidering” application in which you can build up a database of URL links between documents. The start
