Setting Parser Properties
As I said at the outset, a property is a parser parameter with a value that is an object, usually a String object. You set the values of some properties to influence the parser's operation, whereas the parser sets the values of others so that you can retrieve them to obtain information about the parsing process.
You can set the properties for a parser by calling the setProperty() method for the SAXParser object after you have created it. The first argument to the method is the name of the property as type String, and the second argument is the value for the property. A property value can be of any class type, as the parameter type is Object, but it is usually of type String. The setProperty() method throws a SAXNotRecognizedException if the property name is not recognized or a SAXNotSupportedException if the property name is recognized but not supported. Both of these exception classes are defined in the org.xml.sax package. Alternatively, you can get and set properties using the XMLReader object reference that you used to set features. The XMLReader interface declares the getProperty() and setProperty() methods with the same signatures as those for the SAXParser object.
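As a rough illustration of the two exceptions, the following sketch tries to set one standard SAX2 property and one made-up property on a SAXParser and reports what happens. The class name PropertyDemo, the helper trySetProperty(), and the example.com property name are hypothetical, and DefaultHandler2 is used only because it is a convenient ready-made value for the lexical-handler property. Whether a given property is accepted depends on the parser implementation you are using.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.ext.DefaultHandler2;

public class PropertyDemo {
    // Standard SAX2 identifier for the property that registers a lexical handler.
    static final String LEXICAL_HANDLER =
            "http://xml.org/sax/properties/lexical-handler";

    // Hypothetical helper: creates a parser, tries to set the property,
    // and describes the outcome as a short string.
    static String trySetProperty(String name, Object value) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.setProperty(name, value);
            return "set";
        } catch (SAXNotRecognizedException e) {
            return "not recognized";   // the parser does not know the name
        } catch (SAXNotSupportedException e) {
            return "not supported";    // name known, but value or state rejected
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create a parser", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("lexical-handler: "
                + trySetProperty(LEXICAL_HANDLER, new DefaultHandler2()));
        // A made-up name that a parser would not be expected to recognize.
        System.out.println("made-up property: "
                + trySetProperty("http://example.com/no-such-property", "x"));
    }
}
```

Catching the two exception types separately lets you distinguish a misspelled or unknown property name from a property that the parser knows about but cannot accept.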
You can also retrieve the values for some properties during parsing to obtain additional information about the most recent parsing event. You use the parser's getProperty() method in this case. The argument to the method is the name of the property, and the method returns a reference to the property's value.
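The sketch below shows the shape of a getProperty() call, assuming a parser that supports the standard lexical-handler property. It sets the property through an XMLReader and then reads it back before any parsing takes place, which is simpler to demonstrate than a property that is only available during parsing. The class name PropertyRoundTrip and the method roundTrips() are hypothetical.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class PropertyRoundTrip {
    // Standard SAX2 identifier for the lexical-handler property.
    static final String LEXICAL_HANDLER =
            "http://xml.org/sax/properties/lexical-handler";

    // Sets a property and reports whether getProperty() hands back
    // a reference to the same object.
    static boolean roundTrips() {
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            Object value = new DefaultHandler2();
            reader.setProperty(LEXICAL_HANDLER, value);
            // getProperty() returns Object, so cast if you need a specific type.
            return reader.getProperty(LEXICAL_HANDLER) == value;
        } catch (ParserConfigurationException | SAXException e) {
            // SAXNotRecognizedException and SAXNotSupportedException
            // are both subclasses of SAXException.
            throw new IllegalStateException("Property not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Same object returned: " + roundTrips());
    }
}
```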
As with features, there is no defined set of parser properties, so you need to consult the parser documentation for information on these. There are four standard properties for a SAX parser, none of which a SAX parser is required to support. Because these properties involve the more advanced features of SAX parser operation, they are beyond the scope of this topic, but if you are interested, they are documented in the description for the org.xml.sax package that you can find in the JDK documentation.
Parsing Documents with SAX
To parse a document using the SAXParser object you simply call its parse() method. You have to supply two arguments: the first identifies the XML document, and the second is a reference of type org.xml.sax.helpers.DefaultHandler to a handler object that you have created to process the contents of the document. The DefaultHandler object must contain a specific set of public methods that the SAXParser object expects to be able to call for each event, where each type of event corresponds to a particular syntactic element it finds in the document.
The DefaultHandler class already contains default definitions, most of which do nothing, for all the callback methods that a SAXParser object supporting SAX2.0 expects to be able to call. Thus, all you have to do is define a class that extends the DefaultHandler class and override the methods in the base class for the events that you are interested in. There is also the org.xml.sax.ext.DefaultHandler2 class that extends the DefaultHandler class. This adds methods to support extensions to SAX2, but I won't be going into these.
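To make the idea of overriding only the callbacks you care about concrete, here is a minimal sketch of a handler that records the name of each element the parser encounters. The class name ElementLister, the helper elementNames(), and the sample XML are hypothetical. The sketch parses from an in-memory InputStream rather than a file simply to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ElementLister extends DefaultHandler {
    private final List<String> names = new ArrayList<>();

    // Called by the parser at the start tag of each element.
    // All other callbacks keep their inherited default behavior.
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {
        names.add(qName);
    }

    // Parses the XML text and returns element names in document order.
    static List<String> elementNames(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementLister handler = new ElementLister();
            InputStream in =
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            parser.parse(in, handler);   // first argument: document, second: handler
            return handler.names;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // prints [circle, position]
        System.out.println(
                elementNames("<circle><position x=\"30\" y=\"50\"/></circle>"));
    }
}
```

Because the handler inherits everything else from DefaultHandler, events such as character data and the end of an element are simply ignored here.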
Let's not gallop too far ahead, though. You need to look into the versions of the parse() method that you have available before you can get into handling parsing events. The SAXParser class defines ten overloaded versions of the parse() method, but you are interested in only five of them. The other five use a deprecated handler type HandlerBase that was applicable to SAX1, so you can ignore those and just look at the versions that relate to SAX2. All versions of the method have a return type of void, and the five varieties of the parse() method that you consider are as follows:
• parse(File file, DefaultHandler handler): Parses the document in the file specified by file using handler as the object containing the callback methods called by the parser. This