Specific parsers, such as the Xerces parser that you get with the JDK, define their own features and properties that control and report on the processing of XML documents. A feature is an option in processing XML that is either on or off, so a feature is set as a boolean value, either true or false. A property is a parameter that you set to a particular object value, usually of type String. There are standard SAX2 features and properties that may be common to several parsers, and non-standard features and properties that are parser-specific. Note that although a feature or property may be standard for SAX2, this does not mean that a SAX2 parser necessarily supports it.
Querying and Setting Parser Features
Namespace awareness and validating capability are both features of a parser, and you already know how you tell the parser factory object that you want a parser with these features turned on. In general, each parser feature is identified by a name that is a fully qualified URI, and the standard features for SAX2 parsing have names within the namespace http://xml.org/sax/features/. For example, the feature specifying namespace awareness has the name http://xml.org/sax/features/namespaces. Here are a few of the standard features that are defined for SAX2 parsers:
• namespaces: When true, the parser replaces the prefixes on element and attribute names with the corresponding namespace URIs. If you set this feature to true, the document must have a schema that supports the use of namespaces. All SAX parsers must support this feature.
• namespace-prefixes: When true, the parser reports the original prefixed names and the attributes used for namespace declarations. The default value is false. All SAX parsers must support this feature.
• validation: When true, the parser validates the document and reports any errors. The default value is false.
• external-general-entities: When true, the parser includes external general entities.
• string-interning: When true, all element and attribute names, namespace URIs, and local names use Java string interning so each of these corresponds to a unique object. This feature is always true for the Xerces parser.
• external-parameter-entities: When true, the parser includes external parameter entities and the external DTD subset.
• lexical-handler/parameter-entities: When true, the beginning and end of parameter entities are reported.
You can find a more comprehensive list in the description for the org.xml.sax package that is in the JDK documentation. There are other non-standard features for the Xerces parser. Consult the documentation for the parser on the Apache website for more details. Apart from the namespaces and namespace-prefixes features that all SAX2 parsers are required to implement, there is no set collection of features for a SAX2 parser, so a parser may implement any number of arbitrary features that may or may not be in the list of standard features.
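Because support varies from parser to parser, one practical approach is simply to ask the parser. The following is a minimal sketch, not a definitive implementation: it assumes the default JAXP parser that ships with the JDK, and the class name FeatureProbe and its helper methods are invented for illustration. It builds each feature name from the standard namespace and queries it through the getFeature() method of the XMLReader interface, treating SAXNotRecognizedException and SAXNotSupportedException as "not available".

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper class, named here only for illustration.
public class FeatureProbe {
    // Namespace that holds the names of the standard SAX2 features.
    static final String PREFIX = "http://xml.org/sax/features/";

    // Returns "true" or "false" if the feature can be queried,
    // otherwise a short description of why it cannot.
    static String probe(XMLReader reader, String featureUri) {
        try {
            return String.valueOf(reader.getFeature(featureUri));
        } catch (SAXNotRecognizedException e) {
            return "not recognized";
        } catch (SAXNotSupportedException e) {
            return "recognized, but value not available";
        }
    }

    // Convenience overload that probes a freshly created default parser.
    static String probe(String featureUri) {
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            return probe(reader, featureUri);
        } catch (ParserConfigurationException | SAXException e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "namespaces", "namespace-prefixes", "validation",
            "external-general-entities", "string-interning",
            "external-parameter-entities",
            "lexical-handler/parameter-entities"
        };
        for (String name : names) {
            System.out.println(name + ": " + probe(PREFIX + name));
        }
    }
}
```

The output depends on the parser you have installed, which is the point of the exercise: only namespaces and namespace-prefixes are guaranteed to be recognized.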
You have two ways to query and set features for a parser. You can call the getFeature() and setFeature() methods for the SAXParserFactory object to do this before you create the SAXParser object. The parser that is created then has the features that you set switched on. Alternatively, you can create a SAXParser object using the factory object and then obtain an org.xml.sax.XMLReader object reference from it by calling the getXMLReader() method. You can then call the getFeature() and setFeature() methods for the XMLReader object.
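The two approaches might look like the following sketch. It assumes the default JAXP parser supplied with the JDK; the class name SetFeatures and the method names viaFactory() and viaReader() are made up for this illustration, and checked exceptions are wrapped only to keep the example short.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical demonstration class, named here only for illustration.
public class SetFeatures {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    // First way: set the feature on the factory before creating the parser,
    // then confirm that the parser it creates reports the feature as set.
    static boolean viaFactory() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setFeature(NAMESPACES, true);
            SAXParser parser = factory.newSAXParser();
            return parser.getXMLReader().getFeature(NAMESPACES);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    // Second way: create the parser first, then set and query the feature
    // through the XMLReader reference obtained from it.
    static boolean viaReader() {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            XMLReader reader = parser.getXMLReader();
            reader.setFeature(NAMESPACES, true);
            return reader.getFeature(NAMESPACES);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Set via factory:   " + viaFactory());
        System.out.println("Set via XMLReader: " + viaReader());
    }
}
```

Both setFeature() methods can throw SAXNotRecognizedException or SAXNotSupportedException, which are subclasses of SAXException, so real code would normally handle these separately rather than wrapping them as this sketch does.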
XMLReader is the interface that a concrete SAX2 parser implements to allow features and properties to be set and queried. The principal difference in use between calling the factory object methods