Using a Different Parser
You might like to try a different parser at this point. SAX parsers are available from a number of sources, but the Xerces parser produced by the Apache XML Project is free, easy to obtain, and a snap to set up. As well as supporting SAX version 2 (SAX2), it also supports DOM level 2 (DOM2). You can download the latest version, currently Xerces 2, from the download page on the project's web site at http://xml.apache.org. The binaries are distributed in a .zip file that you can unzip to a suitable location on your hard drive; the archive unzips to create its own directory structure. You will find everything you need in there, including documentation.
The simplest way to try out an alternative parser without making it a permanent selection over the default is to include the path to the .jar file that contains the parser implementation in the -classpath option on the command line. For instance, if you have downloaded the Xerces 2 parser from the Apache web site and extracted the file from the zip directly to your C:\ drive, you can run the example with the Xerces parser like this:
java -classpath .;C:\xerces-2_0_0\xercesImpl.jar -enableassertions TrySAX
Don't forget the period in the classpath definition that specifies the current directory. Without it, the TrySAX.class file will not be found. If you omit the -classpath option, the program will revert to using the default parser. Of course, you can use this technique to select a particular parser when you have several installed on your PC. Just add the path to the JAR file for the parser you want to the classpath.
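To confirm which parser a given classpath actually selects, you can ask JAXP for a parser and print the name of its implementation class. The following is a minimal sketch rather than part of the TrySAX example; the class name WhichParser and the method parserClassName() are hypothetical names chosen for illustration.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper class: reports the SAX parser implementation JAXP selects.
public class WhichParser {

    // Returns the fully qualified class name of the parser that
    // SAXParserFactory hands back for the current classpath.
    static String parserClassName() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        return parser.getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        // The name printed depends on which parser JARs are on the classpath.
        System.out.println("SAX parser in use: " + parserClassName());
    }
}
```

If you run this with the same -classpath option as the command above, you would expect the name to come from an org.apache.xerces package; without the option, it should name the default parser for your JDK.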
If you want to make the choice of the Xerces parser more permanent, you can copy the xercesImpl.jar file to the ext directory for the JRE. This will be the jdk1.4\jre\lib\ext directory. A JAR containing a parser in the ext directory will always be found before the default parser.
Updating the Default Parser
The Crimson parser that comes with the JDK is developed independently by the Apache Project, so it is quite possible that there will be newer releases that you might want to install, if only to fix any bugs that have appeared. You can override the default parser by placing a .jar archive containing the Crimson release you want to use in the directory jdk1.4\jre\lib\endorsed. Indeed, you can use this directory to override any of the externally developed packages that are distributed with the SDK.
Parser Features and Properties
Specific parsers such as Xerces define their own features and properties that control and report on the processing of XML documents. A feature is an option that is either on or off, so it is set as a boolean value, either true or false. Namespace awareness and validating capability are both features of a parser. A property is an option with a value that is an object, usually a String object. Some properties have values that you set to influence the parser's operation, while the values for other properties are set by the parser for you to retrieve to provide information about the parsing process.
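As an illustration of the distinction, the sketch below sets a feature and a property on an XMLReader object. It assumes you obtain the XMLReader from a JAXP SAXParser and uses the standard SAX2 identifiers for the namespaces feature, the validation feature, and the lexical-handler property; the class name FeatureDemo and its helper methods are hypothetical. Parser-specific features and properties such as those defined by Xerces are set in the same way, using the identifiers listed in that parser's documentation.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class: sets and queries SAX2 features and properties.
public class FeatureDemo {
    // Standard SAX2 feature and property identifiers
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    static final String LEXICAL_HANDLER =
            "http://xml.org/sax/properties/lexical-handler";

    // Obtains an XMLReader from whichever parser JAXP selects.
    static XMLReader newReader() throws Exception {
        return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    }

    // A feature is a boolean option. Returns false if the parser does not
    // recognize the feature or cannot set it to the requested value.
    static boolean trySetFeature(XMLReader reader, String name, boolean value) {
        try {
            reader.setFeature(name, value);
            return true;
        } catch (SAXException e) {
            // SAXNotRecognizedException and SAXNotSupportedException
            // are both subclasses of SAXException.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = newReader();

        System.out.println("namespaces set: "
                + trySetFeature(reader, NAMESPACES, true));
        System.out.println("validation set: "
                + trySetFeature(reader, VALIDATION, true));
        System.out.println("namespaces is now: " + reader.getFeature(NAMESPACES));

        // A property takes an object as its value; here it is a handler
        // that receives lexical events such as comments.
        reader.setProperty(LEXICAL_HANDLER, new DefaultHandler2());
        System.out.println("lexical handler: "
                + reader.getProperty(LEXICAL_HANDLER).getClass().getName());
    }
}
```

Catching the exception rather than letting it propagate is one design choice among several; it suits code that has to run with more than one parser, because a parser is free to reject a feature or property it does not support.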
You will find details of the features and properties supported by the Xerces 2 parser in the documentation that appears in the /doc directory that was created when you unzipped the Xerces archive.