DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true);
builderFactory.setValidating(true);
If you add the statements that call setNamespaceAware() and setValidating() to the example, the newDocumentBuilder() method for the factory object should now return a validating, namespace-aware parser. With a validating parser, you should define an ErrorHandler object that will deal with parsing errors. You identify the ErrorHandler object to the parser by calling the setErrorHandler() method for the DocumentBuilder object:
builder.setErrorHandler(handler);
Here handler refers to an object that implements the three methods declared in the org.xml.sax.ErrorHandler interface: warning(), error(), and fatalError(). We discussed these in the previous chapter in the context of SAX parser error handling, and the same applies here. If you do create a validating parser, you should implement and register an ErrorHandler object. Otherwise, validation errors may go unreported or be handled in whatever way the parser implementation chooses by default.
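As a minimal sketch of the registration step described above, the following class creates a validating, namespace-aware parser and registers a handler with it. The class names ErrorHandlerSketch and ReportingHandler and the method createValidatingBuilder() are hypothetical names chosen for illustration, and the policy of printing warnings while rethrowing errors is just one reasonable choice, not the only one.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class ErrorHandlerSketch {

  // Hypothetical handler: prints warnings, rethrows errors and fatal errors
  static class ReportingHandler implements ErrorHandler {
    @Override
    public void warning(SAXParseException e) {
      System.err.println("Warning at line " + e.getLineNumber() + ": " + e.getMessage());
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
      throw e;  // Treat validity errors as failures (an assumption of this sketch)
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
      throw e;  // The document is not well-formed, so parsing cannot continue
    }
  }

  // Creates a validating, namespace-aware parser with the handler registered
  static DocumentBuilder createValidatingBuilder() throws ParserConfigurationException {
    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    builderFactory.setNamespaceAware(true);
    builderFactory.setValidating(true);
    DocumentBuilder builder = builderFactory.newDocumentBuilder();
    builder.setErrorHandler(new ReportingHandler());
    return builder;
  }

  public static void main(String[] args) throws ParserConfigurationException {
    DocumentBuilder builder = createValidatingBuilder();
    System.out.println("Validating: " + builder.isValidating());
    System.out.println("Namespace aware: " + builder.isNamespaceAware());
  }
}
```

Rethrowing from error() stops the parse at the first validity error. If you would rather collect all the errors in a document, the handler could record them in a list instead.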
The factory object has methods to check the status of parser features corresponding to each of the setXXX() methods above. The checking methods all have corresponding names of the form isXXX(), so to check whether a parser will be namespace-aware, you call the isNamespaceAware() method. Each method returns true if the parser to be created will have the feature set, and false otherwise.
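The checking methods can be tried out with a few lines of code. This is only a sketch: the class name FactoryFeatureCheck and the helper method describe() are hypothetical names introduced for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;

public class FactoryFeatureCheck {

  // Hypothetical helper: summarizes the features that parsers from this factory will have
  static String describe(DocumentBuilderFactory builderFactory) {
    return "namespaceAware=" + builderFactory.isNamespaceAware()
         + ", validating=" + builderFactory.isValidating();
  }

  public static void main(String[] args) {
    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    System.out.println("Before: " + describe(builderFactory));

    builderFactory.setNamespaceAware(true);
    builderFactory.setValidating(true);
    System.out.println("After:  " + describe(builderFactory));
  }
}
```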
Parsing a Document
Once you have created a DocumentBuilder object, you just call its parse() method with a document source as an argument to parse a document. The parse() method returns a reference of type Document to an object that encapsulates the entire XML document. The Document interface is defined in the org.w3c.dom package.
There are five overloaded versions of the parse() method that provide various options for you to identify the source of the XML document. They all return a reference to a Document object:
parse(File file)
    Parses the document in the file identified by file.

parse(String uri)
    Parses the document at the URI, uri.

parse(InputSource source)
    Parses the document from source.

parse(InputStream stream)
    Parses the document read from the input stream, stream.

parse(InputStream stream, String systemID)
    Parses the document read from the input stream, stream. The second argument, systemID, is used to resolve relative URIs.
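To illustrate two of these overloads without needing a file on disk, the sketch below parses a small document held in a string, once through an InputSource and once through an InputStream. The class name ParseSketch, its method names, and the sample XML are all assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ParseSketch {
  // Sample document used only for illustration
  static final String XML = "<?xml version=\"1.0\"?><circle radius=\"15\"/>";

  // Uses the parse(InputSource source) overload
  static Document parseFromInputSource(String xml)
      throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    return builder.parse(new InputSource(new StringReader(xml)));
  }

  // Uses the parse(InputStream stream) overload
  static Document parseFromStream(String xml)
      throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
    return builder.parse(new ByteArrayInputStream(bytes));
  }

  public static void main(String[] args) throws Exception {
    Document fromSource = parseFromInputSource(XML);
    Document fromStream = parseFromStream(XML);
    // Both calls yield a Document whose root element is <circle>
    System.out.println(fromSource.getDocumentElement().getNodeName());
    System.out.println(fromStream.getDocumentElement().getAttribute("radius"));
  }
}
```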
All five versions of the parse() method can throw three types of exception. An exception of type IllegalArgumentException will be thrown if you pass null to the method for the parameter that identifies the document source. The method will throw an exception of type IOException if any I/O error occurs, and of type SAXException in the event of a parsing error. Both of these last two are checked exceptions, so you must either catch them or declare them in a throws clause. Note that it is a SAXException that can be thrown here. Exceptions of type DOMException only arise when you are navigating the tree for a Document object.
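The three exception types can be seen in a short sketch such as the following. The class name ParseExceptionSketch, the method tryParse(), and the strings it returns are hypothetical and exist only to make each outcome visible. The sketch also catches ParserConfigurationException, which is thrown by newDocumentBuilder() rather than by parse().

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ParseExceptionSketch {

  // Hypothetical helper: parses xml and reports which outcome occurred
  static String tryParse(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      // A null string produces a null InputSource, to show the null-argument case
      InputSource source = (xml == null) ? null : new InputSource(new StringReader(xml));
      builder.parse(source);
      return "parsed";
    } catch (IllegalArgumentException e) {
      return "null source";          // Unchecked: null was passed as the document source
    } catch (SAXException e) {
      return "parsing error";        // Checked: for example, the document is not well-formed
    } catch (IOException e) {
      return "I/O error";            // Checked: the source could not be read
    } catch (ParserConfigurationException e) {
      return "configuration error";  // From newDocumentBuilder(), not from parse()
    }
  }

  public static void main(String[] args) {
    System.out.println(tryParse("<a/>"));        // A well-formed document
    System.out.println(tryParse("<a><b></a>"));  // The <b> element is never closed
    System.out.println(tryParse(null));          // No document source at all
  }
}
```

Because no ErrorHandler is registered in this sketch, the parser implementation may also write its own message to the standard error stream for the malformed document before the SAXException reaches the catch block.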