Java Reference
In-Depth Information
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true);
builderFactory.setValidating(true);
If you add the setNamespaceAware() and setValidating() calls shown above to the example, the
newDocumentBuilder() method for the factory object should now return a validating, namespace-aware
parser. With a validating parser, you should define an ErrorHandler object that will deal with
parsing errors. You identify the ErrorHandler object to the parser by calling the setErrorHandler()
method for the DocumentBuilder object:
builder.setErrorHandler(handler);
Here handler refers to an object that implements the three methods declared in the
org.xml.sax.ErrorHandler interface. We discussed these in the previous chapter in the context of
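SAX parser error handling, and the same applies here. If you do create a validating parser, you should
always implement and register an ErrorHandler object. Otherwise the parser falls back on its own
default error handling, which is implementation-dependent, so validation errors may not be reported
in a way that your program can act on.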
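To make the registration step concrete, here is a minimal sketch of a handler that implements the three ErrorHandler methods and simply records what the parser reports. The class names ErrorHandlerSketch and CollectingHandler, and the two small test documents with their internal DTD, are assumptions made up for illustration; they are not part of the example in the text.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; all names here are illustrative.
class ErrorHandlerSketch {

    // A document whose content satisfies its internal DTD.
    static final String VALID =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE note [ <!ELEMENT note (#PCDATA)> ]>\n"
      + "<note>hello</note>";

    // Same DTD, but <extra/> is not declared and is not allowed inside <note>.
    static final String INVALID =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE note [ <!ELEMENT note (#PCDATA)> ]>\n"
      + "<note><extra/></note>";

    // Implements the three ErrorHandler methods and records each report.
    static class CollectingHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            messages.add("warning at line " + e.getLineNumber() + ": " + e.getMessage());
        }

        @Override
        public void error(SAXParseException e) {
            // Validity errors are reported here; parsing continues afterwards.
            messages.add("error at line " + e.getLineNumber() + ": " + e.getMessage());
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            // The document is not well formed, so stop parsing by rethrowing.
            messages.add("fatal error at line " + e.getLineNumber() + ": " + e.getMessage());
            throw e;
        }
    }

    // Parses xml with a validating, namespace-aware parser and returns the reports.
    static List<String> validate(String xml) {
        CollectingHandler handler = new CollectingHandler();
        try {
            DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
            builderFactory.setNamespaceAware(true);
            builderFactory.setValidating(true);
            DocumentBuilder builder = builderFactory.newDocumentBuilder();
            builder.setErrorHandler(handler);
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // A fatal error was already recorded by fatalError() before it was rethrown.
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println("valid document:   " + validate(VALID));
        System.out.println("invalid document: " + validate(INVALID));
    }
}
```

Collecting the messages in a list, rather than printing them inside the handler, is just one possible design; it lets the calling code decide what to do once parsing has finished.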
The factory object has methods to check the status of parser features corresponding to each of the
setXXX() methods above. The checking methods all have corresponding names of the form isXXX(), so to
check whether a parser will be namespace-aware, you call the isNamespaceAware() method. Each method
returns true if the parser to be created will have the feature set, and false otherwise.
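As a quick illustration of the checking methods, the sketch below reports the two features used in this section before and after they are set. The class name FeatureCheckSketch and the helper method describe() are hypothetical names introduced only for this example.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical demo class; the name and the describe() helper are illustrative.
class FeatureCheckSketch {

    // Reports the state of the two features discussed in this section.
    static String describe(DocumentBuilderFactory builderFactory) {
        return "namespaceAware=" + builderFactory.isNamespaceAware()
             + ", validating=" + builderFactory.isValidating();
    }

    public static void main(String[] args) {
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        System.out.println("Before: " + describe(builderFactory));

        builderFactory.setNamespaceAware(true);
        builderFactory.setValidating(true);
        System.out.println("After:  " + describe(builderFactory));
    }
}
```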
Parsing a Document
Once you have created a DocumentBuilder object, you just call its parse() method with a
document source as an argument to parse a document. The parse() method will return a reference of
type Document to an object that encapsulates the entire XML document. The Document interface is
defined in the org.w3c.dom package.
There are five overloaded versions of the parse() method that provide various options for you to
identify the source of the XML document. They all return a reference to a Document object:
parse(File file)
Parses the document in the file identified by file.
parse(String uri)
Parses the document at the URI, uri.
parse(InputSource source)
Parses the document from source.
parse(InputStream stream)
Parses the document read from the input stream, stream.
parse(InputStream stream, String systemID)
Parses the document read from the input stream, stream. The second argument, systemID, is used to
resolve relative URIs.
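The five versions listed above can be tried side by side. The following is a minimal sketch that feeds the same small document to each overload in turn and records the name of the root element each time. The class name ParseOverloadsSketch, the circle document, and the temporary file are assumptions chosen for the illustration, and the parser here is non-validating to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; names and the sample document are illustrative.
class ParseOverloadsSketch {

    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><circle radius=\"15\"/>";

    // Returns the tag name of the root element of a parsed document.
    static String root(Document doc) {
        return doc.getDocumentElement().getTagName();
    }

    // Parses the same document through each of the five parse() overloads.
    static List<String> rootNamesFromEachSource() {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            byte[] bytes = XML.getBytes(StandardCharsets.UTF_8);

            // Write the document to a temporary file for the File and URI versions.
            File file = File.createTempFile("circle", ".xml");
            file.deleteOnExit();
            Files.write(file.toPath(), bytes);
            String fileUri = file.toURI().toString();

            List<String> names = new ArrayList<>();
            names.add(root(builder.parse(file)));                                   // parse(File)
            names.add(root(builder.parse(fileUri)));                                // parse(String uri)
            names.add(root(builder.parse(new InputSource(new StringReader(XML))))); // parse(InputSource)
            names.add(root(builder.parse(new ByteArrayInputStream(bytes))));        // parse(InputStream)
            // The system ID only matters if the document contains relative URIs.
            names.add(root(builder.parse(new ByteArrayInputStream(bytes), fileUri)));
            return names;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootNamesFromEachSource());
    }
}
```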
All five versions of the parse() method can throw three types of exception. An exception of type
IllegalArgumentException will be thrown if you pass null to the method for the parameter that
identifies the document source. The method will throw an exception of type IOException if any I/O
error occurs, and of type SAXException in the event of a parsing error. These last two are checked
exceptions, so you must either catch them or declare them in a throws clause. Note that it is a
SAXException that can be thrown here, even though you are using a DOM parser. Exceptions of type
DOMException only arise when you are navigating the tree for a Document object.
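The exceptions described above can be seen in a short sketch that calls parse() and reports which kind of exception, if any, was thrown. The class name ParseExceptionsSketch and the method outcome() are hypothetical, as are the sample inputs. Note that when no ErrorHandler is registered, the parser supplied with the JDK may also write a message to the standard error stream for the malformed document.

```java
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; names and sample inputs are illustrative.
class ParseExceptionsSketch {

    // Parses source and reports either the root element or the exception type.
    static String outcome(InputSource source) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(source);
            return "parsed <" + doc.getDocumentElement().getTagName() + ">";
        } catch (IllegalArgumentException e) {
            return "IllegalArgumentException";      // the source argument was null
        } catch (SAXException e) {
            return "SAXException";                  // the document could not be parsed
        } catch (IOException e) {
            return "IOException";                   // the source could not be read
        } catch (ParserConfigurationException e) {
            return "ParserConfigurationException";  // the factory could not create a parser
        }
    }

    // Builds a source that refers to a file that should not exist.
    static InputSource missingFile() {
        File missing = new File(System.getProperty("java.io.tmpdir"),
                                "no-such-document-" + System.nanoTime() + ".xml");
        return new InputSource(missing.toURI().toString());
    }

    public static void main(String[] args) {
        System.out.println(outcome(new InputSource(new StringReader("<circle/>"))));
        System.out.println(outcome(new InputSource(new StringReader("<circle>"))));
        System.out.println(outcome(missingFile()));
        System.out.println(outcome(null));
    }
}
```

In a real program you would normally do more than return a label in each catch block; the point here is only to show which exception corresponds to which kind of failure.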