Document xmlDoc = null;
try (BufferedInputStream in =
        new BufferedInputStream(Files.newInputStream(xmlFile))) {
    xmlDoc = builder.parse(in);
} catch (SAXException | IOException e) {
    e.printStackTrace();
    System.exit(1);
}
This creates a Path object for the file and creates an input stream for the file in the try block. Calling
parse() for the builder object with the input stream as the argument parses the XML file and returns it as
a Document object. Note that the Document object encapsulates the entire contents of the XML file, so
parsing a large file can require a lot of memory.
To compile this code you need import statements for the BufferedInputStream and IOException
names in the java.io package, and the Paths, Path, and Files names in the java.nio.file package, as well
as the org.w3c.dom.Document class name. The catch block also needs the org.xml.sax.SAXException class
name, if it is not already imported. After this code executes, you can call methods for the xmlDoc
object to navigate through the elements in the document tree structure. Let's look at what the possibilities
are.
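As a minimal, self-contained sketch of the steps so far, the following class creates a builder, parses a file through a buffered input stream, and then calls one method on the resulting Document. The class name ParseSketch, the temporary file, and its sample content are assumptions made for illustration; they are not part of the original example. Unlike the fragment above, the sketch passes exceptions to the caller rather than exiting.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this illustration.
class ParseSketch {

    // Parses the file at xmlFile into a DOM tree that is held entirely in memory.
    static Document parse(Path xmlFile)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // try-with-resources closes the stream whether or not parsing succeeds.
        try (BufferedInputStream in =
                new BufferedInputStream(Files.newInputStream(xmlFile))) {
            return builder.parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample content, written to a temporary file for the demo.
        Path xmlFile = Files.createTempFile("sketch", ".xml");
        Files.write(xmlFile,
                "<sketch><circle radius=\"15\"/></sketch>"
                        .getBytes(StandardCharsets.UTF_8));
        try {
            Document xmlDoc = parse(xmlFile);
            // The root element is the starting point for navigating the tree.
            System.out.println(xmlDoc.getDocumentElement().getTagName());
        } finally {
            Files.delete(xmlFile);
        }
    }
}
```

Running main() prints the tag name of the root element, which is sketch for the sample content used here.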
NAVIGATING A DOCUMENT OBJECT TREE
The org.w3c.dom.Node interface is fundamental to all objects that encapsulate components of an XML
document, and this includes the Document object itself. It represents a type that encapsulates a node in the
document tree. Node is also a superinterface of a number of other interfaces that declare methods for
accessing document components. The subinterfaces of Node that identify components of a document are the
following:
Element: Represents an XML element.
Text: Represents text that is part of element content. This is a subinterface of CharacterData, which is a subinterface of Node. Text references, therefore, have methods from all three interfaces.
CDATASection: Represents a CDATA section (unparsed character data). This extends Text.
Comment: Represents a document comment. This interface extends the CharacterData interface.
DocumentType: Represents the type of a document.
Document: Represents the entire XML document.
DocumentFragment: Represents a lightweight document object that encapsulates a subtree of a document.
Entity: Represents an entity that may be parsed or unparsed.
EntityReference: Represents a reference to an entity.
Notation: Represents a notation declared in the DTD for a document. A notation is a definition of an unparsed entity type.
ProcessingInstruction: Represents a processing instruction for an application.
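To make the list above more concrete, here is a small sketch that walks a parsed document and reports which of these interfaces each node can be accessed through. It uses the getNodeType() method and the type constants that every node inherits from Node, then casts to the matching subinterface. The class name NodeKinds, its method names, and the sample XML string are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.CDATASection;
import org.w3c.dom.Comment;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.w3c.dom.Text;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this illustration.
class NodeKinds {

    // Returns a short label for a node based on the type code inherited from Node.
    static String describe(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                return "Document";
            case Node.ELEMENT_NODE:
                return "Element:" + ((Element) node).getTagName();
            case Node.CDATA_SECTION_NODE:
                // getData() is inherited from CharacterData via Text.
                return "CDATASection:" + ((CDATASection) node).getData();
            case Node.TEXT_NODE:
                return "Text:" + ((Text) node).getData();
            case Node.COMMENT_NODE:
                return "Comment:" + ((Comment) node).getData();
            case Node.PROCESSING_INSTRUCTION_NODE:
                return "ProcessingInstruction:"
                        + ((ProcessingInstruction) node).getTarget();
            default:
                // Other node types are reported by their generic node name.
                return node.getNodeName();
        }
    }

    // Depth-first walk: record this node, then each of its children in order.
    static void walk(Node node, List<String> out) {
        out.add(describe(node));
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            walk(child, out);
        }
    }

    // Parses the XML text with a default builder and describes every node.
    static List<String> describeAll(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                        xml.getBytes(StandardCharsets.UTF_8)));
        List<String> out = new ArrayList<>();
        walk(doc, out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample document containing several kinds of node.
        String xml = "<a><!--note--><b>hi</b><c><![CDATA[x<y]]></c>"
                + "<?target data?></a>";
        for (String line : describeAll(xml)) {
            System.out.println(line);
        }
    }
}
```

The walk starts at the Document node, so the first label is always Document, followed by the root element and its descendants in document order. Switching on getNodeType() rather than testing with instanceof avoids depending on how a particular parser implements the interfaces.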
Each of these interfaces declares its own set of methods and inherits the fields and methods declared in
the Node interface. Every XML document is modeled as a hierarchy of nodes that are accessible as one or
another of the interface types in the list. At the top of the node hierarchy for a document is the Document
node that is returned by the parse() method. Each type of node may or may not have child nodes in the