Java Reference
In-Depth Information
hierarchy, and those that do can have only certain types of child nodes. The types of nodes in a document
that can have children are shown in Table 23-1:
TABLE 23-1: Nodes that Can Have Children
NODE TYPE         POSSIBLE CHILDREN
Document          Element (only 1), DocumentType (only 1), Comment, ProcessingInstruction
Element           Element, Text, Comment, CDATASection, EntityReference, ProcessingInstruction
Attr              Text, EntityReference
Entity            Element, Text, Comment, CDATASection, EntityReference, ProcessingInstruction
EntityReference   Element, Text, Comment, CDATASection, EntityReference, ProcessingInstruction
Of course, what each node may have as children follows from the XML specification, not just the DOM
specification. One other type of node extends the Node interface: DocumentFragment. A node of this
type is not formally part of a document; it is a programming convenience. It is used to house a
fragment of a document (a subtree of elements), for example when you are moving fragments of a
document around, so it provides a similar function to a Document node but with less overhead.
A DocumentFragment node can have the same range of child nodes as an Element node.
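As a rough illustration of the "moving fragments around" use, the following sketch collects the children of one element in a DocumentFragment and then appends them to another element. The class name FragmentDemo, the helper methods, and the sample XML are hypothetical. The createDocumentFragment(), appendChild(), hasChildNodes(), and getFirstChild() calls are standard DOM methods that are not discussed in the text above, so treat this as one possible approach rather than the prescribed one.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class FragmentDemo {
    // Hypothetical sample document: <a> has two children, <b> has none.
    static final String XML = "<doc><a><x/><y/></a><b/></doc>";

    // Parse an XML string into a DOM Document (helper assumed for this sketch).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    // Move every child of 'from' to the end of 'to' by way of a fragment.
    static void moveChildren(Element from, Element to) {
        DocumentFragment fragment = from.getOwnerDocument().createDocumentFragment();
        while (from.hasChildNodes()) {
            // appendChild() detaches the node from its current parent first.
            fragment.appendChild(from.getFirstChild());
        }
        // Appending a fragment inserts its children, not the fragment node itself.
        to.appendChild(fragment);
    }

    // Build the sample document and move the children of <a> into <b>.
    static Document run() {
        Document xmlDoc = parse(XML);
        Element a = (Element) xmlDoc.getElementsByTagName("a").item(0);
        Element b = (Element) xmlDoc.getElementsByTagName("b").item(0);
        moveChildren(a, b);
        return xmlDoc;
    }

    public static void main(String[] args) {
        Document xmlDoc = run();
        System.out.println("Children of <b> after the move: "
                + xmlDoc.getElementsByTagName("b").item(0).getChildNodes().getLength());
    }
}
```

The fragment itself never appears in the document tree; only its children are inserted at the target.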
The starting point for exploring the entire document tree is the root element for the document. You can
obtain a reference to an object that encapsulates the root element by calling the getDocumentElement()
method for the Document object:
Element root = xmlDoc.getDocumentElement();
This method returns the root element for the document as type Element. You can also get the node
corresponding to the DOCTYPE declaration as type DocumentType like this:
DocumentType doctype = xmlDoc.getDoctype();
If there is no DOCTYPE declaration, or the parser cannot find the DTD for the document, the getDoctype()
method returns null. If the value returned is not null, you can obtain the internal subset of the DTD
(the declarations that appear within the DOCTYPE declaration itself) as a string by calling the
getInternalSubset() method for the DocumentType object:
System.out.println("Document type:\n" + doctype.getInternalSubset());
This statement outputs the internal subset of the DTD for the document. If the DOCTYPE declaration has
no internal subset, getInternalSubset() returns null.
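The statements above can be combined into a small self-contained program. This is only a sketch: the class name DoctypeDemo, the parse() helper, and the sample XML with its internal DTD subset are assumptions made for illustration, and the document is parsed from a string rather than a file to keep the example short.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DoctypeDemo {
    // Hypothetical sample document with an internal DTD subset.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE circle [\n"
        + "  <!ELEMENT circle EMPTY>\n"
        + "  <!ATTLIST circle radius CDATA #REQUIRED>\n"
        + "]>\n"
        + "<circle radius=\"15\"/>";

    // Parse an XML string into a DOM Document (helper assumed for this sketch).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        Document xmlDoc = parse(XML);

        // The root element is the starting point for exploring the tree.
        Element root = xmlDoc.getDocumentElement();
        System.out.println("Root element: " + root.getTagName());

        // getDoctype() returns null when there is no DOCTYPE declaration.
        DocumentType doctype = xmlDoc.getDoctype();
        if (doctype == null) {
            System.out.println("No DOCTYPE declaration");
        } else {
            System.out.println("Document type:\n" + doctype.getInternalSubset());
        }
    }
}
```

Checking for null before calling getInternalSubset() avoids a NullPointerException for documents that have no DOCTYPE declaration.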
After you have an object encapsulating the root element for a document, the next step is to obtain its child
nodes. You can use the getChildNodes() method that is defined in the Node interface for this. This method
returns an org.w3c.dom.NodeList reference that encapsulates all the child nodes for that node. You
can call this method for any node that has children, including the Document node, if you wish. You can
therefore obtain the child nodes for the root element with the following statement:
NodeList children = root.getChildNodes();
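One way to see what getChildNodes() returns is to walk the list and report the type of each child. The sketch below does that for a small sample document. The class name ChildNodesDemo, its helper methods, and the sample XML are hypothetical; getLength(), item(), getNodeType(), and getNodeName() are standard NodeList and Node methods that go beyond what is described above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ChildNodesDemo {
    // Hypothetical sample: the root has a comment, an element, text, and a
    // processing instruction as children, with no whitespace between them.
    static final String XML =
        "<sketch><!-- a comment --><circle radius=\"15\"/>some text<?target data?></sketch>";

    // Parse an XML string into a DOM Document (helper assumed for this sketch).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    // Map the numeric node type to a readable name for the types of interest.
    static String kind(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:                return "Element";
            case Node.TEXT_NODE:                   return "Text";
            case Node.COMMENT_NODE:                return "Comment";
            case Node.CDATA_SECTION_NODE:          return "CDATASection";
            case Node.ENTITY_REFERENCE_NODE:       return "EntityReference";
            case Node.PROCESSING_INSTRUCTION_NODE: return "ProcessingInstruction";
            default:                               return "Other";
        }
    }

    // Describe each child of 'parent' as "<kind> <node name>", in document order.
    static List<String> describeChildren(Node parent) {
        NodeList children = parent.getChildNodes();
        List<String> result = new ArrayList<>();
        for (int i = 0; i < children.getLength(); ++i) {
            Node child = children.item(i);
            result.add(kind(child) + " " + child.getNodeName());
        }
        return result;
    }

    public static void main(String[] args) {
        Element root = parse(XML).getDocumentElement();
        for (String description : describeChildren(root)) {
            System.out.println(description);
        }
    }
}
```

Because describeChildren() accepts any Node, the same call works for the Document node itself, not just for the root element.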
A NodeList reference encapsulates an ordered collection of Node references, each of which will be one
of the possible node types for the current node. So with an Element node, any of the Node references
in the list that is returned can be of type Element, Text, Comment, CDATASection, EntityReference, or
ProcessingInstruction. Note that if there are no child nodes, the getChildNodes() method returns a
 
 