Creating and Modifying XML Documents - Beginning Java

hierarchy, and those that do can have only certain types of child nodes. The types of nodes in a document

that can have children are shown in Table 23-1:

TABLE 23-1: Nodes that Can Have Children

NODE TYPE          POSSIBLE CHILDREN
Document           Element (only 1), DocumentType (only 1), Comment, ProcessingInstruction
Element            Element, Text, Comment, CDATASection, EntityReference, ProcessingInstruction
Attr               Text, EntityReference
Entity             Element, Text, Comment, CDATASection, EntityReference, ProcessingInstruction
EntityReference    Element, Text, Comment, CDATASection, EntityReference, ProcessingInstruction

Of course, what each node may have as children follows from the XML specification, not just the DOM

specification. One other type of node extends the Node interface: DocumentFragment. A node of this

type is not formally part of a document; it is a programming convenience. It houses a fragment of a

document (a subtree of nodes), for example when you move parts of a document around, so it serves a

similar function to a Document node but with less overhead. A DocumentFragment node can have the

same range of child nodes as an Element node.
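
As a rough illustration of the DocumentFragment idea, the following sketch builds a small document in memory and uses a fragment to add two elements to the root in one step. The class name FragmentSketch and the element names sketch, circle, and line are assumptions made up for this example; only the standard javax.xml.parsers and org.w3c.dom calls are taken from the JDK.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;

class FragmentSketch {
    // Builds a document whose root receives two children via a DocumentFragment.
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement("sketch");   // hypothetical element name
            doc.appendChild(root);

            // Collect nodes in a fragment that is not yet part of the tree.
            DocumentFragment fragment = doc.createDocumentFragment();
            fragment.appendChild(doc.createElement("circle"));
            fragment.appendChild(doc.createElement("line"));

            // Appending the fragment moves its children into root;
            // the fragment node itself is not inserted and is left empty.
            root.appendChild(fragment);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser available", e);
        }
    }

    public static void main(String[] args) {
        Element root = build().getDocumentElement();
        System.out.println("Children of root: " + root.getChildNodes().getLength());
    }
}
```

The point of the design is that a single appendChild() call transfers every child of the fragment, which is convenient when a group of sibling nodes has to be moved together.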

The starting point for exploring the entire document tree is the root element for the document. You can

obtain a reference to an object that encapsulates the root element by calling the getDocumentElement()

method for the Document object:

Element root = xmlDoc.getDocumentElement();

This method returns the root element for the document as type Element. You can also get the node

corresponding to the DOCTYPE declaration as type DocumentType like this:

DocumentType doctype = xmlDoc.getDoctype();

If there is no DOCTYPE declaration, or the parser cannot find the DTD for the document, the getDoctype()

method returns null. If the value returned is not null, you can obtain the internal subset of the DTD (the

declarations between the square brackets of the DOCTYPE declaration) as a string by calling the

getInternalSubset() method for the DocumentType object:

System.out.println("Document type:\n" + doctype.getInternalSubset());

This statement outputs the internal subset only; declarations in an external DTD file are not included, and

getInternalSubset() returns null if the DOCTYPE declaration has no internal subset.
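
The calls above can be put together in a small self-contained sketch. It assumes a made-up sample document held in a string, a hypothetical class name RootAndDoctypeSketch, and hypothetical helper methods parse() and describe(); the null check on the DocumentType reference is the part that matters.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RootAndDoctypeSketch {
    // Hypothetical sample document with a DOCTYPE that has an internal subset.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE sketch [\n"
        + "  <!ELEMENT sketch (#PCDATA)>\n"
        + "]>\n"
        + "<sketch>hello</sketch>";

    // Parses XML text into a DOM Document, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Reports the root element name and, if present, the DOCTYPE name.
    static String describe(Document xmlDoc) {
        Element root = xmlDoc.getDocumentElement();
        DocumentType doctype = xmlDoc.getDoctype();   // null when there is no DOCTYPE
        if (doctype == null) {
            return "root=" + root.getTagName() + ", no DOCTYPE";
        }
        return "root=" + root.getTagName() + ", doctype=" + doctype.getName();
    }

    public static void main(String[] args) {
        Document xmlDoc = parse(XML);
        System.out.println(describe(xmlDoc));
        DocumentType doctype = xmlDoc.getDoctype();
        if (doctype != null) {
            // May itself be null if the DOCTYPE has no internal subset.
            System.out.println("Internal subset:\n" + doctype.getInternalSubset());
        }
    }
}
```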

After you have an object encapsulating the root element for a document, the next step is to obtain its child

nodes. You can use the getChildNodes() method that is defined in the Node interface for this. This method

returns an org.w3c.dom.NodeList reference that encapsulates all the child nodes for that element. You

can call this method for any node that has children, including the Document node, if you wish. You can

therefore obtain the child nodes for the root element with the following statement:

NodeList children = root.getChildNodes();
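
To make the result of getChildNodes() more concrete, here is a minimal sketch that walks the returned list with getLength() and item() and labels each child by its node type. The class name ChildNodesSketch, the helper methods, and the sample XML string are assumptions made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildNodesSketch {
    // Returns one "Kind:nodeName" entry per child of the given node, in document order.
    static List<String> describeChildren(Node parent) {
        NodeList children = parent.getChildNodes();
        List<String> result = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            String kind;
            switch (child.getNodeType()) {
                case Node.ELEMENT_NODE:                kind = "Element"; break;
                case Node.TEXT_NODE:                   kind = "Text"; break;
                case Node.COMMENT_NODE:                kind = "Comment"; break;
                case Node.CDATA_SECTION_NODE:          kind = "CDATASection"; break;
                case Node.ENTITY_REFERENCE_NODE:       kind = "EntityReference"; break;
                case Node.PROCESSING_INSTRUCTION_NODE: kind = "ProcessingInstruction"; break;
                default:                               kind = "Other"; break;
            }
            result.add(kind + ":" + child.getNodeName());
        }
        return result;
    }

    // Parses XML text and returns its root element, wrapping checked exceptions.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: an element, some text, and a comment as children.
        Element root = parseRoot("<sketch><circle/>text<!--note--></sketch>");
        System.out.println(describeChildren(root));
    }
}
```

Because the list holds references of type Node, checking getNodeType() before casting to Element or Text is the usual way to avoid a ClassCastException.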

A NodeList reference encapsulates an ordered collection of Node references, each of which will be one

of the possible node types for the current node. So with an Element node, any of the Node references

in the list that is returned can be of type Element, Text, Comment, CDATASection, EntityReference, or

ProcessingInstruction. Note that if there are no child nodes, the getChildNodes() method returns a
