The declaration may contain additional information identifying the character set or encoding, the
presence or absence of additional reference documents such as a document type definition, and other
related information:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
In this example, the header contains the following information:
The XML version number is 1.0
The character encoding is ISO-8859-1, the HTTP default character encoding
The document is a standalone document, that is, one which requires no supporting documents such as
an external Document Type Definition
Everything that comes after the XML header constitutes the document's content.
Note
The version attribute is required whenever the XML declaration is present. An XML parser will
report an error if the declaration omits it.
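The header fields described above can also be read back programmatically. The following is a minimal sketch, assuming the standard JAXP parser shipped with the JDK and the DOM Level 3 accessors on org.w3c.dom.Document. The class name XmlHeaderDemo and the empty CONTACT_INFO root element are illustrative assumptions, not part of the original listing.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class: parses a document and reports its XML header fields.
class XmlHeaderDemo {
    // The header from the text, followed by an assumed, empty root element.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" standalone=\"yes\"?>\n"
        + "<CONTACT_INFO/>";

    static Document parse(String xml) {
        try {
            // Feed the parser bytes in the declared encoding so the header is honest.
            byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("version    = " + doc.getXmlVersion());
        System.out.println("encoding   = " + doc.getXmlEncoding());
        System.out.println("standalone = " + doc.getXmlStandalone());
    }
}
```

Wrapping the checked parser exceptions in an unchecked one keeps the sketch short; production code would normally handle them individually.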
Tags and Attributes
The tags in the example of Listing 17-1 identify the content as a whole, as well as the individual
elements: the contact's first name, last name, street, city, and zip. These data elements are contained in
a hierarchical structure, defined by nesting them inside the <CONTACT_INFO> tag. The capability of one
tag to contain others permits XML to represent hierarchical data structures.
The format of an XML document on the printed page is largely a matter of convenience. As is the case
with HTML, whitespace is not considered significant.
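To make the nesting concrete, here is a small sketch that parses a contact record and prints its element hierarchy as an indented outline. The child tag names (FIRST_NAME, LAST_NAME, STREET, CITY, ZIP), the sample values, and the class name ContactOutline are assumptions made for illustration; only <CONTACT_INFO> is taken from the text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: lists element names, indented by nesting depth.
class ContactOutline {
    // Assumed sample data; the line breaks and indentation are only for readability.
    static final String SAMPLE =
        "<CONTACT_INFO>\n"
        + "  <FIRST_NAME>Jane</FIRST_NAME>\n"
        + "  <LAST_NAME>Doe</LAST_NAME>\n"
        + "  <STREET>1 Main Street</STREET>\n"
        + "  <CITY>Springfield</CITY>\n"
        + "  <ZIP>00000</ZIP>\n"
        + "</CONTACT_INFO>";

    static List<String> outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            collect(doc.getDocumentElement(), 0, lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    private static void collect(Element element, int depth, List<String> lines) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  "); // two spaces per nesting level
        }
        lines.add(line.append(element.getTagName()).toString());
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            // Skip non-element children, including the whitespace-only text
            // nodes that come from the indentation in SAMPLE.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) child, depth + 1, lines);
            }
        }
    }

    public static void main(String[] args) {
        for (String line : outline(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

Note that the parser still delivers the whitespace between tags as text nodes; the sketch simply ignores them when walking the tree.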
In addition to the tag name, XML tags can contain attributes within the tag's angle brackets. Attributes,
as in HTML, are generally used to provide additional information about an element. A good example of
a tag with attributes is the HTML <FONT> tag shown here, which contains attributes describing the font
face, size, and color:
<FONT FACE="Arial" SIZE="3" COLOR="#0000FF">Hello World</FONT>
As in HTML, attributes are defined as key="value" pairs, separated by whitespace. Unlike HTML, however,
XML requires that every attribute value be quoted. In other words, the FONT
tag shown above complies with the requirements for defining XML attributes, while the example below,
which works fine as HTML, is not well-formed XML:
<FONT FACE=Arial SIZE=3 COLOR=#0000FF>Hello World</FONT>
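The difference between the two FONT examples can be checked with a parser. The sketch below, assuming the JDK's built-in JAXP parser, reports whether a string is well-formed XML; the class name AttributeQuotingCheck and the helper isWellFormed are hypothetical names introduced for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: tests whether a snippet parses as well-formed XML.
class AttributeQuotingCheck {
    static final String QUOTED =
        "<FONT FACE=\"Arial\" SIZE=\"3\" COLOR=\"#0000FF\">Hello World</FONT>";
    static final String UNQUOTED =
        "<FONT FACE=Arial SIZE=3 COLOR=#0000FF>Hello World</FONT>";

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input, for example an unquoted attribute value.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser could not be set up or read input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("quoted:   " + isWellFormed(QUOTED));
        System.out.println("unquoted: " + isWellFormed(UNQUOTED));
    }
}
```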
Since you can design a data structure like <message> equally well using either attributes or tags, it can
take a considerable amount of thought to figure out which design is best for your purposes. The last part
of this tutorial, "Designing an XML Data Structure," includes ideas to help you decide when to use
attributes and when to use tags.
Elements and Nodes
The DOM represents an XML document as a tree structure, where each node contains one of the
components of the XML document. Using DOM methods, you can create and remove nodes, change
their contents, and traverse the node hierarchy.
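As a rough illustration of those operations, the following sketch builds a small document in memory, changes the content of one node, removes another, and then walks the remaining children. The element names, the values, and the class name DomEditDemo are assumptions chosen to match the contact example, not code from the original listing.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical demo class: creates, edits, removes, and traverses DOM nodes.
class DomEditDemo {
    static Document buildAndEdit() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser available", e);
        }

        // Create nodes and attach them to the tree.
        Element contact = doc.createElement("CONTACT_INFO");
        doc.appendChild(contact);
        Element city = doc.createElement("CITY");
        city.setTextContent("Springfield");
        contact.appendChild(city);
        Element zip = doc.createElement("ZIP");
        zip.setTextContent("00000");
        contact.appendChild(zip);

        // Change the contents of an existing node.
        city.setTextContent("Shelbyville");

        // Remove a node from its parent.
        contact.removeChild(zip);
        return doc;
    }

    public static void main(String[] args) {
        Element root = buildAndEdit().getDocumentElement();
        // Traverse the children that remain after the edits.
        for (Node child = root.getFirstChild(); child != null; child = child.getNextSibling()) {
            System.out.println(child.getNodeName() + " = " + child.getTextContent());
        }
    }
}
```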
The DOM defines a number of different types of nodes in the org.w3c.dom.Node interface. The most
commonly used of these are summarized in Table 17-1.
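Node types can be inspected at run time through Node.getNodeType(), which returns one of the constants declared on org.w3c.dom.Node. The sketch below labels the nodes of a small document; the sample XML and the class name NodeTypeDemo are assumptions for illustration. Attributes are not children of their element, so they are reached through getAttributes() rather than through the child list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: labels each node in a document with its node type.
class NodeTypeDemo {
    // Assumed sample containing an attribute, a comment, an element, and text.
    static final String SAMPLE =
        "<CONTACT_INFO key=\"value\"><!-- This is a comment -->"
        + "<CITY>Springfield</CITY></CONTACT_INFO>";

    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:   return "ELEMENT_NODE";
            case Node.ATTRIBUTE_NODE: return "ATTRIBUTE_NODE";
            case Node.TEXT_NODE:      return "TEXT_NODE";
            case Node.COMMENT_NODE:   return "COMMENT_NODE";
            default:                  return "OTHER(" + node.getNodeType() + ")";
        }
    }

    static List<String> describe(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            List<String> out = new ArrayList<>();
            collect(root, out);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    private static void collect(Node node, List<String> out) {
        out.add(typeName(node) + " " + node.getNodeName());
        // getAttributes() is null for every node type except elements.
        NamedNodeMap attrs = node.getAttributes();
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.add(typeName(attr) + " " + attr.getNodeName());
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, out);
        }
    }

    public static void main(String[] args) {
        for (String line : describe(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```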
Table 17-1: org.w3c.dom Interface Node
org.w3c.dom Node_Type    Application    Example
ATTRIBUTE_NODE           Attribute      key="value"
COMMENT_NODE             Comment        <!-- This is a comment -->