Elements in an XML Document
XML markup divides the contents of a document up into elements by enclosing segments of the data
between tags. As we said, there will always be one root element that contains all the other elements in a
document. In the example above, the following is an element:
<proverb>Every dog has his day.</proverb>
In this case this is the only element and is therefore the root element. A start tag, <proverb>,
indicates the beginning of an element, and an end tag, </proverb>, marks its end. The name of the
element, proverb in this case, always appears in both the start and end tags. The text between the start
and end tags for an element is referred to as element content. In general it may consist of just data,
which is referred to as character data; of other elements, which are described as markup; or of a
combination of character data and markup. It may also be empty, in which case the element is referred
to as an empty element.
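The parts of an element described above can be inspected with the DOM parser in the standard JAXP API. The following is only a minimal sketch: the class name ElementParts and the helper parseRoot() are illustrative names, not something defined in this text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ElementParts {
    // Illustrative helper: parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        Element proverb = parseRoot("<proverb>Every dog has his day.</proverb>");
        // The element name that appears in both the start and end tags
        System.out.println(proverb.getTagName());
        // The character data between the start and end tags
        System.out.println(proverb.getTextContent());

        // An empty element has no content, so it has no child nodes
        Element empty = parseRoot("<proverb></proverb>");
        System.out.println(empty.hasChildNodes());
    }
}
```

Run as is, this should print the element name, then the proverb text, then false for the empty element.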
When an element contains plain text, the content is described as parsed character data (PCDATA). This
means that the XML processor will parse it (in other words, it will analyze it), looking to see if it can be
broken down further. In fact PCDATA allows for a mixture of ordinary data and other elements, referred to as
mixed content, so a parser will be looking for the characters that delimit the start and end of markup tags.
Consequently, ordinary text must not contain characters that might cause it to be recognized as a tag. Thus
you can't include < or & characters explicitly as part of the text within an element, for instance. Since it
would be inconvenient to prohibit such characters completely within ordinary text, you can include them
using predefined entities when you need to. XML recognizes the following predefined entities that represent
characters that would otherwise be recognized as part of markup:
Character    Predefined entity
&            &amp;
'            &apos;
"            &quot;
<            &lt;
>            &gt;
Here's an element that makes use of a predefined entity:
<text>This is parsed character data within a &lt;text&gt; element.</text>
The content of this element is the string:
This is parsed character data within a <text> element.
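To see the predefined entities at work from Java, the sketch below escapes a string so that it is safe to use as element content and then lets the standard DOM parser turn the entities back into the original characters. The class EntityDemo and its methods escape() and contentOf() are hypothetical names introduced for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityDemo {
    // Hypothetical helper: replaces the five markup characters with the
    // predefined entities. The & replacement must come first, otherwise the
    // ampersands introduced by the other replacements would be escaped again.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;")
                   .replace("'", "&apos;");
    }

    // Hypothetical helper: parses the XML and returns the root element's text,
    // with any predefined entities already replaced by the parser.
    static String contentOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String original = "This is parsed character data within a <text> element.";
        String xml = "<text>" + escape(original) + "</text>";
        System.out.println(xml);             // the markup, with entities
        System.out.println(contentOf(xml));  // the content, as the parser sees it
    }
}
```

Without the call to escape(), the parser would treat <text> inside the content as the start tag of a nested element and reject the document because that tag is never closed.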
Here's an example of an XML document containing several elements:
<?xml version="1.0"?>
<address>
  <buildingnumber>29</buildingnumber>
  <street>South Lasalle Street</street>
  <city>Chicago</city>
  <state>Illinois</state>
  <zip>60603</zip>
</address>
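A document like this one can be read in Java by walking the child nodes of the root element. The sketch below again uses the standard DOM parser. The class AddressReader and the method childElements() are assumed names for illustration, and trimming the text of each element is a choice made here, not something XML requires.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AddressReader {
    // The example document, held in a string to keep the sketch self-contained.
    static final String ADDRESS = String.join("\n",
        "<?xml version=\"1.0\"?>",
        "<address>",
        "<buildingnumber>29</buildingnumber>",
        "<street> South Lasalle Street</street>",
        "<city>Chicago</city>",
        "<state>Illinois</state>",
        "<zip>60603</zip>",
        "</address>");

    // Illustrative helper: maps each child element name of the root element
    // to its trimmed text, in document order.
    static Map<String, String> childElements(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            Map<String, String> result = new LinkedHashMap<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip the whitespace text nodes between the child elements
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    result.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        childElements(ADDRESS).forEach(
            (name, text) -> System.out.println(name + ": " + text));
    }
}
```

The check for Node.ELEMENT_NODE matters because the line breaks between the child elements are themselves character data, so the parser reports them as text nodes within the address element.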