Java and XML - Beginning Java 2 SDK

Elements in an XML Document

XML markup divides the contents of a document into elements by enclosing segments of the data between tags. As we said, there will always be one root element that contains all the other elements in a document. In the example above, the following is an element:

<proverb>Every dog has his day.</proverb>

In this case it is the only element and is therefore the root element. A start tag, <proverb>, indicates the beginning of an element, and an end tag, </proverb>, marks its end. The name of the element, proverb in this case, always appears in both the start and end tags. The text between the start and end tags of an element is referred to as element content. In general it may consist of just data, which is referred to as character data; of other elements, which are described as markup; or of a combination of character data and markup. It may also be empty, in which case the element is referred to as an empty element.
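
To make these terms concrete, here is a minimal sketch that hands the proverb element to a parser and reads back its name and content. It assumes the JAXP DOM parser that ships with a current JDK. The class name ProverbDemo and the helper parseRoot are made up for this illustration and are not part of any library.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ProverbDemo {   // hypothetical class name, for illustration only

    // Parses an XML string with the JAXP DOM parser and returns the root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot("<proverb>Every dog has his day.</proverb>");
        System.out.println(root.getTagName());      // name taken from the start and end tags
        System.out.println(root.getTextContent());  // the element content (character data)

        // An element with nothing between its tags has no child nodes.
        Element empty = parseRoot("<proverb></proverb>");
        System.out.println(empty.hasChildNodes());
    }
}
```

The root element is the only element here, so the parser returns it as the document element, and an empty element reports that it has no children.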

When an element contains plain text, the content is described as parsed character data (PCDATA). This means that the XML processor will parse it, that is, analyze it, looking to see if it can be broken down further. In fact PCDATA allows for a mixture of ordinary data and other elements, referred to as mixed content, so a parser will be looking for the characters that delimit the start and end of markup tags. Consequently, ordinary text must not contain characters that might cause it to be recognized as markup. Thus you can't include < or & characters explicitly as part of the text within an element, for instance. Since it would be inconvenient to prohibit such characters completely within ordinary text, you can include them using predefined entities when you need to. XML recognizes the following predefined entities, which represent characters that would otherwise be recognized as part of markup:

&amp;     &
&apos;    '
&quot;    "
&lt;      <
&gt;      >
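
If you generate XML text from a program, you might apply the five predefined entities (&amp;, &apos;, &quot;, &lt; and &gt;) with a small helper along the following lines. This is only a sketch. The class XmlEscape and its escape method are hypothetical names, not part of the JDK.

```java
class XmlEscape {   // hypothetical helper, for illustration only

    // Replaces each character that has a predefined entity with that entity.
    // The text is scanned once, so an '&' produced by a replacement is never
    // escaped a second time.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '\'': out.append("&apos;"); break;
                case '"':  out.append("&quot;"); break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: within a &lt;text&gt; element
        System.out.println(escape("within a <text> element"));
    }
}
```

Escaping all five characters is more than element content strictly requires, since only < and & must be replaced there, but it keeps the helper safe for attribute values as well.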

Here's an element that makes use of a predefined entity:

<text> This is parsed character data within a &lt;text&gt; element.</text>

The content of this element is the string:

This is parsed character data within a <text> element.
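
You can check this behavior with a parser. The sketch below assumes the JAXP DOM parser from a current JDK. EntityDemo, contentOf and isWellFormed are names invented for this example. It reads the content of an element written with &lt; and &gt;, and then shows that the same kind of text with a raw < character is rejected.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class EntityDemo {   // hypothetical class name, for illustration only

    // Parses the XML string and returns its root element.
    // A SAXException signals that the input is not well-formed.
    private static Element parse(String xml) throws SAXException {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the root element's content with entities already resolved.
    static String contentOf(String xml) {
        try {
            return parse(xml).getTextContent();
        } catch (SAXException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            parse(xml);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String xml =
            "<text> This is parsed character data within a &lt;text&gt; element.</text>";
        System.out.println(contentOf(xml));                       // entities replaced by < and >
        System.out.println(isWellFormed("<text>1 < 2</text>"));   // raw '<' in text is rejected
    }
}
```

Depending on the parser implementation, the rejected input may also cause a diagnostic message to be written to the standard error stream.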

Here's an example of an XML document containing several elements:

<?xml version="1.0"?>
<address>
  <street> South Lasalle Street</street>
  <city>Chicago</city>
  <state>Illinois</state>
</address>
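
As a rough illustration of a root element that contains other elements, the following sketch lists the child elements of a document like this one. It assumes the root is an address element, to match the closing tag, and uses the JAXP DOM parser from a current JDK. The names AddressDemo and childElementNames are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class AddressDemo {   // hypothetical class name, for illustration only

    // Returns the names of the elements directly inside the root element.
    static List<String> childElementNames(String xml) {
        Element root;
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            root = builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<String> names = new ArrayList<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip the whitespace text nodes that sit between the child elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>\n"
                   + "<address>\n"   // assumed root element
                   + "  <street> South Lasalle Street</street>\n"
                   + "  <city>Chicago</city>\n"
                   + "  <state>Illinois</state>\n"
                   + "</address>";
        System.out.println(childElementNames(xml));
    }
}
```

The line breaks and indentation between the child elements are character data too, which is why the loop has to filter on the node type.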
