An XML processor is a software module that is used by an application to read an XML document and
gain access to the data and its structure. An XML processor also determines whether an XML document
is well-formed or not. Processing instructions are passed through to an application without any checking
or analysis by the XML processor. The XML specification describes how an XML processor should
behave when reading XML documents, including what information should be made available to an
application for various types of document content.
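To make the well-formedness check concrete, here is a minimal sketch that uses the JAXP DOM parser shipped with the JDK as the XML processor. The class name WellFormedCheck and the method isWellFormed() are hypothetical names chosen for illustration. The sketch treats any SAXException thrown by the parser as meaning the text is not well-formed, which is a simplification rather than a complete diagnostic.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Hypothetical helper: returns true if the parser accepts the text as well-formed XML.
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to the error stream.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reports violations of the well-formedness rules as fatal errors.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected when parsing an in-memory string with the default configuration.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Matching start and end tags.
        System.out.println(isWellFormed("<proverb>Too many cooks spoil the broth.</proverb>"));
        // The end tag does not match the start tag.
        System.out.println(isWellFormed("<proverb>Too many cooks spoil the broth.</saying>"));
    }
}
```

The parser created here is non-validating by default, so it checks only that the text is well-formed and does not compare it against any definition of the markup.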
Here's an example of a well-formed XML document:
<proverb>Too many cooks spoil the broth.</proverb>
The document just consists of a root element that defines a proverb. There is no prolog and, formally,
you don't have to supply one, but it would be much better if the document did include at least the XML
version that is applicable, like this:
<?xml version="1.0"?>
<proverb>Too many cooks spoil the broth.</proverb>
The first line is the prolog and it consists of just an XML declaration, which specifies that the document
is consistent with XML version 1.0. The XML declaration must start with <?xml with no spaces within
this five-character sequence. We could also include an encoding declaration following the version
specification in the prolog. For example:
<?xml version="1.0" encoding="UTF-8"?>
<proverb>Too many cooks spoil the broth.</proverb>
The first line states that as well as being XML version 1.0, the document uses the "UTF-8" Unicode
encoding. If you omit the encoding specification, an XML processor assumes "UTF-8" unless the
document starts with a byte order mark that identifies it as "UTF-16". Since "UTF-8" includes ASCII
as a subset, you don't need to specify an encoding if all you are using is ASCII text. The version and
the character encoding specifications must appear in the order shown. If you reverse them, you have
broken the rules, so the document is no longer well-formed.
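The following sketch shows one way to see what a processor makes of the declaration, using the DOM Level 3 methods getXmlVersion() and getXmlEncoding() of org.w3c.dom.Document. The class DeclarationInfo and its describe() method are hypothetical names for illustration. The sketch assumes the JDK's built-in parser and encodes the text as UTF-8 bytes before parsing.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class DeclarationInfo {
    // Hypothetical helper: parses the text and reports the version and encoding
    // that the parser recorded from the XML declaration.
    public static String describe(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to the error stream.
        builder.setErrorHandler(new DefaultHandler());
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        Document doc = builder.parse(new ByteArrayInputStream(bytes));
        return "version=" + doc.getXmlVersion() + ", encoding=" + doc.getXmlEncoding();
    }

    public static void main(String[] args) throws Exception {
        String body = "<proverb>Too many cooks spoil the broth.</proverb>";

        // Version followed by encoding: the order the rules require.
        System.out.println(describe("<?xml version=\"1.0\" encoding=\"UTF-8\"?>" + body));

        // Encoding before version: the parser should reject this as not well-formed.
        try {
            describe("<?xml encoding=\"UTF-8\" version=\"1.0\"?>" + body);
        } catch (SAXException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The text of the error message depends on the parser implementation, so the sketch prints whatever message the parser supplies rather than relying on particular wording.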
If we want to specify that the document is not dependent on any external definitions of markup, we can
add a standalone specification to the prolog like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<proverb>Too many cooks spoil the broth.</proverb>
Specifying the value for standalone as "yes" indicates to an XML processor that the document is
self-contained. A value of "no" indicates that the document may depend on an external
definition of the markup used.
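An application can find out what the standalone specification said once the document has been parsed. Here is a minimal sketch using the DOM Level 3 method getXmlStandalone() of org.w3c.dom.Document. The class StandaloneCheck and its isStandalone() method are hypothetical names for illustration, and the sketch assumes the JDK's built-in parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class StandaloneCheck {
    // Hypothetical helper: returns true only if the declaration says standalone="yes".
    public static boolean isStandalone(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to the error stream.
        builder.setErrorHandler(new DefaultHandler());
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getXmlStandalone();
    }

    public static void main(String[] args) throws Exception {
        String body = "<proverb>Too many cooks spoil the broth.</proverb>";
        System.out.println(isStandalone(
            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>" + body));
        System.out.println(isStandalone(
            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>" + body));
    }
}
```

Note that getXmlStandalone() also returns false when the standalone specification is omitted, so this method cannot distinguish an explicit "no" from no specification at all.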