Java Reference
In-Depth Information
five character sequence. You could also include an encoding declaration following the version specification
in the prolog that specifies the Unicode encoding used in the document. For example:
<?xml version="1.0" encoding="UTF-8"?>
<proverb>Too many cooks spoil the broth.</proverb>
The first line states that as well as being XML version 1.0, the document uses the "UTF-8" Unicode
encoding. If you omit the encoding specification, "UTF-8" or "UTF-16" is assumed, and because "UTF-8"
includes ASCII as a subset, you don't need to specify an encoding if all you are using is ASCII text. The
version and the character encoding specifications must appear in the order shown. If you reverse them you
have broken the rules, so the document is no longer well-formed.
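The ordering rule can be checked with the JAXP parser that ships with the JDK. The following is a minimal sketch, not a definitive test harness: the class name PrologOrderCheck and the method isWellFormed are hypothetical names introduced here for illustration, and the sketch assumes the default DocumentBuilderFactory implementation rejects a malformed XML declaration with a SAXException.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, introduced only for this illustration.
class PrologOrderCheck {

    // Returns true if a non-validating parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser from printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // A well-formedness violation surfaces as a SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String body = "<proverb>Too many cooks spoil the broth.</proverb>";
        String correct = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + body;
        String reversed = "<?xml encoding=\"UTF-8\" version=\"1.0\"?>\n" + body;
        System.out.println("version, then encoding: " + isWellFormed(correct));
        System.out.println("encoding, then version: " + isWellFormed(reversed));
    }
}
```

The first document should be accepted and the second rejected, because the parser expects the version specification to come first in the XML declaration.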
If you want to specify that the document is not dependent on any external definitions of markup, you can
add a standalone specification to the prolog like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<proverb>Too many cooks spoil the broth.</proverb>
Specifying the value for standalone as "yes" indicates to an XML processor that the document is
self-contained; there is no external definition of the markup, such as a DTD. A value of "no" indicates that the
document is dependent on an external definition of the markup used, possibly in an external DTD.
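A program can read back what the prolog declared once the document has been parsed. The sketch below assumes the DOM Level 3 accessors on org.w3c.dom.Document (getXmlVersion, getXmlEncoding, and getXmlStandalone) as implemented by the JDK's default parser; the class name PrologInfo and the method parse are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class, introduced only for this illustration.
class PrologInfo {

    // Parses the XML text into a DOM Document using a non-validating parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
                + "<proverb>Too many cooks spoil the broth.</proverb>");
        System.out.println("version:    " + doc.getXmlVersion());
        System.out.println("encoding:   " + doc.getXmlEncoding());
        System.out.println("standalone: " + doc.getXmlStandalone());
    }
}
```

When the standalone specification is omitted, getXmlStandalone() returns false, which matches the default of a document that may depend on external markup definitions.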
Valid XML Documents
A valid XML document is a well-formed document that has an associated DTD (you learn more about creating
DTDs later in this chapter). In a valid document the DTD must be consistent with the rules for creating
a DTD and the document body must be consistent with the DTD. A DTD essentially defines a markup language
for a given type of document and is identified in the DOCTYPE declaration in the document prolog. It
specifies how all the elements that may be used in the document can be structured, and the elements in the
body of the document must be consistent with it.
The previous example is well-formed, but not valid, because it does not have an associated DTD that
defines the <proverb> element. Note that there is nothing wrong with an XML document that is not valid.
It may not be ideal, but it is a perfectly legal XML document. Valid in this context is a technical term that
means only that a document has a DTD and is consistent with it.
An XML processor may be validating or non-validating. A validating XML processor checks that an
XML document has a DTD and that its contents are correctly specified. It also verifies that the document is
consistent with the rules expressed in the DTD and reports any errors that it finds. A non-validating XML
processor does not check that the document is consistent with the DTD. As you see later, you can usually
choose whether the XML processor that you use to read a document is validating or non-validating simply
by switching the validating feature on or off.
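As a rough illustration of switching the validating feature, the sketch below uses DocumentBuilderFactory.setValidating() from JAXP and collects the validity errors that the parser reports. The class name ValidationToggle and the method collectErrors are hypothetical names introduced here. The second test document embeds a one-line DTD directly in the DOCTYPE declaration, which is an assumption made so that the example does not depend on an external file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, introduced only for this illustration.
class ValidationToggle {

    // Parses the text and returns the validity errors reported by the parser.
    static List<String> collectErrors(String xml, boolean validating) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validating); // switch DTD validation on or off
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String noDtd = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<proverb>Too many cooks spoil the broth.</proverb>";
        // Assumption: an inline DTD keeps the example free of external files.
        String inlineDtd = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<!DOCTYPE proverb [<!ELEMENT proverb (#PCDATA)>]>\n"
                + "<proverb>Too many cooks spoil the broth.</proverb>";
        System.out.println("no DTD, non-validating: " + collectErrors(noDtd, false));
        System.out.println("no DTD, validating:     " + collectErrors(noDtd, true));
        System.out.println("inline DTD, validating: " + collectErrors(inlineDtd, true));
    }
}
```

Validity errors are recoverable, so the parser passes them to the error handler and carries on, whereas well-formedness violations are fatal. That is why the sketch gathers messages in a list rather than relying on an exception.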
Here's a variation on the example from the previous section with a document type declaration added:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE proverb SYSTEM "proverb.dtd">
<proverb>Too many cooks spoil the broth.</proverb>
A document type declaration always starts with <!DOCTYPE so it is easily recognized. The name that appears
in the DOCTYPE declaration, in this case proverb, must always match that of the root element for the