Valid XML Documents
A valid XML document is a well-formed document that has an associated Document Type Definition or
DTD (we will learn more about DTDs later in this chapter). In a valid document the DTD must be
consistent with the rules for creating a DTD and the document body must be consistent with the DTD.
A DTD essentially defines a markup language for a given type of document and is identified in the
DOCTYPE declaration in the document prolog. It specifies how all the elements that may be used in the
document can be structured, and the elements in the body of the document must be consistent with it.
The previous example is well-formed, but not valid, since it does not have an associated DTD that
defines the <proverb> element. Note that there is nothing wrong with an XML document that is not
valid. It may not be ideal, but it is a perfectly legal XML document. Valid in this context is a technical
term that only means that a document has a DTD and is consistent with it.
An XML processor may be validating or non-validating. A validating XML processor will check that an
XML document has a DTD and that its contents are correctly specified. It will also verify that the
document is consistent with the rules expressed in the DTD and report any errors that it finds. A
non-validating XML processor will not check that the document body is consistent with the DTD. As we
shall see, you can usually choose whether the XML processor that you use to read a document is
validating or non-validating simply by switching the validating feature on or off.
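As a rough sketch of what switching validation on and off can look like in Java, the following uses the JAXP DocumentBuilderFactory class and its setValidating() method. The class name ValidationDemo, the countErrors() helper, and the inline DTD are our own illustrative assumptions rather than code from this chapter; the DTD is written inside the document only so that the example needs no external file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ValidationDemo {
    // Well-formed, but with no DTD at all.
    static final String NO_DTD =
        "<proverb>Too many cooks spoil the broth.</proverb>";

    // Hypothetical inline DTD (an assumption) declaring <proverb> as text only.
    static final String WITH_DTD =
        "<!DOCTYPE proverb [<!ELEMENT proverb (#PCDATA)>]>" + NO_DTD;

    // Parses xml and returns how many validity errors the parser reported.
    static int countErrors(String xml, boolean validating) {
        final int[] errors = {0};
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validating); // the validating switch
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Count validity errors instead of letting the parser print them.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors[0]++;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return errors[0];
    }

    public static void main(String[] args) {
        System.out.println("No DTD, non-validating:   " + countErrors(NO_DTD, false));
        System.out.println("No DTD, validating:       " + countErrors(NO_DTD, true));
        System.out.println("With DTD, validating:     " + countErrors(WITH_DTD, true));
    }
}
```

With a non-validating parser the document without a DTD is accepted without complaint, since it is well-formed. With validation switched on, the same document produces validity errors, while the version that carries a DTD it conforms to produces none.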
Here's a variation on the example from the previous section with a document type declaration added:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE proverb SYSTEM "proverb.dtd">
<proverb>Too many cooks spoil the broth.</proverb>
A document type declaration always starts with <!DOCTYPE so it is easily recognized. The name that appears
in the DOCTYPE declaration, in this case proverb, must always match that of the root element for the
document. We have specified the value for standalone as "no", but it would still be correct if we left it out
because the default value for standalone is "no" if there are external markup declarations in the document.
The DOCTYPE declaration indicates that the markup used in this document can be found in the DTD at the
URI proverb.dtd. We will see a lot more about the DOCTYPE declaration later in this chapter.
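To make the link between the DOCTYPE name and the root element concrete, here is a minimal sketch that parses the document above with a validating JAXP parser. The contents of proverb.dtd are not shown in the text, so the single element declaration in ASSUMED_DTD is purely our assumption, supplied from memory through an EntityResolver so that no file is needed. The class DoctypeDemo and its helper methods are likewise hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DoctypeDemo {
    // Assumed content of proverb.dtd; the real file is not shown in the text.
    static final String ASSUMED_DTD = "<!ELEMENT proverb (#PCDATA)>";

    // The document from the text: DOCTYPE name and root element agree.
    static final String MATCHING =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n"
        + "<!DOCTYPE proverb SYSTEM \"proverb.dtd\">\n"
        + "<proverb>Too many cooks spoil the broth.</proverb>";

    // A variation where the root element does not match the DOCTYPE name.
    static final String MISMATCHED =
        "<!DOCTYPE proverb SYSTEM \"proverb.dtd\">\n"
        + "<saying>Too many cooks spoil the broth.</saying>";

    // Supplies the DTD text for proverb.dtd and counts validity errors.
    static class Handler extends DefaultHandler {
        int errors = 0;

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            if (systemId != null && systemId.endsWith("proverb.dtd")) {
                return new InputSource(new StringReader(ASSUMED_DTD));
            }
            return null; // fall back to the parser's default lookup
        }

        @Override
        public void error(SAXParseException e) {
            errors++;
        }
    }

    // Parses xml with validation on, using handler as resolver and error sink.
    static Document parse(String xml, Handler handler) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setEntityResolver(handler);
            builder.setErrorHandler(handler);
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    static int validityErrors(String xml) {
        Handler handler = new Handler();
        parse(xml, handler);
        return handler.errors;
    }

    // Returns the name given in the DOCTYPE declaration.
    static String doctypeName(String xml) {
        return parse(xml, new Handler()).getDoctype().getName();
    }

    public static void main(String[] args) {
        System.out.println("DOCTYPE name: " + doctypeName(MATCHING));
        System.out.println("Errors (matching root):   " + validityErrors(MATCHING));
        System.out.println("Errors (mismatched root): " + validityErrors(MISMATCHED));
    }
}
```

Note that a mismatch between the DOCTYPE name and the root element leaves the document well-formed; it shows up as a validity error, so only a validating parser reports it.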
Having an external DTD for documents of a given type does not eliminate all the problems that may arise
when exchanging data. Obviously confusion may arise when several people independently create DTDs for
the same type of document. My DTD for documents containing sketches created by Sketcher is unlikely to be
the same as yours. Other people with sketching applications may be inventing their versions of a DTD for
representing a sketch so the potential for conflicting definitions for markup is considerable. To obviate the
difficulties that this sort of thing would cause, standard markup languages are being developed in XML that
can be used universally for documents of common types. For instance, the Mathematical Markup Language
(MathML) is a language defined in XML for mathematical documents and the Synchronized Multimedia
Integration Language (SMIL) is a language for creating documents that contain multimedia presentations.
There is also the Scalable Vector Graphics (SVG) language for representing 2D graphics such as design
drawings or even sketches created by Sketcher.
Let's look a bit more closely at what XML markup consists of.