Java Reference
In-Depth Information
Well-Formed Documents
HTML is a sloppy language in which elements can be specified out of order, end tags
can be omitted, and so on. The complexity of a web browser's page layout code is partly
due to the need to handle these special cases. In contrast, XML is a much stricter
language. To make XML documents easier to parse, XML mandates that XML documents
follow certain rules:
• All elements must either have start and end tags or consist of empty-element
tags. For example, unlike the HTML <p> tag, which is often specified without
a </p> counterpart, an XML <p> start tag must always be paired with a </p>
end tag.
• Tags must be nested correctly. For example, while you'll probably get away
with specifying <b><i>JavaFX</b></i> in HTML, an XML parser would
report an error. In contrast, <b><i>JavaFX</i></b> doesn't result in an
error.
• All attribute values must be quoted. Either single quotes (') or double quotes
(") are permissible (although double quotes are more commonly used).
It is an error to omit these quotes.
• Empty elements must be properly formatted. For example, HTML's <br> tag
would have to be specified as <br/> in XML. You can specify a space between
the tag's name and the / character, although the space is optional.
• Be careful with case. XML is a case-sensitive language in which tags differing
in case (such as <author> and <Author>) are considered different. It is an
error to mix start and end tags of different cases, for example, <author> with
</Author>.
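The rules above can be tried out against a real parser. The following is a minimal sketch, assuming the JAXP parser bundled with the JDK; the class name WellFormedCheck and the method isWellFormed() are hypothetical names chosen for illustration. Each sample string corresponds to one of the rules.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the names are illustrative only.
class WellFormedCheck {

    // Returns true if the parser accepts the text as a well-formed document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<p>text</p>",              // start and end tags present
            "<p>text",                  // end tag omitted
            "<b><i>JavaFX</b></i>",     // incorrect nesting
            "<b><i>JavaFX</i></b>",     // correct nesting
            "<a href=x>link</a>",       // unquoted attribute value
            "<a href='x'>link</a>",     // quoted attribute value
            "<br>",                     // HTML-style empty element
            "<br />",                   // empty-element tag with optional space
            "<author>Jane</Author>"     // start and end tags differ in case
        };
        for (String sample : samples) {
            String verdict = isWellFormed(sample) ? "well formed" : "not well formed";
            System.out.println(sample + " -> " + verdict);
        }
    }
}
```

Catching SAXException treats any parser rejection as "not well formed", which is adequate for a demonstration; production code would normally inspect the exception to find out what went wrong.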
XML parsers that are aware of namespaces enforce two additional rules:
• No element or attribute name can include more than one colon character.
• No entity names, processing instruction targets, or notation names (discussed
later) can contain colons.
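The first of these namespace rules can be observed by parsing the same text with and without namespace awareness. This is a sketch under the assumption that the JDK's default JAXP implementation is in use; NamespaceColonCheck and parses() are hypothetical names, and urn:example is a placeholder namespace URI.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the names are illustrative only.
class NamespaceColonCheck {

    // Returns true if the text parses with the given namespace setting.
    static boolean parses(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // JAXP factories are not namespace aware unless asked to be.
            factory.setNamespaceAware(namespaceAware);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // keep System.err quiet
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The element name a:b:c contains two colons.
        String twoColons = "<a:b:c xmlns:a='urn:example'/>";
        System.out.println("namespace-unaware: " + parses(twoColons, false));
        System.out.println("namespace-aware:   " + parses(twoColons, true));
    }
}
```

A parser that is not namespace aware treats the colon as an ordinary name character, so it is only the namespace-aware configuration that should reject the two-colon name.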
An XML document that conforms to these rules is well formed. The document has
a logical and clean appearance, and is much easier to process. XML parsers will only
parse well-formed XML documents.
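When a parser refuses a document, it usually says where the first violation was found. The sketch below assumes the JDK's SAX parser; ParseErrorLocation and firstError() are hypothetical names. It returns the position and message from the SAXParseException that the parser throws, or null when the document is well formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the names are illustrative only.
class ParseErrorLocation {

    // Returns null for a well-formed document, otherwise a description
    // of the first violation that the parser reported.
    static String firstError(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // DefaultHandler ignores content events and rethrows fatal errors.
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc>\n  <b><i>JavaFX</b></i>\n</doc>";
        System.out.println(firstError(xml));
    }
}
```

The wording of the message and the exact column depend on the parser implementation, so code should not rely on either.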