Java Reference
In-Depth Information
Note: As with Listings 10-1 and 10-2, Listing 10-3 also contains whitespace
(invisible characters such as spaces, tabs, carriage returns, and line feeds).
The XML specification permits whitespace to be added to a document. Whitespace
appearing within content (such as spaces between words) is considered part of
the content. In contrast, the parser typically ignores whitespace appearing
between an end tag and the next start tag. Such whitespace is not considered
part of the content.
An XML element's start tag can contain one or more attributes. For example,
Listing 10-1's <ingredient> tag has a qty (quantity) attribute, and Listing
10-3's <article> tag has title and lang attributes. Attributes provide
additional information about elements. For example, qty identifies the amount
of an ingredient that can be added, title identifies an article's title, and
lang identifies the language in which the article is written (en for English).
Attributes can be optional. For example, if qty is not specified, a default
value of 1 is assumed.
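As a rough illustration of reading an attribute and falling back to a default, the following sketch uses the JDK's standard DOM parser. The class name AttributeDemo, the helper methods, and the sample <ingredient> elements are hypothetical (they are not taken from Listing 10-1), and the default of 1 is applied here by application code, which is only one way such a default might be supplied.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AttributeDemo {
    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    // Returns the qty attribute's value, or "1" when the attribute is absent.
    // Assumption: the default is supplied by this code, not by the parser.
    static String quantityOf(Element ingredient) {
        return ingredient.hasAttribute("qty") ? ingredient.getAttribute("qty") : "1";
    }

    public static void main(String[] args) {
        Element withQty = parseRoot("<ingredient qty=\"2\">egg</ingredient>");
        Element withoutQty = parseRoot("<ingredient>egg</ingredient>");
        System.out.println(quantityOf(withQty));    // 2
        System.out.println(quantityOf(withoutQty)); // 1
    }
}
```

Attribute values are always strings at the DOM level, so a caller that needs a number would convert the result itself (for example with Integer.parseInt).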
Note: Element and attribute names may contain any alphanumeric character from
English or another language, and may also include the underscore ( _ ),
hyphen ( - ), period ( . ), and colon ( : ) punctuation characters. The colon
should only be used with namespaces (discussed later in this chapter), and
names cannot contain whitespace.
Character References and CDATA Sections
Certain characters cannot appear literally in the content that appears between
a start tag and an end tag, or within an attribute value. For example, you
cannot place a literal < character between a start tag and an end tag because
doing so would confuse an XML parser into thinking that it had encountered
another tag.
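The problem can be observed with a small sketch that uses the JDK's standard DOM parser. The class name LiteralLessThanDemo, the contentOf helper, and the <expr> element are hypothetical names chosen for illustration. The first document contains a literal < in its content and is rejected; the second writes the same character as &#60; (its decimal Unicode code point) and parses cleanly.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class LiteralLessThanDemo {
    // Returns the root element's text content, or null when the parser
    // rejects the document as not well formed.
    static String contentOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getTextContent();
        } catch (SAXException e) {
            return null; // parse error, such as a stray < in content
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Literal < in content: the parser reports a fatal error.
        System.out.println(contentOf("<expr>a < b</expr>"));     // null
        // Same character written as &#60;: parses normally.
        System.out.println(contentOf("<expr>a &#60; b</expr>")); // a < b
    }
}
```

Depending on the parser implementation, the rejected document may also cause a diagnostic message to be written to standard error before the exception is thrown.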
One solution to this problem is to replace the literal character with a
character reference, which is a code that represents the character. Character
references are classified as numeric character references or character entity
references:
• A numeric character reference refers to a character via its Unicode code
point, and adheres to the format &#nnnn; (not restricted to four positions)
or &#xhhhh; (not restricted to four positions), where nnnn provides a
decimal representation of the code point and hhhh provides a hexadecimal
representation. For example, &#0931; and &#x03A3; represent the Greek
capital letter sigma. Although XML mandates that the x in &#xhhhh; be
lowercase, it is flexible in that the leading zero is optional in either
format, and in allowing you to specify