Java and XML - Beginning Java


Element Names

If you're going to be creating elements then you're going to have to give them names, and XML is very generous in the names you're allowed to use. For example, there aren't any reserved words to avoid in XML, as there are in most programming languages, so you do have a lot of flexibility in this regard. However, there are certain rules that you must follow. The names you choose for elements must begin with either a letter or an underscore and can include digits, periods, and hyphens. Here are some examples of valid element names:

net_price Gross-Weight _sample clause_3.2 pastParticiple

In theory you can use colons within a name, but because colons have a special purpose in the context of names (they separate a namespace prefix from the rest of the name), you should not do so. XML documents use the Unicode character set, so any of the national language alphabets defined within that set may be used for names. HTML users need to remember that tag names in XML are case-sensitive, so <Address> is not the same as <address>.
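
To see the case-sensitivity rule in action, here is a minimal sketch that hands two tiny documents to the JDK's standard DOM parser. The class name CaseSensitivityDemo and the method isWellFormed are my own names for this illustration, not anything defined by the XML or Java libraries, and the Address documents are made up for the purpose.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class CaseSensitivityDemo {

    // Returns true if the parser accepts the text as well-formed XML.
    // (Hypothetical helper written for this illustration.)
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser could not be set up", e);
        }
    }

    public static void main(String[] args) {
        // Start and end tags match exactly.
        System.out.println(isWellFormed("<Address>29 South Street</Address>"));
        // The end tag differs only in case, so it does not match.
        System.out.println(isWellFormed("<Address>29 South Street</address>"));
    }
}
```

The second document should be rejected because, as far as the parser is concerned, address is a different name from Address and the start tag is never closed.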

Note also that names starting with an uppercase or lowercase x followed by m followed by l are reserved, so you must not define names that begin with xml or XmL or any of the other six possible sequences.
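
The naming rules in this section can be sketched as a small check in Java. This is only an approximation under stated assumptions: the class ElementNameCheck and its method isAcceptableName are hypothetical names of my own, and Character.isLetter() stands in for the precise set of name characters that the XML specification defines. It also rejects colons outright, following the advice above, even though XML itself permits them.

```java
import java.util.Locale;

final class ElementNameCheck {

    // Returns true if the name follows the rules described in this section:
    // it starts with a letter or an underscore, continues with letters,
    // digits, underscores, periods, or hyphens, contains no colon, and does
    // not begin with any upper/lowercase variation of "xml".
    // Assumption: Character.isLetter() approximates XML's own character tables.
    static boolean isAcceptableName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        if (!Character.isLetter(first) && first != '_') {
            return false;
        }
        for (int i = Character.charCount(first); i < name.length(); ) {
            int cp = name.codePointAt(i);
            boolean allowed = Character.isLetterOrDigit(cp)
                    || cp == '_' || cp == '.' || cp == '-';
            if (!allowed) {
                return false;          // this also rejects ':'
            }
            i += Character.charCount(cp);
        }
        // Names beginning xml, XML, XmL, and so on are reserved.
        return !name.toLowerCase(Locale.ROOT).startsWith("xml");
    }

    public static void main(String[] args) {
        String[] samples = {"net_price", "Gross-Weight", "_sample", "clause_3.2",
                            "pastParticiple", "3rdItem", "XmLdata", "price:net"};
        for (String s : samples) {
            System.out.println(s + " -> " + isAcceptableName(s));
        }
    }
}
```

The first five samples are the valid names listed earlier. The last three are names I made up to break one rule each: a leading digit, a reserved prefix, and a colon.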

Defining General Entities

There is a frequent requirement to repeat a given block of parsed character data in the body of a document. An obvious example is some kind of copyright notice that you may want to insert in various places. You can define a named block of parsed text like this:

<!ENTITY copyright "...">

This is an example of a declaration of a general entity. You can put declarations of general entities within a DOCTYPE declaration in the document prolog or within an external DTD. I describe how a little later in this chapter. The block of text that appears between the double quotes is identified by the name copyright. You could equally well use single quotes as delimiters for the string. Wherever you want to insert this text in the document, you just need to insert the name delimited by an ampersand at the beginning and a semicolon at the end, thus:

&copyright;

This is called an entity reference. This is exactly the same notation as the predefined entities representing markup characters that you saw earlier. It causes the equivalent text to be inserted at this point when the document is parsed. A general entity is parsed text, so you need to take care that the document is still well-formed and valid after the substitution has been made.
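
To watch the substitution happen, you could parse a small document that declares a general entity in its DOCTYPE declaration and then read back the element content. The following is a minimal sketch rather than a definitive recipe: the notice element, the replacement text "Copyright Example Publisher", and the class name GeneralEntityDemo are all invented for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class GeneralEntityDemo {

    // Hypothetical document: the element name and the entity text are made up.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE notice [\n"
        + "  <!ENTITY copyright \"Copyright Example Publisher\">\n"
        + "]>\n"
        + "<notice>&copyright;</notice>";

    // Parses the text and returns the character data of the root element,
    // with entity references already replaced by their text.
    static String parseAndGetText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Expanding entity references is the default; set here for clarity.
            factory.setExpandEntityReferences(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Checked exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse the sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAndGetText(XML));
    }
}
```

With the JDK's default parser, the content of the notice element comes back as the entity's replacement text, not as the characters &copyright; that appear in the source.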

An entity declaration can include entity references. For example, I could declare the copyright entity like this:

<!ENTITY copyright "... &documentDate; ...">

The text contains a reference to a documentDate entity. Entity references may appear in a document only after their corresponding entity declarations, so the declaration for the documentDate entity must precede the declaration for the copyright entity:

<!ENTITY documentDate "...">

<!ENTITY copyright "... &documentDate; ...">
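
Here is a sketch of how one entity that refers to another is resolved when the document is parsed. As before, the specifics are assumptions for illustration only: the date "24 December 2030", the publisher text, the notice element, and the class name NestedEntityDemo are all hypothetical. The declarations appear in the order described above, with documentDate first.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class NestedEntityDemo {

    // Hypothetical document: all the text values here are invented.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE notice [\n"
        + "  <!ENTITY documentDate \"24 December 2030\">\n"
        + "  <!ENTITY copyright \"Copyright Example Publisher, &documentDate;\">\n"
        + "]>\n"
        + "<notice>&copyright;</notice>";

    // Parses the text and returns the root element's character data after
    // the parser has replaced &copyright; and, within it, &documentDate;.
    static String parseAndGetText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Checked exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse the sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAndGetText(XML));
    }
}
```

The single reference &copyright; in the document body pulls in the copyright text, and the reference to documentDate inside that text is replaced in turn, so the element ends up containing both pieces.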
