• the document must have a root element, containing all other elements in the document;
• every opening tag must have a corresponding closing tag, unless it is an empty-element tag;
• elements must be properly nested;
• tags are case-sensitive;
• attribute values must be quoted.
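Any XML parser can check the rules above mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are illustrative and do not come from the text.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One hypothetical sample per rule listed above.
samples = {
    "well-formed": '<note id="1"><to>Ann</to><br/></note>',
    "two roots": "<a/><b/>",
    "missing closing tag": "<a><b></a>",
    "improper nesting": "<a><b></a></b>",
    "case mismatch": "<a></A>",
    "unquoted attribute": "<a x=1/>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first sample should be accepted; each of the others breaks exactly one of the rules.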
An XML language is a set of XML documents that are characterized by a syntax, which describes the markup tags that the language uses and how they can be combined, together with its semantics. A schema is a formal definition of the syntax of an XML language, and is usually expressed through a schema language. The most common schema languages, and the ones on which we focus our attention, are DTD and XML Schema, both originating from the W3C.
Document Type Definition.
A DTD document may be either internal or external to an XML document
and it is not itself written in the XML notation.
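To illustrate the difference between an internal and an external DTD, the sketch below reads the document type declaration of two small documents. It assumes Python's standard xml.parsers.expat module, which is non-validating and does not fetch the external file. The helper doctype_info, the two sample documents, and the file name note.dtd are hypothetical.

```python
import xml.parsers.expat as expat


def doctype_info(text):
    """Return (root name, system identifier, has internal subset)."""
    info = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["result"] = (name, system_id, bool(has_internal_subset))

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(text, True)
    return info.get("result")


# Internal DTD: the declarations sit inside the document itself.
INTERNAL = """<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>
<note>hello</note>"""

# External DTD: the document only points to a separate file.
EXTERNAL = '<!DOCTYPE note SYSTEM "note.dtd"><note>hello</note>'

print(doctype_info(INTERNAL))
print(doctype_info(EXTERNAL))
```

In both cases the declarations use the <!...> notation rather than XML elements, which is why a DTD is not itself an XML document.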
A DTD schema consists of definitions of elements, attributes, and other constructs. An element declaration is of the form <!ELEMENT element_name content>, where element_name is an element name and content is the description of the content of the element, which can take one of the following forms:
• the element contains parsable character data (#PCDATA);
• the element has no content (EMPTY);
• the element may have any content (ANY);
• the element contains a group of one or more subelements, which in turn may be composed of other subelements;
• the element contains parsable character data, interleaved with subelements.
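The five alternatives above can be seen side by side in a small internal DTD. The sketch below assumes Python's standard xml.parsers.expat module, whose ElementDeclHandler callback reports each element declaration it reads. The sample document, the function content_kinds, and the labels it returns are illustrative choices, not part of the text.

```python
import xml.parsers.expat as expat

# Hypothetical document whose internal DTD uses each kind of content.
DOC = """<!DOCTYPE note [
  <!ELEMENT note (title, body)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT body (#PCDATA | em | br)*>
  <!ELEMENT em (#PCDATA)>
  <!ELEMENT br EMPTY>
  <!ELEMENT extra ANY>
]>
<note><title>Hi</title><body>some <em>text</em><br/></body></note>"""


def content_kinds(text):
    """Map each declared element name to the kind of content it allows."""
    m = expat.model
    kinds = {}

    def on_element_decl(name, content_model):
        ctype, _quant, _name, children = content_model
        if ctype == m.XML_CTYPE_EMPTY:
            kinds[name] = "EMPTY"
        elif ctype == m.XML_CTYPE_ANY:
            kinds[name] = "ANY"
        elif ctype == m.XML_CTYPE_MIXED:
            # Expat reports (#PCDATA) as mixed content with no subelements.
            kinds[name] = "mixed" if children else "#PCDATA"
        else:
            kinds[name] = "subelements"

    parser = expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(text, True)
    return kinds


for name, kind in content_kinds(DOC).items():
    print(name, "->", kind)
```

Note that the keywords EMPTY and ANY are written in upper case in the DTD, and that the parser only reads the declarations here; it does not validate the document against them.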
When an element contains other elements (i.e., subelements or mixed content), it is necessary to declare the subelements composing it and their organization. Specifically, sequences of elements are separated by a comma “,” and alternative elements are separated by a vertical bar “|”. Declarations of sequences and choices of subelements must also describe each subelement's cardinality. With a notation inspired by extended BNF grammars, “*” indicates zero or more occurrences, “+” indicates one or more occurrences, “?” indicates zero or one occurrence, and no label indicates exactly one occurrence.
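Because this notation mirrors regular expressions, the meaning of a content model can be illustrated by translating it into a pattern over the names of an element's children. The following is a rough sketch in Python, not a DTD validator: it handles only element names, “,”, “|”, parentheses, and the three cardinality symbols. The functions to_regex and allows and the book model are hypothetical.

```python
import re


def to_regex(content_model):
    """Translate a DTD content model into a regex over child names."""
    def translate(match):
        token = match.group(0)
        if token == "(":
            return "(?:"
        if token == ",":
            return ""  # a sequence is plain concatenation
        if token in {")", "|", "*", "+", "?"}:
            return token  # same meaning in regular expressions
        # An element name; the trailing space separates children.
        return "(?:" + re.escape(token) + " )"

    compact = re.sub(r"\s+", "", content_model)
    return re.sub(r"[A-Za-z_][\w.\-]*|[(),|*+?]", translate, compact)


def allows(content_model, children):
    """Check whether a list of child element names fits the model."""
    sequence = "".join(name + " " for name in children)
    return re.fullmatch(to_regex(content_model), sequence) is not None


# Hypothetical model: one title, one or more authors, an optional
# isbn or issn, then any number of chapters.
BOOK = "(title, author+, (isbn | issn)?, chapter*)"

print(allows(BOOK, ["title", "author"]))
print(allows(BOOK, ["title"]))
print(allows(BOOK, ["title", "author", "isbn", "issn"]))
```

The first call succeeds, while the second fails because “+” requires at least one author, and the third fails because “|” admits only one of the two alternatives.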
An attribute declaration is of the form <!ATTLIST element_name attribute_def>, where element_name is the name of an element, and attribute_def is a list of attribute definitions that, for each attribute, specify the attribute name, type, and possibly default value. Attributes can be marked as #REQUIRED, meaning that they must have an explicit value for each occurrence of the elements with which they are associated; #IMPLIED, meaning that