HTML and CSS Reference
In-Document Declarations
Character encoding can be set by the @charset at-rule with the syntax shown in Listing 2-6.
Listing 2-6. Syntax of the @charset At-Rule
@charset "<charset-name>";
Only one @charset rule can be used per CSS file. It should be declared at the very beginning of the file. No characters should precede the declaration (only a BOM if the CSS file is Unicode encoded; see footnote 7).
The charset-name can be one of the character sets defined by IANA [15]. Some encodings have multiple names in the IANA registry (the one marked as preferred should be used). Listing 2-7 shows a typical character encoding declaration for external CSS files.
Listing 2-7. Setting the Character Encoding of CSS with an At-Rule
@charset "UTF-8";
These rules can be used only in external style sheets. In-document style sheet declarations cannot use @charset rules.
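The placement rule above (nothing but an optional BOM before the declaration) can be checked mechanically. The following is a minimal Python sketch, not a full CSS parser: the helper name sniff_css_charset and the simplified name pattern are assumptions made for illustration, and only the UTF-8 BOM is handled.

```python
import codecs
import re

# Hypothetical helper for illustration; the pattern accepts a simplified
# subset of charset names (letters, digits, and a few punctuation marks).
_CHARSET_RE = re.compile(rb'@charset "([A-Za-z0-9._:+-]+)";')


def sniff_css_charset(data):
    """Return the name declared by a leading @charset rule, or None."""
    # A byte order mark is the only thing allowed before the rule.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # match() anchors at the first byte, so a rule that is preceded by
    # whitespace, a comment, or other rules is not recognized.
    found = _CHARSET_RE.match(data)
    return found.group(1).decode("ascii") if found else None
```

Called on the bytes of Listing 2-7, the helper returns the declared name; called on a file where anything else comes first, it returns None.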
The HTML 4.01 specification defined a charset attribute for the link element to identify the character encoding of the target document. In HTML5, however, this attribute is obsolete and should not be used.
Escape Codes, Special Characters, and Symbols
In HTML and XHTML documents, each character can be typed in directly or represented by a character sequence (also known as a character reference). Two types of character sequences exist: numeric character references and character entity references.
Assume a document fragment contains an a character with an accent (á). This character can be declared by either the &#225; or the &#xE1; numeric character reference or by the &aacute; entity reference in (X)HTML documents (see the following sections for details). However, the best practice is to type in the á character directly in the markup. The same is true for the copyright sign (© instead of &copy;), the registered trademark sign (® instead of &reg;), and so on.
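To illustrate that the different reference forms denote the same character, the short Python sketch below decodes each one with the standard library's html.unescape. The variable names are only illustrative, and the decoder follows HTML5 rules rather than those of any particular XHTML parser.

```python
import html

# Three spellings of the same accented character: a decimal reference,
# a hexadecimal reference, and a named entity reference.
references = ["&#225;", "&#xE1;", "&aacute;"]
decoded = [html.unescape(ref) for ref in references]

# The same holds for the copyright and registered trademark signs.
signs = html.unescape("&copy; &reg;")
```

Every entry of decoded is the single character á, which is why typing the character directly (in a correctly declared encoding) is equivalent and easier to read.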
Characters should always be preferred to escape codes unless they are special characters with syntactic meaning in (X)HTML or XML, or characters that are invisible or ambiguous. In such cases, using entities is mandatory [16]. In other words, markup characters used in textual content or attribute values must be escaped. For example, when we demonstrate (X)HTML source code blocks on a web page and want to avoid processing, the < and > characters should be provided by their entity names (&lt; and &gt;) in the source code rather than typed in directly. Analogously, if an & character is needed as text within an RSS feed or an RDF file, the &amp; entity should be used instead (see the “Entity References” section for more information).
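The escaping rule described above can be sketched with Python's standard html.escape function, which replaces the markup-significant characters with references. The sample string is an assumption made for the example, not taken from the text.

```python
import html

# Markup that should be displayed as text rather than processed.
snippet = '<a href="?x=1&y=2">link</a>'

# quote=True also escapes quotation marks, which matters when the text
# ends up inside an attribute value.
escaped = html.escape(snippet, quote=True)

# Unescaping restores the original characters.
restored = html.unescape(escaped)
```

Here escaped contains &lt;, &gt;, &amp;, and &quot; in place of the original characters, so a browser renders the snippet as visible source code instead of a link.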
Numeric References
Numeric character references identify characters by Universal Character Set or Unicode codepoints in the form &#nnnn; where nnnn is the codepoint in decimal form. Both HTML and XHTML support hexadecimal references as well. In HTML, they can be applied in either the &#Xhhhh; or &#xhhhh; form. Since XML is case sensitive, in XHTML the x must be in lowercase (&#xhhhh;) [17]; the hexadecimal digits themselves may be in either case. The nnnn or hhhh can be any number of digits and may include leading zeros.
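As a rough illustration of the two numeric forms, the Python sketch below builds the decimal and hexadecimal references for a character from its codepoint. The function name numeric_references is hypothetical, and the lowercase x is chosen so that the result is also valid in XHTML.

```python
import html


def numeric_references(ch):
    """Return the decimal and lowercase-hex references for one character."""
    codepoint = ord(ch)
    return f"&#{codepoint};", f"&#x{codepoint:x};"


decimal_ref, hex_ref = numeric_references("á")

# An HTML decoder also accepts leading zeros and an uppercase X.
padded = html.unescape("&#0225;")
upper_x = html.unescape("&#X00E1;")
```

For á (U+00E1), decimal_ref is &#225; and hex_ref is &#xe1;. Both padded and upper_x decode to the same character, although the uppercase X form would be rejected by an XML parser.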
7. External CSS files are usually encoded in US-ASCII.