HTML and CSS Reference
Character encoding can be set by the @charset at-rule with the syntax shown in Listing 2-6.
Listing 2-6. Syntax of the @charset At-Rule
@charset "charset-name";
Only one @charset rule can be used per CSS file, and it must be declared at the very beginning of the file. No
characters may precede the declaration, except a byte order mark (BOM) if the CSS file is Unicode encoded (see footnote 7).
The charset-name can be one of the character sets defined by IANA. Some encodings have multiple names
in the IANA registry; the one marked as preferred should be used. Listing 2-7 shows a typical character
encoding declaration for external CSS files.
Listing 2-7. Setting the Character Encoding of CSS with an At-Rule
These rules can be used only in external style sheets. In-document style sheet declarations cannot use @charset.
The HTML 4.01 specification defined a charset attribute for the link element to identify the character
encoding of the target document. In HTML5, however, this attribute is obsolete and should not be used.
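The difference between the two specifications can be illustrated with a short markup sketch. The file name styles.css and the utf-8 value are assumptions chosen for illustration only, not taken from the text above.

```html
<!-- HTML 4.01 style: the charset attribute hints at the encoding of the
     linked file. Obsolete in HTML5; shown here only for comparison. -->
<link rel="stylesheet" type="text/css" href="styles.css" charset="utf-8">

<!-- HTML5 style: no charset attribute. The encoding of styles.css is
     determined by other means, such as the HTTP Content-Type header
     or an @charset rule at the start of the style sheet. -->
<link rel="stylesheet" href="styles.css">
```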
Escape Codes, Special Characters, and Symbols
In HTML and XHTML documents, each character can be typed in directly or represented by a character sequence
(also known as a character reference). Two types of character sequences exist: numeric character references and
character entity references.
Assume a document fragment contains an a character with an accent (á). This character can be declared by either
the &#225; or &#xE1; numeric character references or by the &aacute; entity reference in (X)HTML documents
(see the following sections for details). However, the best practice is to type in the á character directly in the markup. The
same is true for the copyright sign (© instead of &copy;), the registered trademark sign (® instead of &reg;), and so on.
Characters should always be preferred to escape codes unless they are special characters with syntactic meaning
in (X)HTML or XML, or characters that are invisible or ambiguous. In such cases, using entities is mandatory. In
other words, markup characters used in textual content or attribute values must be escaped. For example, when we
demonstrate (X)HTML source code blocks on a web page and want to avoid processing, the < and > characters should
be provided by their entity names (&lt; and &gt;) in the source code rather than typing them in directly. Analogously,
if an & character is needed as text within an RSS feed or an RDF file, the &amp; entity should be used instead (see the
“Entity References” section for more information).
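The escaping rule above might look as follows in practice. This is a minimal sketch; the paragraph text and the search URL are hypothetical examples.

```html
<!-- Displaying markup as text: the tag delimiters are written as &lt; and &gt;,
     and the ampersand that should appear in the displayed source is &amp;amp; -->
<pre><code>&lt;p&gt;Fish &amp;amp; chips&lt;/p&gt;</code></pre>

<!-- An ampersand inside an attribute value is escaped as well -->
<a href="search?q=html&amp;lang=en">Search results</a>
```

A browser renders the first block as the literal text <p>Fish &amp; chips</p> instead of interpreting it as a paragraph element.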
Numeric character references identify characters by Universal Character Set or Unicode codepoints in the form &#nnnn;
where nnnn is the codepoint in decimal form. Both HTML and XHTML support hexadecimal references as well. In
HTML, they can be applied in either the &#Xhhhh; or &#xhhhh; form. Since XML is case sensitive, in XHTML they must
be in lowercase (&#xhhhh;). The nnnn or hhhh can be any number of digits and may include leading zeros.
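As an illustration of the forms described above, the following sketch writes the same character, á (U+00E1, decimal 225), in each notation. The surrounding p elements are only scaffolding for the example.

```html
<!-- Every reference below renders the same character, á (U+00E1) -->
<p>Decimal: &#225;</p>
<p>Decimal with a leading zero: &#0225;</p>
<p>Hexadecimal, lowercase x (HTML and XHTML): &#xE1;</p>
<p>Hexadecimal, uppercase X (HTML only): &#XE1;</p>
```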
7 External CSS files are usually encoded in US-ASCII.