Replace Imaginary Entity References
Make sure all entity references used in the document are defined.
&copyright; 2007 TIC Corp.
&copy; 2007 TIC Corp.
Motivation
Occasionally, authors use entity references that simply don't exist. Sometimes it's a simple typo, such
as &apm; instead of &amp;. Sometimes it's a misremembered name, such as &tm; instead of &trade; or
&copyright; instead of &copy;. Either way, this causes display problems in all browsers and should be fixed.
Potential Trade-offs
None. This is only good.
Mechanics
The hardest problem is finding these imaginary entity references, because there's not necessarily any rhyme or
reason to them. Often, the first time you realize there's a problem is while browsing your site. If you're lucky,
the unrecognized reference will appear as literal text on the page, like this:
&copyright; 2007 TIC Corp.
If not, the browser will just drop it out completely:
2007 TIC Corp.
The same mistakes do tend to repeat themselves, so once you've noticed a problem, a straight search and
replace will usually find and fix all other occurrences.
Otherwise, validation (or at least well-formedness checking) is necessary to identify these issues. Once a
validator finds such imaginary entity references, you can fix them by hand if they aren't too numerous, or with a
targeted search and replace if they are.
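When no validator is at hand, a quick scan can surface many of these references. The following Python is a minimal sketch, not a substitute for validation. It assumes that the names in the standard library's html.entities.html5 table are the set you want to check against (that table is the HTML5 list, which is larger than the XHTML 1.0 set), and the function name find_imaginary_entities is hypothetical. It looks only at references that end in a semicolon and does not skip comments, scripts, or CDATA sections.

```python
import re
from html.entities import html5

# Named references such as &amp; or &copyright;.
# Numeric references (&#169;, &#xA9;) start with '#' and never match.
NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def find_imaginary_entities(text):
    """Return (line_number, reference) pairs for names HTML does not define."""
    problems = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in NAMED_REF.finditer(line):
            name = match.group(1)
            # Keys in html5 carry the trailing semicolon, e.g. "amp;".
            if name + ";" not in html5:
                problems.append((line_number, match.group(0)))
    return problems


if __name__ == "__main__":
    sample = "&copyright; 2007 TIC Corp.\nFish &amp; chips &tm;\n"
    for line_number, reference in find_imaginary_entities(sample):
        print(f"line {line_number}: {reference}")
```

Reporting line numbers rather than rewriting anything keeps the scan safe to run on a whole site; the fix itself can then be done by hand or with a targeted search and replace.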
Occasionally, you'll find that someone has invented an entity reference that perhaps should exist but doesn't,
such as a made-up name for ¥ or &bet; for the Hebrew letter ב. Although it's theoretically possible to define new
entity references such as these in the internal DTD subset or external DTD, I do not recommend this. XML parsers
can handle this, but browsers cannot. Either replace the references with the actual characters (especially if you
have already reencoded the document in UTF-8) or use a numeric character reference such as &#xA5; or &#x5D1;.
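The replacement step can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive tool: the FIXES table and the function names are hypothetical, and the table must be filled in by hand with whatever invented names your own pages turn out to contain. Each listed name is rewritten as a hexadecimal numeric character reference, and every other reference is left untouched.

```python
import re

# Hypothetical, hand-maintained table: invented name -> intended character.
FIXES = {
    "copyright": "\u00a9",  # copyright sign
    "tm": "\u2122",         # trademark sign
    "apm": "&",             # typo for &amp;
    "bet": "\u05d1",        # Hebrew letter bet
}

NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_reference(char):
    """Return the hexadecimal numeric character reference for one character."""
    return f"&#x{ord(char):X};"


def fix_references(text, fixes=FIXES):
    """Replace each listed imaginary reference with a numeric reference."""
    def replace(match):
        name = match.group(1)
        if name in fixes:
            return to_numeric_reference(fixes[name])
        return match.group(0)  # leave defined or unlisted references alone
    return NAMED_REF.sub(replace, text)


if __name__ == "__main__":
    print(fix_references("&copyright; 2007 TIC Corp."))
```

Numeric references are used here because they work regardless of the document's encoding; if the files are already UTF-8, returning the character itself from the table would serve equally well.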