HTML and CSS Reference
Remove All Nonexistent Tags
Modern browsers do not support a lot of the old, deprecated, vendor-proprietary tags such as marquee and
multicol introduced in the wild and wooly early days of the Web. If still relevant, these should be replaced by
standard tags and CSS stylesheets. If not, they should be deleted to save space and simplify documents.
Older browsers that actually depend on these tags may see a slightly less formatted page. For example, old
versions of Netscape will no longer see two columns on a page after you replace a multicol element with CSS.
However, today many more browsers don't support the multicol element than do. You'll improve the
experience for a lot more people than you'll degrade it.
Regardless of what changes you make, all the actual content of the page should still be present and accessible.
It may just be formatted a little differently. You can improve the formatting with CSS later.
Chances are there aren't a lot of bogons in your documents. However, if one does show up, it's worth searching
for it across more of the site. You'll usually find the first one by validation. For example, here's xmllint
complaining about an unrecognized multicol element:
$ xmllint --valid --noout valid.html
valid.html:18: element multicol: validity error : No
declaration for element multicol
valid.html:20: element body: validity error : Element body
content does not follow the DTD, expecting (p | h1 | h2 | h3 | h4 | h5 |
h6 | div | ul | ol | dl | pre | hr |
blockquote | address | fieldset | table | form | noscript
| ins | del | script)*, got (h1 multicol )
Notice that it complains twice: once to tell you that there's no declaration for the multicol element and once to
tell you that multicol is not a legal child of its parent body element.
Where there's one bogon, there are usually more. Once I notice that someone has added multicol elements
to one page, I do a quick search for <multicol across the entire document tree. Any pages where that string
pops up are worth a closer look. In this case, there's no good CSS equivalent for multicolumn layouts, so we'll
probably just remove the tags. (They haven't worked in most browsers for years anyhow.) Just replace
<multicol> and </multicol> with the empty string. If the multicol elements have attributes, you can search
for the regular expression <multicol\s*[^>]*> instead.
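
The search and replace described above can be sketched in a few lines of Python. This is only an illustration, not a prescribed tool: the function names strip_multicol and find_multicol_files are hypothetical, the .html extension and UTF-8 encoding are assumptions about the site, and the pattern is a slight variant of the regular expression above that also catches the end tag and uppercase spellings.

```python
import re
from pathlib import Path
from typing import List

# Matches <multicol>, <multicol cols="2">, and </multicol>, in any letter case.
# The \b keeps the pattern from matching longer names that merely start with "multicol".
MULTICOL_TAG = re.compile(r"</?multicol\b[^>]*>", re.IGNORECASE)


def strip_multicol(html: str) -> str:
    """Remove multicol start and end tags but keep the content between them."""
    return MULTICOL_TAG.sub("", html)


def find_multicol_files(root: str) -> List[Path]:
    """Return the .html files under root (assumed extension) that contain a multicol tag."""
    hits = []
    for path in sorted(Path(root).rglob("*.html")):
        # Assumed encoding; undecodable bytes are replaced rather than raising.
        text = path.read_text(encoding="utf-8", errors="replace")
        if MULTICOL_TAG.search(text):
            hits.append(path)
    return hits
```

Calling strip_multicol on each file that find_multicol_files reports leaves the enclosed content in place, which matches the goal of keeping all the actual content of the page present. Review the list of files before rewriting anything, and revalidate afterward.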
Here are some other elements you may find in your documents that you'll want to do away with: