HTML and CSS Reference
There are two approaches for serving XHTML, both of which have their advantages and disadvantages. They are
described in the following sections.
Serving XHTML as HTML
In the early days of the Web, HTML was the only markup language. After several years, innovations appeared
that HTML could not cover. XML rules were added to HTML, creating XHTML, a new line of
markup languages. These XML rules are the best practices applied when converting HTML documents to XHTML, as
discussed earlier in Chapter 3.
However, the vocabulary of HTML 4.01 has been more or less preserved; thus, it is similar to that of XHTML 1.0.
Consequently, XHTML documents can be served as HTML to rendering engines. This approach provides backward
compatibility. The media type tells browsers whether to handle XHTML as HTML or as XML. If the media
type of an XHTML document is defined as text/html, the rendering engine will parse the web page as if it were
HTML. If the media type is given as application/xhtml+xml, browsers will process the document as XML.
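The media-type rule described above can be sketched in a few lines of Python. The function name processing_mode is hypothetical and the logic is deliberately simplified; it only illustrates that the media type sent by the server, not the syntax or DOCTYPE of the file, selects the parser.

```python
def processing_mode(content_type):
    """Return 'xml' or 'html' for a Content-Type header value (simplified)."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/xhtml+xml":
        return "xml"   # strict XML processing
    if media_type == "text/html":
        return "html"  # forgiving HTML processing, whatever the DOCTYPE says
    # Other media types are outside the scope of this sketch.
    raise ValueError(f"media type not covered by this sketch: {media_type!r}")
```

Under these assumptions, the same XHTML file is handled in two different ways depending only on the header value passed in.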
Several web servers and server-side scripting platforms (PHP, ASP, and so on) apply the text/html media type to
web content by default. The “dirty secret” of XHTML is that several browsers with an XML parser treat documents
that have XHTML syntax and an XHTML DOCTYPE but are served as text/html as HTML. 3 But backward compatibility comes at a price: the
impressive features of XML cannot be used at all in XHTML served this way. And what is the point of applying strict
rules if documents cannot use their full potential? Where backward compatibility is not a major concern, the solution
is to serve XHTML as XML.
Serving XHTML as XML
Code quality depends strongly on markup structure and correctness, but the reliability of rendering is also determined
by the browser. The refusal of browsers to render erroneous XHTML served as XML might seem frustrating; however,
they have a good reason to do so. When an HTML document contains markup errors, browsers
guess the intentions of the content author or web designer, which often results in undesirable layout and poor styling.
There are scenarios where errors cannot be tolerated. In scientific publishing, for example, the representation of
mathematical equations must be reliable. If such documents are published on the Web with MathML embedded in
XHTML, a wrongly rendered equation can cost millions or even be fatal. This is the main reason for
the extreme error sensitivity of XML parsers.
As an XML-based language family, XHTML is meant to be served as XML to leverage all the benefits of XML.
However, this also involves a serious risk. Serving a web document as application/xhtml+xml instructs browsers to
process it according to the rules of XML. Since XHTML markup that is not well-formed is not rendered at all in web browsers,
extra care should be taken when serving XHTML as XML. A single character at the wrong location in the
source code results in an XML parsing error message instead of the web page content (as already hinted in Chapter 1).
This is one of the reasons why HTML has always been preferred by most content authors and web designers. However,
you should not be afraid of writing pure XHTML code. If you learn how to use the practices described in the previous
chapter, you will be able to create not only error-free XHTML documents but also any kind of structured markup.
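The effect of a single misplaced character can be reproduced with any strict XML parser. The following is a minimal sketch using the XML parser from Python's standard library as a stand-in for a browser's XML parser; the sample markup and the helper name xml_parse_error are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A small, well-formed XHTML document (illustrative sample).
GOOD = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Demo</title></head>"
    "<body><p>First line<br />second line</p></body>"
    "</html>"
)

# Identical, except that the slash closing the empty br element is missing.
BAD = GOOD.replace("<br />", "<br>")


def xml_parse_error(markup):
    """Return None if markup is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


print(xml_parse_error(GOOD))  # prints None: the document is well-formed
print(xml_parse_error(BAD))   # prints the parser's error message
```

A strict parser stops at the first well-formedness error and reports it rather than guessing what the author meant, which is the behavior a browser shows for documents served as application/xhtml+xml.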
Although modern browsers support the application/xhtml+xml MIME type, some older browsers do not. One
option for preserving backward compatibility with older browsers while supporting advanced XML applications in modern
ones is a technique called content negotiation. It can be done through .htaccess 4 settings or using server-side scripting
3 Real XML parsers such as those of Firefox and Safari consider the MIME type of documents (as sent by the server) rather than
file syntax and DOCTYPE only.
4 A common configuration file on web servers such as Apache. Note that the file begins with a period and has no extension.