Serving and Configuration - Web Standards: Mastering HTML5, CSS3, and XML

Serving XHTML

There are two approaches for serving XHTML, both of which have their advantages and disadvantages. They are

described in the following sections.

Serving XHTML as HTML

In the early days of the Web, HTML was the only markup language. After several years, innovations appeared

that HTML could not accommodate. XML rules were added to HTML, creating XHTML, a new line of

markup languages. The rules applied when converting HTML documents to XHTML are the best practices

discussed earlier in Chapter 3.

However, the vocabulary of HTML 4.01 has been more or less preserved; thus, it is similar to that of XHTML 1.0.

Consequently, XHTML documents can be served as HTML to rendering engines. This approach provides backward

compatibility. Media types can be used to request browsers to handle XHTML as HTML instead of XML. If the media

type of an XHTML document is defined as text/html, the rendering engine will parse the web page as if it were

HTML. If the media type is given as application/xhtml+xml, browsers will process the document as XML.
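
The role of the media type can be illustrated with a minimal sketch in Python. It assumes a WSGI environment; the names make_app, serve_as_html, serve_as_xml, and XHTML_DOCUMENT are hypothetical and chosen only for this illustration. The same bytes are returned in both cases, and only the Content-Type response header differs.

```python
# Sketch only: one XHTML document, two media types.
# All names below are illustrative assumptions, not part of any standard API.

XHTML_DOCUMENT = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n'
    '<head><title>Media type demo</title></head>\n'
    '<body><p>Same markup, different parser.</p></body>\n'
    '</html>\n'
).encode("utf-8")


def make_app(media_type):
    """Return a WSGI application that serves XHTML_DOCUMENT as media_type."""
    def app(environ, start_response):
        headers = [
            # The media type, not the DOCTYPE, tells the browser which parser to use.
            ("Content-Type", f"{media_type}; charset=utf-8"),
            ("Content-Length", str(len(XHTML_DOCUMENT))),
        ]
        start_response("200 OK", headers)
        return [XHTML_DOCUMENT]
    return app


serve_as_html = make_app("text/html")              # parsed as HTML
serve_as_xml = make_app("application/xhtml+xml")   # parsed as XML
```

Either application could be tried locally with the standard library's wsgiref.simple_server.make_server, for example on localhost port 8000, to compare how a browser treats the two responses.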

Several web servers and server-side scripting platforms (PHP, ASP, and so on) apply the text/html media type to

web content by default. The “dirty secret” of XHTML is that several browsers with an XML parser treat documents that use

XHTML syntax and an XHTML DOCTYPE as HTML when they are served as text/html. 3 But backward compatibility comes at a

price: the features of XML cannot be used at all in XHTML served this way. And what is the point of applying strict

rules if documents cannot use their full potential? Where backward compatibility is not a major concern, the solution

is to serve XHTML as XML.

Serving XHTML as XML

While code quality strongly depends on markup structure and correctness, the reliability of rendering is also determined

by the browser. The refusal of browsers to render malformed XHTML markup might seem frustrating; however,

they have a good reason to do so. When an HTML document contains markup errors, browsers process it by

guessing the intentions of the content author or web designer, often resulting in undesirable layout and poor styling.

There are scenarios where errors cannot be tolerated. In scientific publishing, for example, the representation of

mathematical equations must be reliable. If such documents are published on the Web with MathML embedded in

XHTML, a silently misrendered equation can have consequences that cost millions or are even fatal. This is the main reason for

the extreme error sensitivity of XML parsers.

As a family of XML languages, XHTML is meant to be served as XML to leverage all the benefits of XML.

However, this also involves a serious risk. The application/xhtml+xml media type instructs browsers to

process the document according to the rules of XML. Since XHTML markup that is not well-formed is not rendered at all,

extra care should be taken when serving XHTML as XML. A single character at the wrong location in the

source code results in an XML parsing error message instead of the web page content (as already hinted in Chapter 1).

This is one of the reasons why HTML has always been preferred by most content authors and web designers. However,

you should not be afraid of writing pure XHTML code. If you learn how to use the practices described in the previous

chapter, you will be able to create not only error-free XHTML documents but also any kind of structured markup.
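
The difference between strict and tolerant parsing can be sketched with Python's standard library. This is only an approximation under stated assumptions: xml.etree.ElementTree stands in for a browser's XML parser and html.parser for a forgiving HTML parser, although neither reproduces real browser behavior exactly. The sample markup and the helper names (parses_as_xml, tags_seen_by_html_parser, TagCollector) are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

WELL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>Demo</title></head>'
    '<body><p>Line one<br />Line two</p></body>'
    '</html>'
)

# The same document with a single character missing: "<br />" became "<br >".
MALFORMED = WELL_FORMED.replace("<br />", "<br >")


def parses_as_xml(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # A strict XML parser gives up at the first well-formedness error.
        return False


class TagCollector(HTMLParser):
    """Record every start tag the tolerant HTML parser encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_seen_by_html_parser(markup):
    """Feed the markup to the tolerant parser and return the start tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

In this sketch the XML parser rejects MALFORMED outright because the br element is never closed, while the tolerant parser reads the whole document without complaint, which mirrors the all-or-nothing behavior described above.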

Although modern browsers support the application/xhtml+xml MIME type, some older browsers do not. One of

the options to preserve backward compatibility with older browsers and support advanced XML applications for modern

ones is the technique called content negotiation . It can be done through .htaccess 4 settings or using server-side scripting

languages.
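
Content negotiation with a server-side script can be sketched as follows. This is a minimal illustration rather than a complete implementation: the function names parse_accept and choose_media_type are hypothetical, and the policy of serving XML only to browsers that explicitly list application/xhtml+xml in the Accept request header (so that a bare wildcard such as */* does not count) is an assumption made for this example.

```python
def parse_accept(header):
    """Return a dict mapping each media type in an Accept header to its q-value."""
    preferences = {}
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # q defaults to 1 when it is not given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unreadable q-value as "not acceptable"
        preferences[media_type] = quality
    return preferences


def choose_media_type(accept_header):
    """Pick the media type for an XHTML document based on the Accept header."""
    preferences = parse_accept(accept_header or "")
    # Wildcards are deliberately ignored for XHTML: only an explicit listing counts.
    xhtml_q = preferences.get("application/xhtml+xml", 0.0)
    html_q = preferences.get(
        "text/html",
        preferences.get("text/*", preferences.get("*/*", 0.0)),
    )
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"  # safe fallback for older browsers
```

A server that varies its response in this way should also send a Vary: Accept response header so that caches keep the two variants apart.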

3 Real XML parsers, such as those of Firefox and Safari, consider the MIME type of documents (as sent by the server) rather than file

syntax and DOCTYPE only.

4 A common configuration file on web servers such as Apache. Note that the file begins with a period and has no extension.
