HTML and CSS Reference
HTML5: Embracing the Reality of Web Markup
Given the looseness HTML5 supports and its de-emphasis of the XML approach to markup,
you might assume that HTML5 is a retreat from doing things in the right way and an
acceptance of “tag soup” as legitimate markup. The harsh reality is that, indeed, valid
markup is more the exception than the rule online. Numerous surveys have shown that in
the grand scheme of things, few Web sites validate. For example, in a study of the Alexa
Global Top 500 in January 2008, only 6.57 percent of the sites surveyed validated. 1 When
sample sizes are increased and we begin to look at sites that are not as professional, things
actually get worse. Some validation results from Opera's larger MAMA (Metadata Analysis
and Mining Application) study are shown here 2 :
Interestingly, Google has conducted even larger studies, and while they don't focus specifically
on validation, their findings on tag usage show clearly that, no matter the sample size,
clean markup is more the exception than the rule.
Yet despite the markup madness, the Web continues to work. In fact, some might say the
permissive nature of browsers that parse junk HTML actually helps the Web grow because it
lowers the barrier to entry for new Web page authors. This is certainly a shaky foundation to
build upon, but the stark reality is that we must deal with malformed markup. To this end, HTML5
makes one very major contribution: it defines what to do in the presence of markup syntax
problems.
Browsers must be permissive if they are to recover from markup mistakes. HTML5
directly acknowledges this situation and aims to define how browsers should parse both well-
formed and malformed markup, as indicated by this brief excerpt from the specification:
This specification defines the parsing rules for HTML documents, whether they
are syntactically correct or not. Certain points in the parsing algorithm are said
to be parse errors. The error handling for parse errors is well-defined: user agents
must either act as described below when encountering such problems, or must
abort processing at the first error that they encounter for which they do not wish
to apply the rules described below.
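
To make the idea of defined error handling concrete, here is a minimal sketch in Python. It is not the HTML5 parsing algorithm, which is far more elaborate; it is a toy recovery strategy built on the standard library's lenient `html.parser` tokenizer. The class name `RecoveringParser`, the helper `repair`, and the three recovery rules are assumptions made for illustration only. The point is simply that every syntax problem is recorded as a parse error and then handled by a fixed rule, rather than causing processing to abort.

```python
from html.parser import HTMLParser


class RecoveringParser(HTMLParser):
    """Toy error-recovering parser (hypothetical; NOT the HTML5 algorithm).

    Simplifications: attributes are dropped, and void elements such as
    <br> are treated like any other open element.
    """

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open elements
        self.errors = []     # parse errors are recorded, never fatal
        self.output = []     # pieces of the repaired, well-nested markup

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)
        self.output.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            # Rule 1: a stray end tag is reported and ignored.
            self.errors.append(f"stray </{tag}>")
            return
        # Rule 2: elements still open inside the closing one are closed first.
        while True:
            top = self.open_tags.pop()
            self.output.append(f"</{top}>")
            if top == tag:
                break
            self.errors.append(f"<{top}> implicitly closed by </{tag}>")

    def handle_data(self, data):
        self.output.append(data)

    def close(self):
        super().close()  # flush any buffered text first
        # Rule 3: at end of input, close whatever is still open.
        while self.open_tags:
            top = self.open_tags.pop()
            self.errors.append(f"<{top}> never closed")
            self.output.append(f"</{top}>")


def repair(markup):
    """Return (repaired_markup, list_of_parse_errors) for the given input."""
    parser = RecoveringParser()
    parser.feed(markup)
    parser.close()
    return "".join(parser.output), parser.errors
```

Under these toy rules, `repair("<p>1<b>2<i>3</b>4</i>5")` returns the markup `<p>1<b>2<i>3</i></b>45</p>` along with three recorded errors: the implicitly closed `<i>`, the stray `</i>`, and the unclosed `<p>`. An HTML5-compliant parser treats this mis-nested input differently (it reopens the `<i>` element around the "4"), but the principle is the same: the outcome for bad markup is specified in advance rather than left to each implementation.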
While a complete discussion of the implementation of an HTML5-compliant browser
parser is of little interest to Web document authors, browser implementers now have a
common specification to consult to determine what to do when tags are improperly nested, simply
left open, or mangled in a variety of ways. This is the part of the HTML5 specification that
1 Brian Wilson, “MAMA W3C Validator Research,” subsection “Interesting Views of Validation Rates, part 2:
Alexa Global Top 500,” Dev.Opera, October 15, 2008, http://dev.opera.com/articles/view/mama-w3c-
validator-research-2/?page=2#alexalist.
2 Ibid., subsection “How Many Pages Validated?” http://dev.opera.com/articles/view/mama-w3c-
validator-research-2/#validated.
 