HTML and CSS Reference
XSLT (Extensible Stylesheet Language Transformations) is one of many XML tools that work well on HTML
documents once they have first been converted into well-formed XHTML. In fact, it is one of my favorite such
tools, and the first thing I turn to for many tasks. For instance, I use it to automatically generate a lot of
content, such as RSS and Atom feeds, by screen-scraping my HTML pages. Indeed, the possibility of using XSLT
on my documents is one of my main reasons for refactoring documents into well-formed XHTML. XSLT can
query documents for things you need to fix and automate some of the fixes.
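As one illustration of the querying side, the following sketch reports every img element that lacks an alt attribute. The specific check and the html prefix (bound here to the XHTML namespace) are assumptions chosen for the example, not a required convention:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml">

  <!-- Emit plain text rather than XML: this is a report, not a rewrite. -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Select every XHTML img element that has no alt attribute. -->
    <xsl:for-each select="//html:img[not(@alt)]">
      <xsl:text>Missing alt: </xsl:text>
      <xsl:value-of select="@src"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Any XSLT 1.0 processor should be able to run this against a well-formed XHTML page; the output is one line per offending image, identified by its src value.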
When refactoring XHTML with XSLT, you usually leave more alone than you change. Thus, most refactoring
stylesheets start with the identity transformation shown in Listing 2.9.
Listing 2.9. The Identity Transformation in XSLT
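The identity transformation is commonly written as a single template that matches every node and attribute, copies it, and then recurses into its children. A minimal sketch in the usual XSLT 1.0 form:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Match every attribute and every node (elements, text, comments,
       processing instructions), copy it, and process its attributes
       and children the same way. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```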
This merely copies the entire document from the input to the output. You then modify this basic stylesheet with
a few extra rules to make the changes you desire. For example, suppose you want to change all the presentational
<i> elements to <em> elements. You would add this rule to the stylesheet:
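A sketch of such a rule, shown inside a complete stylesheet so that it can be run as is. The html prefix, and the choice to make XHTML the stylesheet's default namespace so that the new em element lands in it, are assumptions made for illustration:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="html">

  <!-- Identity transformation: copy everything not matched elsewhere. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace each XHTML i element with an em element,
       keeping its attributes and content. -->
  <xsl:template match="html:i">
    <em>
      <xsl:apply-templates select="@*|node()"/>
    </em>
  </xsl:template>

</xsl:stylesheet>
```

The more specific html:i pattern has a higher default priority than the catch-all identity pattern, so i elements go through the second template and everything else is copied unchanged.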
Notice that the XPath expression in the match attribute must use a namespace prefix, even though the element
it's matching uses the default namespace. This is a common source of confusion when transforming XHTML
documents. You always have to assign the XHTML namespace a prefix when you're using it in an XPath expression.
Several good introductions to XSLT are available in print and on the Web. First, I'll recommend two I've
written myself. Chapter 15 of The XML 1.1 Bible (Wiley, 2003) covers XSLT in depth and is available on
the Web at www.cafeconleche.org/books/bible3/chapters/ch15.html. XML in a Nutshell, 3rd Edition, by
Elliotte Harold and W. Scott Means (O'Reilly, 2004), provides a somewhat more concise introduction.
Finally, if you want the most comprehensive coverage available, I recommend Michael Kay's XSLT: