Transforming XML with XSLT
- Transforming XML with XSLT
- Beginning an XSLT Style Sheet
- Creating the Root Template
- Outputting HTML Code
- Outputting a Node's Content
The complete and official proposal for transforming and formatting XML documents was originally to be contained in a single specification called XSL, which stands for Extensible Stylesheet Language. However, because it was taking so long to finalize, the W3C divided XSL into two pieces: XSLT (for Transformation) and XSL-FO (for Formatting Objects).
This feature explains how to use XSLT to transform XML documents. The end result might be another XML document or, more commonly, an HTML document for viewing in both the newest and the not-so-new browsers. Transforming an XML document means analyzing its contents and taking certain actions depending on what elements are found. You can use XSLT to reorder the output according to specific criteria, to display only certain pieces of information, and much more.
XSLT is often used in conjunction with the much more widely known CSS, or Cascading Style Sheets, which handles the actual formatting.
Transforming XML with XSLT
Let's start with an overview of the whole transformation process.
To perform the actual transformation, you'll need an XSLT processor. There are a number of these available online, including Instant Saxon (written by Michael Kay).
The first thing the XSLT processor does is analyze the XML document (Figure 1) and convert it into a node tree (Figure 2). A node is nothing more than one individual piece of the XML document (like an element, an attribute, or some text content). A node tree is a hierarchical representation of the entire XML document.
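As a rough illustration of what a node tree looks like, here is a small sketch. The document below is a hypothetical sample invented for this illustration (it is not the document shown in Figure 1), and the element and attribute names (catalog, animal, name, status) are assumptions. The comment at the end lists the nodes a processor would typically build from it, in simplified form.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sample document (not the one shown in Figure 1). -->
<catalog>
  <animal status="endangered">
    <name>Tiger</name>
  </animal>
</catalog>
<!--
  Simplified node tree for the document above
  (whitespace-only text nodes and the comment nodes are left out):

  root node
    element: catalog
      element: animal
        attribute: status = "endangered"
        element: name
          text: "Tiger"
-->
```

Note that the root node sits above the outermost element (catalog here); it represents the document as a whole rather than any one element.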
You can find the full XML document used in the examples in this feature on the book's Web site. I recommend downloading and printing out a copy for easy reference as you go through the examples.
Here is a partial representation of the node tree that corresponds to the XML document shown in Figure 1.
Once the processor has identified the nodes in the source XML, it then looks to an XSLT style sheet for instructions on what to do with those nodes. Those instructions are contained in templates. Each template has two parts: first, a sort of label that identifies the nodes in the XML document that the template can be applied to, and second, instructions about the actual transformation that should take place.
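The two parts of a template can be sketched as follows. This is a minimal example, assuming a hypothetical source document that contains name elements; the element name and the choice to wrap the output in an HTML p element are illustrative assumptions, not the style sheet from the book's Web site.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Part 1, the "label": the match attribute holds a pattern that
       identifies which nodes this template can be applied to.
       "name" is a hypothetical element name. -->
  <xsl:template match="name">
    <!-- Part 2, the instructions: output a literal p element whose
         content is the string value of the matched node. -->
    <p><xsl:value-of select="."/></p>
  </xsl:template>

</xsl:stylesheet>
```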
The processor automatically looks for a root template, which it then applies to the root node of the XML document (the one that contains the outermost element). The root template generally contains a combination of literal elements that should be output as is and XSLT instructions that output or further process the nodes in the source document.
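A root template along these lines might look like the sketch below. It assumes a hypothetical source document whose outermost element is catalog, containing animal elements that each have a name child; those names, and the page title and heading text, are invented for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the processor to serialize the result as HTML. -->
  <xsl:output method="html"/>

  <!-- match="/" selects the root node, so this is the root template. -->
  <xsl:template match="/">
    <!-- Literal elements: copied to the output as is. -->
    <html>
      <head><title>Animals</title></head>
      <body>
        <h1>Animals</h1>
        <!-- XSLT instruction: outputs the content of a source node.
             In XSLT 1.0, if the expression selects several nodes,
             only the first one's string value is output. -->
        <p>First animal: <xsl:value-of select="catalog/animal/name"/></p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```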
One special kind of XSLT instruction (xsl:apply-templates) identifies a set of nodes (aptly called a node set) and specifies that those nodes should be processed at that point with the most appropriate template(s) available. Each of these "subtemplates" can include additional xsl:apply-templates instructions that point to other subtemplates. This lets you control the order (and manner) in which the contents of the source document are processed and output.
You identify and select node sets and their corresponding templates by using expressions and patterns, respectively.
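The interplay between xsl:apply-templates, expressions, and patterns can be sketched as follows. This example assumes a hypothetical source document with a catalog element containing animal elements, each with a name child and a status attribute; none of these names come from the book's example files. The select attribute holds an expression that picks the node set, while each template's match attribute holds a pattern that decides which template fits a given node.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Root template: sets up the page, then hands off processing. -->
  <xsl:template match="/">
    <html>
      <body>
        <ul>
          <!-- Expression: selects the node set of all animal elements
               under catalog and processes each one at this point. -->
          <xsl:apply-templates select="catalog/animal"/>
        </ul>
      </body>
    </html>
  </xsl:template>

  <!-- "Subtemplate": the pattern "animal" matches each selected node. -->
  <xsl:template match="animal">
    <li>
      <xsl:value-of select="name"/>
      (<xsl:value-of select="@status"/>)
    </li>
  </xsl:template>

</xsl:stylesheet>
```

Because the list items are produced wherever xsl:apply-templates appears, moving that instruction within the root template changes where the animals show up in the output.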
The transformed data is then either displayed or saved to another file.
While you can use XSLT to convert almost any kind of document into almost any other kind of document, that's a pretty vague topic to tackle. In this book, I'll focus on using XSLT to convert XML into HTML. This lets you use the strengths and flexibility of XML for handling your data and the compatibility of HTML so that visitors to your site can actually access that data.
The complete XSLT style sheet is also available on the book's Web site. I recommend downloading and printing it as well.