DevASP - ASP and XML Articles, Samples, Tutorials, Sample Chapters and resources for Developers Monday, September 01, 2014

The XSLT Processing Model


Although we often talk of an XSLT processor as something that turns one XML document into another (or into an HTML or text document), this is not strictly true. The specification actually talks in terms of a source tree (or input tree) and a result tree. There is therefore an assumption that, for example, if we are starting from a text document rather than an existing DOM tree, it has been turned into some sort of tree structure before the XSLT processor starts its work, and that the result tree will be used for further processing or serialized in some way to create another text document.

 

The model, including formatting, therefore looks like this:

 

 

This concept is simple enough. But you will have read in Chapter 1 that XSLT is a declarative language and uses templates. How does this work in practice? Let's have a look at a simple XML document and stylesheet, and walk through the processing.


Processing a Document

Here is my XML document – it is the book catalog that you will be familiar with if you have read Professional XML (Wrox Press, ISBN 1-861003-11-0), although I have cut it down to just two books, removed some elements and renamed it shortcatalog.xml:

 

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Catalog>
  <Book>
    <Title>Designing Distributed Applications</Title>
    <Authors>
      <Author>Stephen Mohr</Author>
    </Authors>
    <PubDate>May 1999</PubDate>
    <ISBN>1-861002-27-0</ISBN>
    <Price>$49.99</Price>
  </Book>
  <Book>
    <Title>Professional ASP 3.0</Title>
    <Authors>
      <Author>Alex Homer</Author>
      <Author>Brian Francis</Author>
      <Author>David Sussman</Author>
    </Authors>
    <PubDate>October 1999</PubDate>
    <ISBN>1-861002-61-0</ISBN>
    <Price>$59.99</Price>
  </Book>
</Catalog>

 

We'll look at the XSLT stylesheet we use to transform this document shortly, but let's now become an XSLT processor and see what happens. We already know that, as an XSLT processor, we cannot use the source XML, but need a tree representation based on the structure and content of the document. So here it is:

 


Each node is described by a block of three rectangles. In the top rectangle is the node type, with the node name in the rectangle below it. The bottom rectangle contains an asterisk if the node has element content, and the text if it has text content.

 

At the top of the tree is the root node or document root. Don't confuse this with the root element (or document element) familiar from XML. The document root is the base of the document, and has the document element (<Catalog>) as a child, along with any other top-level nodes (which might be comments or processing instructions). Strictly, the XML declaration is not a node in the data model that XSLT uses, even though it appears at the top of the source text, so a stylesheet cannot match it. The document element contains two child <Book> elements, and these hold the information about the books.
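To make the tree idea concrete, here is a minimal sketch in Python that parses shortcatalog.xml with the standard library's xml.dom.minidom and prints one line per element, using an asterisk for element content and the text otherwise, in the spirit of the rectangles described above. The choice of Python and minidom, and the helper name describe, are assumptions made for illustration; a DOM tree is similar to, but not identical with, the tree an XSLT processor builds.

```python
from xml.dom import minidom

SOURCE = b"""<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Catalog>
  <Book>
    <Title>Designing Distributed Applications</Title>
    <Authors>
      <Author>Stephen Mohr</Author>
    </Authors>
    <PubDate>May 1999</PubDate>
    <ISBN>1-861002-27-0</ISBN>
    <Price>$49.99</Price>
  </Book>
  <Book>
    <Title>Professional ASP 3.0</Title>
    <Authors>
      <Author>Alex Homer</Author>
      <Author>Brian Francis</Author>
      <Author>David Sussman</Author>
    </Authors>
    <PubDate>October 1999</PubDate>
    <ISBN>1-861002-61-0</ISBN>
    <Price>$59.99</Price>
  </Book>
</Catalog>"""


def describe(node, depth=0):
    """Return one line per element below node (hypothetical helper)."""
    lines = []
    for child in node.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue  # skip whitespace text nodes, comments and so on
        has_elements = any(
            c.nodeType == c.ELEMENT_NODE for c in child.childNodes
        )
        if has_elements:
            content = "*"  # element content, as in the diagram
        else:
            content = "".join(
                c.data for c in child.childNodes if c.nodeType == c.TEXT_NODE
            )
        lines.append("  " * depth + child.tagName + ": " + content)
        lines.extend(describe(child, depth + 1))
    return lines


doc = minidom.parseString(SOURCE)

# The document node sits above the document element.
root_children = [n.nodeName for n in doc.childNodes]
tree_lines = describe(doc)

print(doc.nodeName, root_children)
print("\n".join(tree_lines))
```

In this particular DOM implementation the document node reports <Catalog> as its only child; the XML declaration is not exposed as a child node.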

 

Now that we have the tree structure, we can start to populate and process it. This is the processing model we will use:

 

 

Before XSL processing starts, both the source document and XSLT stylesheet must be loaded into the processor's memory. How this happens is dependent on the implementation. One option is that both are loaded as DOM documents under the control of a program. Another option is that the stylesheet is referenced by a processing instruction in the source XML document. IE5 can operate in this way, and will automatically load the stylesheet when the XML document is loaded.

 

And here is the XSLT stylesheet (TitleAndDate.xsl) we will use to process the shortcatalog.xml to get a new XML document listing just the titles of the books and their publication dates:

 

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="Catalog">
    <Books>
      <xsl:apply-templates/>
    </Books>
  </xsl:template>

  <xsl:template match="Book">
    <Book>
      <xsl:value-of select="Title"/>, <xsl:value-of select="PubDate"/>
    </Book>
  </xsl:template>

</xsl:stylesheet>
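
As a rough illustration of the two loading options mentioned earlier, the Python sketch below loads both documents as DOM documents under program control, lists the match patterns of the templates in the stylesheet, and looks for a stylesheet processing instruction in the source. Python and xml.dom.minidom are assumptions made for the example, and so is the xml-stylesheet processing instruction: shortcatalog.xml as printed does not contain one, and the source has been shortened to a single book.

```python
from xml.dom import minidom

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# Assumption: the xml-stylesheet processing instruction is added here purely
# for illustration; shortcatalog.xml as printed does not have one.
SOURCE = b"""<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="TitleAndDate.xsl"?>
<Catalog>
  <Book>
    <Title>Designing Distributed Applications</Title>
    <PubDate>May 1999</PubDate>
  </Book>
</Catalog>"""

STYLESHEET = b"""<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/"><xsl:apply-templates/></xsl:template>
  <xsl:template match="Catalog">
    <Books><xsl:apply-templates/></Books>
  </xsl:template>
  <xsl:template match="Book">
    <Book><xsl:value-of select="Title"/>, <xsl:value-of select="PubDate"/></Book>
  </xsl:template>
</xsl:stylesheet>"""

# Option 1: both documents are loaded as DOM documents by the program.
source_doc = minidom.parseString(SOURCE)
stylesheet_doc = minidom.parseString(STYLESHEET)

# The stylesheet is itself XML, so its templates can be inspected like any
# other elements.
templates = stylesheet_doc.getElementsByTagNameNS(XSLT_NS, "template")
match_patterns = [t.getAttribute("match") for t in templates]

# Option 2: the source names its stylesheet in a processing instruction,
# which a host application can read before loading the stylesheet.
stylesheet_pis = [
    node.data
    for node in source_doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    and node.target == "xml-stylesheet"
]

print(match_patterns)
print(stylesheet_pis)
```

Nothing is transformed here; the sketch only shows that both inputs are ordinary XML trees by the time the processor starts work.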

 

Once the documents are in memory, we can start our processing. The XSL processor starts by reading the template for the document root from the stylesheet (step 1). Here is that template:

 

<xsl:template match="/">
  <xsl:apply-templates/>
</xsl:template>

 

The first line indicates that this is a template, with a match attribute that identifies the node or nodes it matches. The attribute value is a pattern written in XPath syntax, in this case just /, which indicates the document root.

 

Working round the diagram, at step 2 we find the source node (strictly, the node-set, but here it will comprise a single node) in the source tree that the template matches. This will be the document root. The second line of the template moves us on to step 3 and indicates that we will execute whatever templates apply to the children of this node. In the tree that XSLT sees, the document root here has a single child, the <Catalog> element; the XML declaration at the top of the file is not represented as a node.

 

Looking through the stylesheet, there is a template for the <Catalog> element. (There is none for the XML declaration, and there cannot be, because XSLT does not give us access to it.) Processing a document using XSLT is a recursive process, and we are now back to step 1 with a new template. Here is the template:

 

<xsl:template match="Catalog">
  <Books>
    <xsl:apply-templates/>
  </Books>
</xsl:template>

 

This contains what looks like another element, called <Books>; because it is not in the XSLT namespace, the processor treats it as literal content to be copied to the output. As our diagram indicates, we will transform this into a result node at step 3. It also contains an <xsl:apply-templates/> instruction, so we will again look for templates to execute matching the child nodes.

 

The only element children of the <Catalog> element are the two <Book> elements (we are ignoring the whitespace between the tags here), so we will read the template for these elements and go round the circle again. Here is the template:

 

<xsl:template match="Book">
  <Book>
    <xsl:value-of select="Title"/>, <xsl:value-of select="PubDate"/>
  </Book>
</xsl:template>


This time, for each <Book> element we are creating a <Book> element in the result tree. Into this, we are placing the value of the <Title> element, then some literal text comprising a comma and a space, then the value of the <PubDate> element.

 

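Note that the value of an element in XSLT is not the same as with the Document Object Model (DOM). With the DOM, the value of an element (its nodeValue) is always null, while in XSLT it is the element's string-value: all the text between the start and end tags, including the text of any descendant elements, with the markup removed.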
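
The difference is easy to see from a program. The following Python sketch, which assumes xml.dom.minidom as the DOM implementation and uses a hypothetical helper called string_value, reads the DOM value of a <Title> element and then computes the XSLT-style value by concatenating the text nodes beneath it.

```python
from xml.dom import minidom

doc = minidom.parseString(
    b"<Book><Title>Professional ASP 3.0</Title>"
    b"<PubDate>October 1999</PubDate></Book>"
)


def string_value(node):
    """Concatenate all descendant text nodes in document order.

    Hypothetical helper approximating the value XSLT gives an element.
    """
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


title = doc.getElementsByTagName("Title")[0]

dom_value = title.nodeValue        # the DOM reports no value for an element
xslt_value = string_value(title)   # the text between the tags
book_value = string_value(doc.documentElement)  # text of all descendants

print(repr(dom_value))
print(repr(xslt_value))
print(repr(book_value))
```

Python's None plays the part of the DOM's null here. The last line shows why the stylesheet selects Title and PubDate separately: the value of the whole <Book> element runs the text of its children together.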

 

At this point we stop since we have no more <xsl:apply-templates/> elements. This means that no other elements in the source document will get processed, but then that's what we wanted.

 

So how are we constructing the result tree? Let's work this one from the bottom up. When we execute the template for <Book>, we create the new <Book> element, and then replace the line:

 

<xsl:value-of select="Title"/>, <xsl:value-of select="PubDate"/>

 

with the result of evaluating the statements. For the first book, that will be:

 

Designing Distributed Applications, May 1999

 

So overall, our result node will look like:

 

<Book>Designing Distributed Applications, May 1999</Book>

 

Since we have two <Book> elements in the source tree, we will get two <Book> elements in the result tree:

 

<Book>Designing Distributed Applications, May 1999</Book>
<Book>Professional ASP 3.0, October 1999</Book>

 

Similarly, in the template for <Catalog>, we will replace the line:

 

<xsl:apply-templates/>

 

with the results generated by executing the instruction. This will put the two <Book> elements we have created inside a <Books> element. The result tree now looks like this:

 

<Books>
  <Book>Designing Distributed Applications, May 1999</Book>
  <Book>Professional ASP 3.0, October 1999</Book>
</Books>

 

I have added the line breaks and formatting to make the output look better.

 

Moving back to the first template we came across, the one for the document root, we can see that this adds no further content, so our output is exactly as I have just shown.
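
The whole walk-through can be mimicked in a few lines of ordinary code. The Python sketch below is a toy, not an XSLT processor: each of the three templates is written by hand as a function, a dictionary stands in for pattern matching, and the function names and the result-root placeholder are all invented for the example. It also ignores text nodes and the built-in template rules that a real processor applies. What it does show is the recursion: each call to apply_templates finds the template for a child node, and that template adds nodes to the result tree.

```python
import xml.etree.ElementTree as ET

SOURCE = """<Catalog>
  <Book>
    <Title>Designing Distributed Applications</Title>
    <PubDate>May 1999</PubDate>
  </Book>
  <Book>
    <Title>Professional ASP 3.0</Title>
    <PubDate>October 1999</PubDate>
  </Book>
</Catalog>"""


def apply_templates(children, result_parent):
    """Stand-in for <xsl:apply-templates/>: process each child element."""
    for child in children:
        template = TEMPLATES.get(child.tag)
        if template is not None:
            template(child, result_parent)


def template_root(document_element):
    """match="/": adds no content itself, just processes the root's children."""
    holder = ET.Element("result-root")  # hypothetical placeholder for the result root
    apply_templates([document_element], holder)
    return holder[0]


def template_catalog(node, result_parent):
    """match="Catalog": emit <Books>, then process the children inside it."""
    books = ET.SubElement(result_parent, "Books")
    apply_templates(list(node), books)


def template_book(node, result_parent):
    """match="Book": emit <Book> holding the title, ', ' and the date."""
    book = ET.SubElement(result_parent, "Book")
    book.text = node.findtext("Title") + ", " + node.findtext("PubDate")


# A simple name lookup replaces real XPath pattern matching.
TEMPLATES = {"Catalog": template_catalog, "Book": template_book}

result = template_root(ET.fromstring(SOURCE))
output = ET.tostring(result, encoding="unicode")
print(output)
```

The serialized result has no line breaks, which is why the listings in the text were formatted by hand.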

 

In our processing model, we now break out of our cycle and format the output (step 5). In this case, we have no formatting, so there is no further processing. Later in this book, we will see how we can format using a standard web-browser and HTML, or using the Formatting Objects part of the XSL specification (XSL-FO).


 

Note that the XSLT specification says that "… XSLT is not intended as a completely general-purpose XML transformation language. Rather it is designed primarily for the kinds of transformations that are needed when XSLT is used as part of XSL." However, in the majority of cases, XSLT is used independently of XSL-FO, just as we are doing here and will do again when we produce HTML using XSLT. The specification acknowledges this with the statement "… XSLT is also designed to be used independently of XSL."

 

Using any of the processors described in Appendix E we can run the XSLT stylesheet with the XML. For example, if we now invoke XT with the command line:

 

xt shortcatalog.xml TitleAndDate.xsl TitleAndDate.xml

 

we produce a file TitleAndDate.xml with the content:

 

<?xml version="1.0" encoding="utf-8"?>
<Books>
  <Book>Designing Distributed Applications, May 1999</Book>
  <Book>Professional ASP 3.0, October 1999</Book>
</Books>

 

XT has put an XML declaration at the top, but otherwise it is exactly as we generated ourselves.



Copyright 2008 DevASP.com