XSLT Elements
In this section, we will recap all the elements we have used
so far, and meet several others. With these you will be able to create the vast
majority of the XSLT stylesheets you might want. In the next chapter, we will
meet more elements; some of these you will require less often, while others
provide more advanced functionality.
We will be looking at the most frequently used aspects of
these elements. For fuller descriptions refer either to the XSLT recommendation
or to a source such as the XSLT
Programmer's Reference 2nd Edition (ISBN 1-861005-06-7 from Wrox Press).
<xsl:stylesheet>
This is simply the container element for all other elements
within an XSL stylesheet. In most cases, this means it will be the document
element of a stylesheet document.
A stylesheet can be embedded within another document, in
which case an id
attribute of this element can be used to allow a reference to the stylesheet.
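As a rough sketch of what embedding can look like, the document below carries its own stylesheet and points to it with an xml-stylesheet processing instruction. The doc and para element names and the id value style1 are invented for illustration, and not every XSL processor supports embedded stylesheets, so treat this as an outline to experiment with, not a guaranteed recipe.

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xml" href="#style1"?>
<!-- The internal DTD subset declares id as an ID attribute so that
     the fragment reference #style1 can be resolved. -->
<!DOCTYPE doc [
  <!ATTLIST xsl:stylesheet id ID #REQUIRED>
]>
<doc>
  <xsl:stylesheet id="style1" version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Stop the stylesheet itself being processed as source content -->
    <xsl:template match="xsl:stylesheet"/>
    <!-- Hypothetical content element -->
    <xsl:template match="para">
      <p><xsl:value-of select="."/></p>
    </xsl:template>
  </xsl:stylesheet>
  <para>Some content to transform.</para>
</doc>
```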
The <xsl:stylesheet>
element must
contain a version
attribute, indicating the version of XSLT that is being used. Currently, this
is always 1.0 or 1.1.
The most
important change between version 1.0 and version 1.1 of XSLT is that the latter
allows multiple output documents to be created from a single XML source
document and stylesheet. The full list of changes is documented as an appendix
to the XSLT 1.1 specification (http://www.w3.org/TR/xslt11), which is at working draft stage at the time of
writing.
The element is also likely to contain several namespaces, as
we saw above. Firstly, there will be the XSLT namespace itself to tell the XSL
processor which elements to process and which to pass unchanged to the output
tree. Then there might be a namespace for XSL formatting objects (which we will
look at in Chapter 9), namespaces for elements and attributes we will be
matching in the source document, and namespaces for elements we might be
creating in the output document. For example, if we wanted to use a stylesheet
to document an XML Schema document, producing HTML output, our xsl:stylesheet
start tag might look like this:
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema"
    xmlns="http://www.w3.org/TR/REC-html40">
In this case, we have defined HTML as our default namespace,
and used explicit qualifiers for the XSLT and XML Schema namespaces.
Other optional attributes of the <xsl:stylesheet>
element relate to namespace prefixes in the result tree and extension elements
(which we will meet later in the book).
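One of these optional attributes is exclude-result-prefixes. The sketch below assumes a source document whose document element is xsd:schema with xsd:element children; the xsd namespace URI is the one used in the start tag shown earlier and must match whatever the source document actually uses. Without the attribute, the xsd namespace declaration would normally be copied onto the literal result element ul, even though the output never uses it.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema"
    exclude-result-prefixes="xsd">
  <xsl:template match="/">
    <!-- ul is a literal result element; the xsd declaration is
         kept off it because of exclude-result-prefixes -->
    <ul>
      <xsl:for-each select="xsd:schema/xsd:element">
        <li><xsl:value-of select="@name"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```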
<xsl:output>
This element is used to inform the XSL processor of the format
of the result tree. Earlier we used the
following example (count.xsl) without using <xsl:output>:
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="PLAY">
    <HTML>
      <HEAD>
        <TITLE>Counting</TITLE>
      </HEAD>
      <BODY>
        ...
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
We used various HTML elements directly in our stylesheet,
and these were copied directly to the result tree. Since the stylesheet is an
XML
document, the HTML included in it must itself be well-formed XML, and hence the
HTML copied to the result tree will also be well-formed XML. This well-formed
XML could meet the rules of the Extensible HyperText Markup Language, XHTML
(see Beginning XHTML, ISBN
1-861003-43-9), but this is not essential to the operation of the stylesheet.
If the transformed result tree is being saved as a file, the
last stage of XSLT processing will be to serialize the result tree. It seems
reasonable to assume that this will also be well-formed XML. In most cases,
this would not cause problems, but some HTML browsers (particularly older ones)
have difficulty with constructs such as <HR/>,
preferring just the
opening tag <HR>
without a closing tag for a horizontal rule. HTML also allows attributes
without values (as in <OPTION selected>).
Again, this is not
well-formed XML, but some browsers will object to the alternative form of <OPTION
selected="selected">.
For this reason, alternative forms of serialization are supported through the <xsl:output>
element. When, as in the example above, the element is omitted, the serialized
output will obey rules we will look at later. In this case, it would be XML.
This element has an optional method
attribute that specifies the form that the serialization should take. The three
possible values are xml, html
and text.
The xml
option is simple enough: the serialized output will be well-formed XML. The html
option handles the cases shown above by converting the tags to the more normal
HTML styles of <HR>
and <OPTION
selected>,
and the text
method provides a pure text output, removing all tags and converting entity and
character references to their text equivalents.
Let's take a simple stylesheet, output.xsl,
that creates some HTML, and try the different forms of <xsl:output>.
Since we now
know more about the use of namespaces in XSL, we will also declare the
unqualified
names copied to the output to be in the HTML namespace.
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/TR/REC-html40">
  <xsl:output method="html"
      indent="yes"/>
  <xsl:template match="/">
    <HTML>
      <HEAD><TITLE>Testing the xsl:output element</TITLE></HEAD>
      <BODY>
        <P>
          This is a simple stylesheet to show the effect of the xsl:output
          element.
          There is an &lt;HR/&gt;
          element after this line.
        </P>
        <HR/>
        <SELECT>
          <OPTION value="1">First option</OPTION>
          <OPTION selected="selected" value="2">Second (selected) option</OPTION>
          <OPTION value="3">Third option</OPTION>
        </SELECT>
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
In general in this chapter, we will leave out the HTML
namespace declaration for simplicity.
We have specified our
output method as html.
We have also specified that we want the result indented, by putting in the
attribute indent="yes".
Although the XSLT recommendation does not specify what action the XSL processor
should take as a result of this, with some processors it can make the result
more readable.
What source XML shall we apply
this to? The answer is any! This is the ultimate pull model stylesheet: it
totally ignores the input XML, creating its result tree based
only on the content of the stylesheet. Since we have some XML we used earlier,
we can run the stylesheet using:
xt hamlet.xml output.xsl output.htm
We are not really interested in what the resulting HTML
looks like when rendered. What we want to see is the HTML code produced (output.htm):
<HTML
xmlns="http://www.w3.org/TR/REC-html40">
<HEAD>
<TITLE>Testing the xsl:output
element</TITLE>
</HEAD>
<BODY>
<P>
This is a simple stylesheet to show the effect of the xsl:output
element.
There is an &lt;HR/&gt; element after this line.
</P>
<HR>
<SELECT><OPTION
value="1">First option</OPTION><OPTION selected
value="2">Second (selected) option</OPTION><OPTION
value="3">Third option</OPTION></SELECT>
</BODY>
</HTML>
Note that this is "traditional" HTML. If we want
to create XHTML, which is well-formed XML, we should use an output method of xml
since the changes made by the html
method do not create a
well-formed XML document. We would also have to make other changes to the
stylesheet, such as changing all element names to lower case.
Now, changing just one line of output.xsl
will switch the output method to xml:
<xsl:output method="xml"
indent="yes"/>
Save the file as output2.xsl.
The result of using this
stylesheet is that the <HR>
element is changed to the
empty tag syntax and the selected
attribute of the <OPTION>
element is given its attribute value:
<?xml version="1.0"
encoding="utf-8"?>
<HTML
xmlns="http://www.w3.org/TR/REC-html40">
<HEAD>
<TITLE>Testing the xsl:output
element</TITLE>
</HEAD>
<BODY>
<P>
This is a simple stylesheet to show the effect of the xsl:output
element.
There is an &lt;HR/&gt; element after this line.
</P>
<HR/>
<SELECT>
<OPTION value="1">First
option</OPTION>
<OPTION selected="selected"
value="2">Second (selected) option</OPTION>
<OPTION value="3">Third
option</OPTION>
</SELECT>
</BODY>
</HTML>
Note that, although I have specified my output as XML, this
does not mean I am producing XHTML. That is up to me to control, by obeying
XHTML rules such as using lower case for all tag names.
With XT, the indent
attribute did not affect the
HTML output, but it improved the layout of the XML version. With any XSL
processor, it is best to experiment to see the difference this attribute makes.
Finally, let's make one more change to our output.xsl:
<xsl:output method="text"
indent="yes"/>
Save the file as output3.xsl.
Our result when using XT
now looks like this:
Testing the xsl:output element
This is a simple stylesheet to show the effect of the xsl:output
element.
There is an <HR/> element after this line.
First optionSecond (selected) optionThird option
All the tags have now been removed, and the entity
references &lt;
and &gt;
have now been replaced by their corresponding characters. This is clearly
important if we are, for example, using our stylesheet to create a
comma-separated file from an XML input document. Another attribute of <xsl:output>
that helps under these circumstances is the encoding
attribute. This allows us to
specify a character set such as iso-8859-1
for our output. With the text method, any
character that cannot be represented in this encoding should cause an error to be reported.
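To make the comma-separated case concrete, here is a minimal sketch that combines the text method with an encoding. It assumes a hypothetical source document with a people document element containing person elements, each with name and age children; none of these names come from the examples in this chapter. Note that it makes no attempt to quote values that themselves contain commas.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="iso-8859-1"/>
  <xsl:template match="/">
    <!-- One output line per (hypothetical) person element -->
    <xsl:for-each select="people/person">
      <xsl:value-of select="name"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="age"/>
      <!-- &#10; is a line feed, ending the record -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Whitespace-only text between the instructions is stripped from the stylesheet, so the only separators written are the ones placed explicitly inside xsl:text.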
Earlier, we were using XSLT without the <xsl:output>
element and creating well-formed XML. This is normally the default, but if the
result tree meets all of the following three criteria, the serialized output
will be HTML by default:
• the root node has at least one element child
• the expanded-name of the first element child of the root node of the result tree has local part html (in any combination of upper and lower case) and a null namespace URI
• any text nodes preceding the first element child of the root node of the result tree contain only whitespace characters
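As an illustration of these criteria, the following sketch has no xsl:output element, yet a conforming processor should serialize its result as HTML, because the first element of the result tree is html in no namespace. The title text is invented for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- No xsl:output element: the default method is decided
       from the shape of the result tree -->
  <xsl:template match="/">
    <html>
      <head><title>Default output method</title></head>
      <body>
        <hr/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

If a default namespace such as xmlns="http://www.w3.org/TR/REC-html40" were declared on the stylesheet, the html element would no longer have a null namespace URI, the second criterion would fail, and the default would revert to XML.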
Other attributes of <xsl:output>
provide much more
control over the output. They can:
• define the version of XML or HTML being created
• control aspects of the XML declaration
• indicate the SYSTEM and PUBLIC identifiers of the DOCTYPE
• control the MIME type of the output
• control how CDATA sections are handled
These are described
in detail in the XSLT recommendation and in reference books such as the XSLT
Programmer's Reference.
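To give a feel for these attributes together, here is a sketch of an xsl:output element that sets most of them. The report and code element names, the public identifier, and the report.dtd system identifier are all made up for illustration; the attribute names themselves are those defined by XSLT 1.0.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"
      version="1.0"
      omit-xml-declaration="no"
      standalone="no"
      doctype-public="-//EXAMPLE//DTD Report 1.0//EN"
      doctype-system="report.dtd"
      media-type="text/xml"
      cdata-section-elements="code"/>
  <xsl:template match="/">
    <report>
      <!-- Text inside code should be written as a CDATA section
           because code is listed in cdata-section-elements -->
      <code>if (a &lt; b) { swap(a, b); }</code>
    </report>
  </xsl:template>
</xsl:stylesheet>
```

With a processor that honors all of these, we would expect the output to start with an XML declaration and a DOCTYPE carrying both identifiers, and the content of code to appear inside a CDATA section instead of being escaped.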