Background

I started working on XSLT 1.0 Stylesheets for DocBook well before XSLT 1.0 was a Recommendation. I had worked with DSSSL, one of XSLT’s precursors before that, and a variety of other formatting systems, including one that I wrote myself. I started working on the XSLT 2.0 Stylesheets for DocBook not long before XSLT 2.0 became a Recommendation. I wrote most of DocBook xslTNG (DocBook XSLT Stylesheets: The Next Generation) just a month or so before the third anniversary of the XSLT 3.0 Recommendation.

Why did it take so long?

To answer that question, we need to reflect for a moment on XSLT and its place in the XML ecosystem. When XSLT arrived on the scene, we were near the peak of XML enthusiasm. Not only was XML supported everywhere, it was possible to imagine XSLT everywhere as well. Certainly, the presence of XSLT in the browser felt significant at the time.

The ubiquity of XML and the fact that XSLT was “just an XML vocabulary” may have contributed to another significant phenomenon: lots of users who did not self identify as programmers were learning to use XSLT and doing significant things with it.

There were other tools available for transforming markup at the time, and arguably some of them were better than XSLT, but they were programming languages and you had to be a programmer to use them. They were also mostly commercial applications not widely available to casual users.

XSLT was free, it was everywhere, and it was used by everyone, not “just” programmers. It was the clear winner than and remains the clear winner today in terms of markup transformation.

You could do a lot of things with XSLT 1.0. You could do a lot more things than you might at first even have thought possible. (In fact, you could do all things, but the Turing complete nature of XSLT isn’t relevant here.) Some very common tasks, like grouping, were possible but difficult. Lots of very useful things were either not possible or required extensions: regular expressions, functions, date and time formatting, creating special characters in the output, to name just a few.

XSLT 2.0 solved all of these problems (and more). Significantly, I think, all of these new features appealed directly to almost all users of XSLT 1.0. Everyone had encountered a grouping problem (building an index, for example). Everyone had wanted regular expression matching or date formatting. Lots of users wanted to write more sophisticated predicates (and many were willing to learn how to use functions to achieve that result).

XSLT30 arguably introduces larger and more dramatic features than XSLT 2.0 did. There are a bunch of new features designed to enable streaming processing; there are significant software engineering improvements: packaging, exception handling, and assertions; there are common programming language constructs like maps and arrays. There is also a selection of features inherited from updates to XPath (new functions, a subset of let syntax, and support for higher order functions, for example).

What’s curious, I think, is that many of these features are probably less immediately appealing to many (most?) current users. XSLT 2.0 doesn’t feel constraining in the same way that XSLT 1.0 did, and the features in XSLT 3.0 don’t immediately and obviously solve problems that most users have.

Streaming, for example, is incredibly powerful and it’s an important and significant milestone in markup processing. It makes it possible to solve whole classes of problems that were previously impossible to solve or required enormously expensive hardware. But my laptop will quite easily process a book full of complex markup that runs to hundreds of pages. I don’t have any problems that require a streaming processor.

Likewise, packaging is useful and important. The DocBook xslTNG stylesheets should absolutely be a package. But that’s not true of a lot of stylesheets. There might be software engineering benefit in making a package even for stylesheets that you don’t intend to distribute, but that’s more likely to appeal to people who think of what they’re doing is programming.

Nevertheless, there are lots of good reasons to use XSLT 3.0 even if you are “only” transforming documents and even if you don’t think of writing transformations as programming.

How did I get here?

This story begins, as many stories do, with a bug and a coincidence. The bug is this presentation:

Figure 1: Callouts, badly rendered

which should be more like this:

Figure 2: Callouts, correctly rendered

This bug arises in the DocBook XSLT 2.0 Stylesheets’ failed attempt to process programlistingco, an element with quite complex semantics.

The coincidence is that just a few days before I found this bug, I had been thinking about whether or not it was time to consider upgrading the DocBook stylesheets that I maintain to XSLT 3.0 (and specifically, what I should call them if I did that since putting “XSLT 2.0” in the name had some pretty significant implications).

It had been a long while since I worked on programlistingco in the XSLT 2.0 stylesheets, but having some idea of how tricky it was to implement gave rise to the question, “would it be much easier in XSLT 3.0?” After a brief exploration, I concluded that the answer was “yes”. With one foot solidly down the slippery slope, I began to explore other questions. Before long, I was undertaking to reimplement the entire stylesheet from scratch in XSLT 3.0.

The plan

The DocBook XSLT 1.0 stylesheets grew organically over many years and from DSSSL stylesheets that preceded them. They support a wide range of output formats, some now moribund, and have hundreds of parameters.

The DocBook XSLT 2.0 stylesheets very definitely started as an attempt to upgrade the XSLT 1.0 stylesheets. Although some simplification was possible, a good deal of complexity was carried forward. In principle the goal was to produce both HTML and XSL FO, although the XSL FO stylesheets never really got the attention they needed.

The DocBook xslTNG stylesheets are a complete rewrite, mostly from scratch, with the following goals:

  • A full set of tests

  • A full set of documentation

  • Designed for HTML5 on modern browsers

  • Designed for accessibility

  • Paged media output through HTML+CSS with customization and/or post-processing

  • EPUB output through customization and/or post-processing

The paper

Despite having served on the XSLT Working Group at the W3C and having read and reviewed the specification countless times, it had been about three years since I thought about XSLT 3.0. There’s also an enormous gulf between reading a specification and actually writing in the language it specifies.

In my mind, XSLT 3.0 was an incremental improvement on XSLT 2.0. Its big ticket items (streaming, packaging, maps, arrays, higher-order functions) were cool, but they didn’t seem immediately useful in “ordinary” XSLT use cases.

As I started working on the new stylesheets, I kept coming across features that made doing ordinary XSLT easier and better. By the time I’d come across a half-a-dozen or so of these features (large and small), I was firmly convinced that it was time to embrace XSLT 3.0 wholeheartedly.

This paper sets out to describe the features I found and hopes to persuade you that XSLT 3.0 is something you should embrace now, if you haven’t already. This paper does not attempt to provide a comprehensive survey of XSLT 3.0 features: I’ve specifically chosen the features that seemed most immediately applicable to transforming an “ordinary” markup vocabulary.

I’ve tried to organize the features in order of increasing complexity, but what seems simple and what seems complex will vary depending on the reader’s background. A passing familiarity with XSLT 2.0 features (functions, in particular) is assumed.

As noted above, XSLT 1.0 is Turing complete. Nothing described here as an XSLT 3.0 feature is impossible to achieve with XSLT 2.0 (or even 1.0). Some of the features may even seem “obvious” to the reader. That’s ok. The goal is to present the surface area of XSLT 3.0 as useful and inviting.

Value templates

If you’ve used XSLT at all, you’ve almost certainly used attribute value templates. That’s the feature that allows you to put an expression in curly braces in an attribute value and have that expression evaluated by the processor:

<xsl:template match="someElement">
  <span class="{local-name(.)}">
    <xsl:value-of select="3+4"/>
  </span>
</xsl:template>

That template will produce “<span class="someElement">7</span>”. XSLT 3.0 allows value templates to appear in text content as well. There’s a flag, [xsl:]expand-text, to control whether or not you want this behavior:

<xsl:template match="someElement" expand-text="yes">
  <span class="{local-name(.)}">{3+4}</span>
</xsl:template>

In that template, the expression “3+4” in curly braces will also be evaluated and the string value of the result inserted into the result tree.

Better debugging

Developing software in an interactive IDE may be the easiest way to debug it, but eventually your software runs “in the wild.” One common approach for debugging outside an IDE is to add xsl:message statements that print out useful debugging information:

<xsl:template match="*" mode="someMode">
  <xsl:param name="option" select="()"/>

  <xsl:message>
    <xsl:value-of select="local-name(.)"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="$option"/>
  </xsl:message>

  <xsl:text>Transformation</xsl:text>
</xsl:template>

That’s fine, except if you do that a lot, you end up with a lot of messages. And if you’re providing stylesheets to other users, they may find the debugging messages confusing or even intimidating.

How many readers have stylesheet that looks like this?

<xsl:template match="*" mode="someMode">
  <xsl:param name="option" select="()"/>

  <!--
  <xsl:message>
    <xsl:value-of select="local-name(.)"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="$option"/>
  </xsl:message>
  -->

  <xsl:text>Transformation</xsl:text>
</xsl:template>

That’s fine too, except that the stylesheet has to be edited to enable the debugging messages when something goes wrong.

Static parameters offer a much cleaner and nicer solution. Declare a static top-level “debug” parameter:

<xsl:param name="debug" select="''" static="yes"/>

Then in your template, you can use use-when:

<xsl:template match="*" mode="someMode">
  <xsl:param name="option" select="()"/>

  <xsl:message use-when="$debug = 'someMode'">
    <xsl:value-of select="local-name(.)"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="$option"/>
  </xsl:message>

  <xsl:text>Transformation</xsl:text>
</xsl:template>

If the $debug parameter isn’t “someMode”, the XSLT compiler will discard that message, it won’t incur any runtime overhead or potentially introduce any sorts of errors.

But if you set the $debug parameter to “someMode” when you run (technically, compile) the stylesheet, then you’ll get debugging output. This is a significant improvement over comments and a performance and correctness improvement over using xsl:if to evaluate the test conditions dynamically every time.

By declaring a parameter (or variable) static, you’re asserting that its value can be determined without reference to the source document. This means you can’t, for example, use this technique to enable debugging only in documents that satisfy some XPath expression. For that, you’ll still have to use dynamic tests.

The DocBook xslTNG stylesheets use this technique frequently, defining a whole list of potential debug flags that can be enabled for a particular run.

Better messages

In XSLT 3.0, the xsl:message instruction has a select attribute. The message example in the preceding section can be further simplified to:

<xsl:message use-when="$debug = 'someMode'"
             select="local-name(.) || ' ' || $option"/>

Notice also, || as a string concatenation operator. In this context, we could have used commas because a sequence of values is fine, but in other places, you’ll find || a significant convenience over concat().

Exception handling

Errors, as the popular expression observes, happen. Dealing with them can be tedious and introduces complexity that may obscure the function of our code, introduce errors, or both.

Consider a stylesheet that reads some optional configuration from an external file. It may be that in the overwhelming majority of cases, the file exists and simply reading it will succeed:

<xsl:variable name="accounts"
              select="doc('accounts-' || @id || '.xml')"/>

But on the rare occasion when the file does not exist, the stylesheet will fail and processing will stop. To avoid this, we have to check if the file exists before we attempt to read it:

<xsl:variable name="acct-table">
  <xsl:choose>
    <xsl:when test="doc-available('accounts-' || @id || '.xml')">
      <xsl:sequence
          select="doc('accounts-' || @id || '.xml')"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:document/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:variable>

The new try/catch mechanism gives us a better approach. The semantics of try/catch are that the processor attempts to evaluate the code in the “try”. If it succeeds, that’s the result of the try/catch. If that code raises an error, the error is ignored and the following “catch” (or catches) are attempted. If one of them succeeds, that’s the result of the try/catch. (If none succeed or no relevant catches were present, the whole try/catch fails and its error propagates to where it was called.)

(The name “catch” arises from the metaphor of errors (that is to say, exceptions to normal processing) being “thrown”. “Thrown,” in turn, arises from the fact, as we’ll see, that exception handling may be moved quite a distance from the location where the error arises.)

Here’s the try/catch version:

<xsl:variable name="acct-table" as="document-node()?">
  <xsl:try>
    <xsl:sequence select="doc('accounts-' || @id || '.xml')"/>
    <xsl:catch errors="err:FODC0002" select="()"/>
  </xsl:try>
</xsl:variable>

There’s less redundancy in the code, so fewer opportunities for error, and less processing in the normal case where the document exists.

Note also the use of an error code on the xsl:catch. Error err:FODC0002 is the error code for “file not found”. What this means is that the try/catch will successfully recover from a missing file but will still raise an error if some other problem arises (such as a permissions problem on the file).

You can provide multiple xsl:catch instructions for different error codes. Best practice is to catch the specific errors that you are anticipating. Overly broad catch instructions can obscure bugs later on.

Raise exceptions

Not only can you catch exceptions, you can raise them. In fact, the ability to raise them has existed since XPath 1.0, but it’s much more useful now. In the context of the DocBook stylesheets, for example, this can arise in processing CALS tables (TR9502).

Tables are complex structures and errors can arise that aren’t easily captured during validation with either grammar or rule based validators. Where previous versions of the stylesheets simply threw up their hands with an xsl:message that terminated the stylesheet, the DocBook xslTNG stylesheets raise an exception.

In ordinary usage, this has much the same effect. If you run the stylesheets directly, the exception won’t be caught and processing will terminate.

But consider the case where the stylesheets are part of a larger work flow. Perhaps you’re building a system that transforms Word documents into XML and then further transforms them in some way. If you’re relying on the DocBook stylesheets for part of the table processing, the fact that table processing raises an exception means that you can use try/catch to detect and potentially recover from the errors.

The DocBook xslTNG stylesheets define a collection of standard error codes in an errors namespace so that users can predict what errors might occur.

Default modes

Modes allow a stylesheet writer to process elements in different ways. One common use is to process some content in different ways. The chapter and section hierarchy, for example, appears in both the main, narrative flow of the document and in the table of contents. That can be accomplished with modes.

In the case of DocBook, there are also a number of elements that need to be processed differently depending on stylesheet options and sometimes document content. The function synopsis elements, for example, can be rendered in “K&R style,” the format used by Kernighan and Richie in their original documentation for the C programming language, or in “ANSI style” which is a slightly different presentation of the same information. Modes can be used here as well.

Modes are very easy to use: you simply put a mode attribute on the templates in that mode and a mode attribute on the xsl:apply-templates that calls them.

Except for that one time where you leave off a mode attribute and either the template is in the default mode or the xsl:apply-templates jumps you back into the default mode.

Using default-mode on the xsl:stylesheet element (or xsl:transform element, if you prefer) makes that mode the default mode for the scope of that stylesheet. Put all your table-of-contents processing in toc.xsl, set the default mode, and never worry again about forgetting a mode attribute.

It’s worth mentioning that this can lead to the opposite problem: failing to place a mode attribute where you need one. In particular, if you customize a stylesheet that uses a default mode (by importing it into your stylesheet), you either need to use the same default mode in your stylesheet or remember to add mode attributes to the templates you’re overriding. It took me a good few minutes to work that out the first time I made that mistake.

Evaluate XPath expressions dyamically

The XPath expressions that you write in your stylesheet (in select attributes, in match attributes, etc.) are evaluated by the processor. You can use variables and functions in those expressions to introduce a degree of flexibility, but the expressions themselves are determined at compile time.

The xsl:evaluate instruction allows you to construct an XPath expression at runtime and evaluate it. This turned out to be useful in lots of different places in the DocBook stylesheets.

There are lots of different ways to format a document and the DocBook stylesheets have always tried to be flexible. In the early days of XSLT 1.0, when XSLT experience was uncommon, making a stylesheet option or parameter to control some aspect of behavior put it within the reach of users who weren’t prepared to write their own custom driver stylesheet with a few override templates.

XSLT experience is a lot more common now, but there’s still a desire to make the amount of customization necessary as small as is practical.

One feature of the stylesheets that always pushed the limits in this regard is the ability to break a document into different files or “chunks”. Instead of producing a single, large HTML document for a book, we might wish to produce a small web of documents linked together.

Lots (and lots) of options would be necessary to cover even a subset of the possible behaviors: chunk preface, chapter, appendix? Chunk sections? To how many levels? Chunk articles? Chunk parts? Chunk reference pages?

Even assuming you could cover a substantial subset of the problem space with options, and assuming the relationships between the options is comprehensible, invariably special cases arise: chunk on first-level sections, unless the chapter contains only a single section, in which case keep that section in the chapter chunk.

Practically speaking, the stylesheets have to stop adding options at some point and push the burden onto the user to write a stylesheet that answers the chunking questions. This is a doubly burdensome on the user because not only does it require moving from the “I just have to set options” level of skill to the “I have to write templates” level of skill, the templates that need to be written aren’t simple. They have to fit into the intricate framework that determines chunk boundaries.

Evaluating XPath expressions at runtime greatly simplifies this problem. Now we can say there are two parameters: the first is a list of XPath expressions that identify what elements are included in chunks. The second is a list of XPath expressions that identify what elements to exclude.

The default values for the DocBook xslTNG parameters look (roughly) like this:

<xsl:param name="chunk-include" as="xs:string*"
           select="('parent::db:set',
                    'parent::db:book',
                    'parent::db:part',
                    'parent::db:reference',
                    'self::db:section')"/>

<xsl:param name="chunk-exclude" as="xs:string*"
           select="('self::db:partintro',
                    'self::*[ancestor::db:partintro]',
    'self::db:section
        [parent::db:chapter
         and not(preceding-sibling::db:section)
         and not(following-sibling::db:section)]'"/>

The first parameter define all of the children of set, book, part, and reference as chunks and all section elements as chunks. The second parameter makes an exception for the partintro element (which is a child of part) and any of its descendants, and any section of a chapter if it’s the only section.

You may be wondering how this addresses the question of chunking at multiple levels of section and how much complexity that introduces. After all, while it may be easier to write these parameters than it is write a stylesheet module, it still requires a fairly solid understanding of XPath and the structure of DocBook.

The short answers are: “it doesn’t” and “quite a bit”. Determining the level of section at which to chunk is so common, and it would introduce significant complexity in the patterns, so there’s still a simple $chunk-section-depth parameter to handle that.

Other places where it’s convenient in the DocBook stylesheets to use xsl:evaluate include formatting title pages, formatting titles, and formatting cross-references to titles,

Parse XML and JSON

Like dynamic XPath evaluation, the ability to parse XML and JSON dynamically can be useful. In the context of the DocBook xslTNG, this is used for syntax highlighting program listings.

The stylesheets use an external program, Pygments, to add syntax highlighting to program listings that have a language attribute. If a program listing claims to be C source code (or Python or XML or any of a very wide variety of other languages), the listing is sent off to Pygments for highlighting.

Pygments returns the listing decorated with inline HTML markup and classes that add colors to strings, literals, keywords, variables, etc. Or it would if shipping markup around was a first class operation. What it actually returns is a bunch of text that happens to have angle brackets in the right places.

The XPath parse-xml function means the stylesheets can interpret that markup without relying on an extension function to parse it. The same would be true of externally generated JSON or markup extracted from a quoted string somewhere, for example.

Performance

XSLT 3.0 introduces caching, a means by which the stylesheet author can identify functions which would benefit from being evaluated only once. Consider:

<xsl:function name="lookup-tag" cache="yes">
  <xsl:param name="tag" as="xs:string"/>
  …
</xsl:function>

Enabling caching is an assertion on the stylesheet author’s part that the result of the function depends solely on its parameters, and that if the processor has calculated the return value for a particular set of parameters once it can return that value immediately if the function is called again with the same set of parameters. Critically: it does not have to evaluate the body of the function a second time.

The reference documentation for DocBook, DocBook: The Definitive Guide, contains many (many, many) uses of the tag element. For example:

<para>Paragraphs of prose in DocBook are identified
with the <tag>para</tag> tag unlike the more familiar
<tag>p<tag> tag of HTML.</para>

The formatting expectation is that the word “para” will become a link to the reference page for the para element. It isn’t explicitly authored as a link because it would have been incredibly tedious to do so. As you can see, tag is used for its semantic purpose (this is a tag in a markup vocabulary) and the special processing for DocBook elements only applies to some uses.

This means that every time the stylesheets encounter a tag element they have to determine if the named element is a DocBook element. That isn’t a difficult operation, but a little profiling revealed that it was being performed almost 300,000 times (There are more than 63,000 occurrences of tag in the book.)

Simply adding cache="yes" to the lookup function reduced processing time by a factor of four. Formatting the book used to take almost 20 minutes, now it takes less than five.

Maps and Arrays

XSLT 3.0 introduces two common programming language features: maps (or dictionaries or hashes: they go by a variety of names) and arrays. If you’re already familiar with them from some other programming language, there are no surprises here. The good news, if you don’t think of what you do with XSLT as programming, is that most XSLT users already have some experience with map-like and array-like structures, even if they never thought of them in those terms.

Maps have a lot in common, at least conceptually, with keys. Given a key, they return the value associated with that key. If you’ve used xsl:key, you’re ready to use maps. Whereas xsl:key only allows you to lookup nodes in a document, maps allow you to construct arbitrary key/value pairs.

Arrays have a lot in common, again, at least conceptually, with sequences. One of the significant distinctions between arrays and sequences is that arrays can be nested. You can put an array inside an array and it isn’t collapsed into a single array the way a sequence collapses into another sequence. You can put arrays inside arrays (inside arrays, if you wish) to make two and three and higher dimensional structures.

You can even combine the two and it is often useful to do so: you can have an array of maps and you can have an array as the value of a key in a map.

Solving programlistingco

This brings me back to the bug that started it all. The element that the XSLT 2.0 stylesheets were failing to process is called programlistingco: program listing with callouts. Here’s what it looks like:

<programlistingco>
<areaspec>
<area xml:id="gs1-d1" coords="4 50" units="linecolumn"/>
<area xml:id="gs1-n1" coords="6 50" units="linecolumn"/>
</areaspec>

<programlisting
><xi:include parse="text" href="examples/custlayer.rnc"/
></programlisting>

<calloutlist>
<callout arearefs="gs1-d1" xml:id="list_gs1-d1">
  <para>Start by importing the base DocBook schema.</para>
</callout>
<callout arearefs="gs1-n1" xml:id="list_gs1-n1">
  <para>Then you can add new patterns or augment existing
  patterns.</para>
</callout>
</calloutlist>
</programlistingco>

The critical observation here is the callout marks, ① and ②, are applied to the program listing after it’s loaded from an external file with XInclude. The program listing doesn’t contain any markup and can be run and validated as a working example independent of its use in the document. This is very powerful, if a considerable challenge to implement.

The coords describe where the marks should go. There are various options for the marks, but Unicode characters are the default. To place the “①” on line 4 at column 50, the processor has to break the listing into lines and then break the lines into characters. It then has to find the 49th character, adding extra blanks to the line if necessary, and insert the callout (which might involve markup, such as an image) into the line. It then has to repeat the process for the next callout.

This is, in principle, a problem that can be solved with sequences, but it’s much easier with arrays. Part of the problem with using sequences has to do with the fact that sequences don’t nest: you can’t, for example, use the empty sequence to mark a special point and you can’t put two consecutive items in the sequence without putting a wrapper around them so that they’re a single item.

If you didn’t have arrays, or didn’t want to use arrays, and you wanted to avoid the complexities involved in using sequences, one of the first ideas you might have is to use markup. It’s not too difficult to see that “listing” document containing “line” elements containing “char” children could represent the lines and columns with complete fidelity. In fact, the power of XPath would make navigating around in the XML structure even easier than using arrays.

The big problem with that approach is node identity. If you stick the nodes in another tree structure, they aren’t the same nodes you started with. This can be problematic if you want to use key() to find them or if you want to test their ancestors. Maps and arrays preserve the identity of the nodes you put in them.

The other place where this kind of array-based processing really shines is another really complicated bit of markup: CALS tables. The DocBook xslTNG stylesheets take a two-pass approach to processing tables. The first pass decomposes all of the complexities of rows and spans into a flat array of arrays. The second pass processes the markup inside the cells. Node identity is even more critical here as there are more likely to be ID/IDREF links to other parts of the document and there are rich structures like footnotes and nested tables to be handled.

xsl:iterate

The xsl:iterate instruction was a late addition to this paper. If you’ve worked on substantial XSLT 2.0 stylesheets, you’ve probably already worked out how to accomplish what xsl:iterate does without it. On the other hand, if recursive functions are something you’ve struggled to understand, you are likely to be very pleased.

In most (non-functional) programming languages, dealing with a task like searching for a value in a list or iterating over a sequence until some condition occurs are handled with loops and mutable variables. XSLT doesn’t have mutable variables. Variables in XSLT are “single assignment”: a single value is assigned to them when they’re created and that can never be changed.

Luckily, we have a very powerful query language, XPath, and so we can very often make selections without ever explicitly iterating over a list. But sometimes the operation you need to perform is too complicated to express in XPath or requires operations that XPath queries can’t perform.

The traditional answer to this kind of problem in functional programming languages is recursive functions (ignoring special cases where map and fold operations will suffice).

The xsl:iterate instruction basically takes the structure of this particular kind of recursive function and expresses it declaratively.

In DocBook, numbered lists can be nested and those nested lists can be mixed together with other elements (so a list item might consist of two paragraphs followed by a sub-list).

Consider the problem of working out the numeration for a list item. If the context list item is a third-level item that is the first item of a list that is in the second item of its parent list that is in the fifth item of its parent list, then its numeration of that item is (5, 2, 1). You would need this, for example, to construct a cross reference to that item: “5.b.i”.

One way to work out the numeration is to begin with an empty list and walk up the tree from the initial context item:

  1. If the item you’re at is a list item in a numbered list, then add its number to the beginning of the numeration list and continue.

  2. If the item is anything else, it has no effect on the numeration, just continue with the current list.

“Begin with an empty list” and “add its number” both sound like mutation, but they don’t have to be. You can solve this problem with a recursive function that builds up the values as it goes and returns them all when it finishes.

It’s much more straightforward to do this with xsl:iterate:

<xsl:function name="f:orderedlist-number" as="xs:integer+">
  <xsl:param name="node" as="element(db:listitem)"/>
  <xsl:iterate select="reverse($node/ancestor-or-self::*)">
    <xsl:param name="number" select="()"/>
    <xsl:on-completion select="$number"/>
    <xsl:choose>
      <xsl:when test="self::db:listitem[parent::db:orderedlist]">
        <xsl:next-iteration>
          <xsl:with-param name="number"
                          select="(count(preceding-sibling::db:listitem)
                                   + f:orderedlist-startingnumber(parent::db:orderedlist),
                                   $number)"/>
        </xsl:next-iteration>
      </xsl:when>
      <xsl:otherwise>
        <xsl:next-iteration>
          <xsl:with-param name="number" select="$number"/>
        </xsl:next-iteration>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:iterate>
</xsl:function>
  1. Iterate over the list of our ancestors (reversed so that we’re walking “up” the tree, not down it).

  2. Start with an initially empty list of numbers.

  3. Return that list when we’ve completed all the iterations.

  4. If the current item is a list in a numbered list, construct a new list that consists of the number of this item followed by all the other numbers we’ve seen so far. Pass that value to the next iteration.

  5. Otherwise, pass the current list to the next iteration.

It’s still passing a context forward as it goes, but it doesn’t require understanding function recursion directly and it actually prevents you from making a mistake in your function that prevents it from being tail recursive. (Tail recursion is a specific property of some recursive functions and if a function is tail recursive, the processor can optimize it in ways that it can’t optimize a function that isn’t provably tail recursive.)

New operators

XPath 3.0 introduced the “||” and “!” operators, XPath 3.1 introduced the “=>“ operator. These operators provide syntactically compact alternatives for behavior available in other ways. For the examples that follow, assume $doc is a variable initialized this way:

<xsl:variable name="doc" as="element(doc)">
  <doc>
    <p>one</p>
    <p>two</p>
    <p>three</p>
  </doc>
</xsl:variable>

The double vertical bar operator performs string concatenation. This expression:

<xsl:sequence select="concat('count: ', count($doc/*))"/>

can be written like this with the “||” operator:

<xsl:sequence select="'count: ' || count($doc/*)"/>

The exclamation mark is a simplified form of loop. A loop like this one:

<xsl:for-each select="$doc/p">
  <xsl:sequence select="string-length(.)"/>
</xsl:for-each>

Can be written as:

<xsl:sequence select="$doc/p ! string-length(.)"/>

The expression on the left hand side of the exclamation mark is evaluated. Then, for each item that results, the expression on the right hand side is evaluated with the item from the left as the context item.

Finally, we have “=>“ which may be a little easier to explain after an example. Suppose you have $path which contains the fully qualified name of a file, such as /Users/ndw/Documents/test.xml. For some applications, you might like to get the “base name” of the file, that is, the part after the path and before the extension. There are a number of ways to do this. For this example, we’ll use substring before and after:

<xsl:sequence
  select="substring-before(substring-after($path, '/Documents/'), '.xml')"/>

With the “=>” operator, you can chain the calls together like this:

<xsl:sequence select="$path => substring-after('/Documents/')
                            => substring-before('.xml')"/>

The expression on the left hand side of the operator is applied to the function on the right hand side as the first argument to the function.

All of these operators allow you to write more compact expressions. Whether this is an aid to comprehension or a hindrance is going to depend partly on the experiences of the folks who read your code, even if that’s only you in six months.

What else?

There are other features in XSLT 3.0 that are going to make some stylesheets simpler and easier to write: there are facilities for splitting and merging sequences, functions for transforming between XML and JSON, more flexible ways to copy content, and more. Higher order functions greatly simplify some kinds of problems, especially for developers of stylesheet frameworks. And, as noted in the introduction, this paper ignores both the wide range of new streaming features and packages. There’s a lot in there!

Appendix A. Appendix

This should all be done in XProc. The DocBook XSLT 2.0 Stylesheets perform a series of transformations in XProc 1.0. The DocBook xslTNG stylesheets also perform a series of transformations but there wasn’t time to complete this paper, the stylesheets, and my XProc 3.0 implementation before the conference.

I wasn’t motivated to do the XProc implementation in XProc 1.0, so I’ve taken the expedient approach of tying together the series of transformations in XSLT. This works, but it does have some odd consequences. Partly, perhaps, because I want users to apply the stylesheets in what I consider the usual way: apply the stylesheet to a document.

There are other ways to begin transformations in XSLT 3.0. It’s possible, for example, that requiring the user to specify the global context item and the initial template explicitly would simplify the interface. But that would require even causal stylesheet users to be familiar with these concepts and how to use them in their processing environments.

The approach taken is this:

  1. There are two templates in no mode: one for * and one for /. The one for * immediately defers processing to the one for /; it exists mostly because some of the XSpec tests begin at an element so the template for the document root doesn’t match.

  2. This first template runs the input document through a series of normalizing transformations: cleaning up the logical structure (removing insignificant syntactic variation from the source DocBook documents), dealing with transclusion, profiling, annotations, and XLink linkbases.

  3. This normalized document is passed on to the “main” template which makes several more transformations in different modes.

It’s all a bit messy. Suggestions for improvement most welcome.

References

[TR9502] CALS Table Model Document Type Definition (https://www.oasis-open.org/specs/tm9502.html). SGML Open Technical Memorandum TM 9502:1995. Table Interchange Subcommittee. Harvey Bingham, editor.

[Pygments] Pygments: Python Syntax Highlighter (https://pygments.org/). Version 2.6.1.

[XSLT30] XSL Transformations (XSLT) Version 3.0 (http://www.w3.org/TR/xslt-30/). Michael Kay, editor. W3C Recommendation. 8 June 2017.

Norman Walsh

Norm Walsh is a Senior Software Developer at Saxonica. He has also been an active participant in international standards efforts at both the W3C and OASIS. At the W3C, Norm was chair of the XML Processing Model Working Group, co-chair of the XML Core Working Group, and an editor in the XQuery and XSLT Working Groups. He served for several years as an elected member of the Technical Architecture Group. At OASIS, he was chair of the DocBook Technical Committee for many years and is the author of DocBook: The Definitive Guide. Norm has spent more than twenty years developing commercial and open source software.