SaxonJS 3 coding improvements

Debbie Lockett

Abstract

Many of the new SaxonJS 3 features have been developed in response to issues that were originally raised by SaxonJS users. "Here's a problem I'm trying to solve. This is what I can do. But what about X? How can I do it with SaxonJS?"

Sometimes the SaxonJS 2 solutions may be somewhat unsatisfactory - yes we can code that; but the code isn't especially "pretty", or intuitive, or easy to write... or perhaps there remain limitations...

In some cases, there was an actual limitation in the SaxonJS 2 processor: for instance, it may not be possible to do what the user wants with XSLT and IXSL alone, so that a JavaScript solution has to be integrated. In other cases, the SaxonJS 2 solution is just rather complicated, and requires putting features together in an unfamiliar way.

In this paper we will look at how new SaxonJS 3 IXSL features can be used to write much cleaner solutions for some problems which were tricky with SaxonJS 2.

Introduction

SaxonJS 3 has been in development for a number of years. The first SaxonJS 3.0 beta release was made available in December 2024. SaxonJS 3 includes a range of changes from the previous major release. Some of the changes are purely internal, and present no difference to the user. But there are also major new features, as well as more minor improvements, with changes in the SaxonJS JavaScript API and new IXSL syntax.

The first beta release of SaxonJS 0.9 was announced at the Balisage 2016 conference and released in July 2016 [Lockett 2016]. The first major release SaxonJS 1.0, an XSLT 3.0 [XSLT 3.0] run-time written in pure JavaScript to run in browsers, came out in February 2017. SaxonJS 1 is a run-time XSLT processor, which executes Stylesheet Export Files (SEFs) compiled using Saxon. The next major release SaxonJS 2.0 followed in June 2020, running on Node.js as well as in the browser, and also providing an alternate SEF compiler on Node.js [Kay 2019].

SaxonJS itself supersedes Saxon-CE, the previous XSLT 2.0 Saxon processor for the browser. Saxon-CE introduced various XSLT extension instructions and functions to enable interactivity in the browser to be programmed directly from XSLT without having to drop down into JavaScript [Delpratt 2013]. These interactive XSLT (IXSL) extensions have been further developed for SaxonJS.

In this paper, we will look at the development of certain new IXSL features for SaxonJS 3 [SaxonJS 3], and how these have been motivated by feedback from users of SaxonJS 2.

When designing new language features, one of the main concerns to consider, alongside functionality, is usability. A language should include features that users want to use, providing ways to do things that are both powerful and intuitive. New syntax should make sense to the user, by fitting in with the existing syntax, and perhaps by being similar to existing solutions in other languages. As well as solving specific problems, designers must consider the wider picture, and provide enough flexibility to enable greatest use. In general, the actual implementation of the features is a lesser concern, or perhaps should not even be a concern at all. In practice of course, languages are often designed by the implementors themselves, and so the implementation can have an impact on design. As implementors as well as designers, we will consider the design of the underlying JavaScript APIs (and their capabilities and limitations), and how new features fit in with existing SaxonJS APIs when developing new language features.

One important aspect of implementation specific to SaxonJS, which influences the design of new features, is the fact that we have two compilers for SEFs. This means that IXSL syntax changes are less favourable, because making them available means implementing changes in Saxon (for XJ-compiling) as well as SaxonJS (for XX-compile and run time). Not only is this more work for us as implementors; but it becomes more troublesome to make features available, and for users to keep track of which versions of Saxon and SaxonJS provide the features. So, where possible, we try to limit IXSL syntax changes to major SaxonJS releases, and group them for a specific Saxon release.

One such major change for SaxonJS 3 is the introduction of promise-based asynchrony with ixsl:promise (see [Lockett 2023] and [Kay 2020]). As well as the new IXSL instruction, this involves a number of new IXSL functions. These are implemented in the XJ-compiler from Saxon 12.5, enabled by specifying that a stylesheet is to be exported for use with SaxonJS 3 by using the -target:JS3 command line option.

This paper focuses on "smaller" improvements for SaxonJS. We will consider three particular examples of problems arising from user feedback, and the solutions we now provide with SaxonJS 3. We will look at what influenced the design of new features in each case, as well as how they are used.

Example 1

Combining the results of processing multiple source documents

Is there a way I can get a bunch of document nodes via ixsl:schedule-action (or some other means), process all of them, and use the result of the processing in a single result document?

This is an example of a problem which is possible to solve with SaxonJS 2, but the solution, and how it works in practice, is not entirely intuitive. New language features in SaxonJS 3 allow a cleaner solution.

Referring to the SaxonJS documentation [SaxonJS 3], the user had understood that the standard code pattern when asynchronously accessing documents using ixsl:schedule-action is to retrieve a file from a URI, apply some processing, and write the result back out via xsl:result-document. But they were interested in combining the results of processing multiple documents, and wondered how this could be done.

With SaxonJS 2 this is possible, but the solution, and how it works, may not be obvious. See below for a code example:

                  
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:ixsl="http://saxonica.com/ns/interactiveXSLT"
  xmlns:xs="http://www.w3.org/2001/XMLSchema" 
  version="3.0" expand-text="yes" exclude-result-prefixes="#all">
  
  <xsl:param name="href1" as="xs:string"/>
  <xsl:param name="href2" as="xs:string"/>
  <xsl:param name="output-file" as="xs:string"/>
 
  <xsl:variable name="doc-uris" select="($href1, $href2)" as="xs:string*"/>
  <xsl:variable name="docs-string" select="string-join($doc-uris, ' ')" as="xs:string"/>
  
  <xsl:template name="xsl:initial-template">
    <root>
      <p>Fetching documents from <xsl:value-of select="$docs-string"/></p>
      <ixsl:schedule-action document="{$docs-string}">
        <xsl:call-template name="action"/>
      </ixsl:schedule-action>
    </root>
  </xsl:template>
  
  <xsl:template name="action">
    <!-- This template is called once for each document fetch. 
      But we want to know when ALL documents have been fetched. 
      So check for this, and only do subsequent processing when all documents are available. -->
    <xsl:variable name="docsAvailable" select="$doc-uris ! doc-available(.)" as="xs:boolean*"/>
    <xsl:variable name="docsAllAvailable" select="not($docsAvailable = false())" as="xs:boolean"/>
    <xsl:if test="$docsAllAvailable">
      <xsl:result-document href="{$output-file}">
        <out>
          <xsl:for-each select="$doc-uris">
            <wrapper for="{.}"><xsl:sequence select="doc(.)"/></wrapper>
          </xsl:for-each>
        </out>
      </xsl:result-document>
    </xsl:if>
  </xsl:template>
  
</xsl:stylesheet>

The part of this solution which is necessary, but not obvious, is the conditional in the "action" template which checks that all documents are available. This is necessary due to a quirk of what actually happens when the ixsl:schedule-action instruction is used for fetching multiple documents (as specified using the space separated list of URIs in the document attribute). The "action" template is called once for each document fetch, not when ALL documents have been fetched (and, frustratingly, there is no way to know which document fetch has triggered the template). Internally, a set of asynchronous document fetches has been fired in parallel, and for each of these the subsequent action is to call the "action" template. In order to only process the documents when all of them have been fetched, it is necessary to check for that condition in the "action" template. One way of doing that is shown in the example, where $docsAvailable is a sequence of booleans which records whether or not each document is available at the point the template was called, and $docsAllAvailable is only true if none of the booleans in the sequence $docsAvailable is false, i.e. all documents have been fetched.
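The once-per-fetch behaviour described above can be imitated in plain JavaScript. The following is only a rough simulation, not SaxonJS code: scheduleFetches, makeAction, and the Map standing in for the document pool are hypothetical names introduced for illustration, and fetchOne is any function returning a promise for a document.

```javascript
// Hypothetical simulation (not SaxonJS internals): every fetch completes on its
// own and triggers the same action callback, as described for ixsl:schedule-action.
function scheduleFetches(uris, fetchOne, action) {
  const pool = new Map(); // stands in for the document pool
  const pending = uris.map((uri) =>
    fetchOne(uri).then((doc) => {
      pool.set(uri, doc);
      action(pool); // called once per completed fetch, in completion order
    })
  );
  return Promise.all(pending);
}

// Builds an action that only does the combined processing once every URI is
// present in the pool, mirroring the $docsAllAvailable test in the stylesheet.
function makeAction(uris, onAllAvailable) {
  return (pool) => {
    const allAvailable = uris.every((uri) => pool.has(uri));
    if (allAvailable) {
      onAllAvailable(uris.map((uri) => pool.get(uri)));
    }
  };
}
```

With two URIs, the action runs twice, but the combined processing runs only on the second call, whichever fetch happens to finish last.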

So, what about with SaxonJS 3? One of the major changes for SaxonJS 3 is the introduction of ixsl:promise as an alternative to ixsl:schedule-action. This is intended to be an improvement on the previous mechanism for asynchronous processing firstly because it extends the capabilities, making it possible to call more different kinds of asynchronous processes; and secondly because it is more closely aligned with the modern JavaScript mechanism for asynchronous code, which is to use Promises, making them easier to integrate and hopefully easier to use (especially if you are already familiar with using JavaScript Promises).

This is an example where the richer language, and extensions available for use with promises, means that the code can be more precise. See below for a SaxonJS 3 version of the solution:

                  
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:ixsl="http://saxonica.com/ns/interactiveXSLT"
  xmlns:xs="http://www.w3.org/2001/XMLSchema" 
  xmlns:f="http://local.functions/"
  version="3.0" expand-text="yes" exclude-result-prefixes="#all">
  
  <xsl:param name="href1" as="xs:string"/>
  <xsl:param name="href2" as="xs:string"/>
  <xsl:param name="output-file" as="xs:string"/>

  <xsl:variable name="doc-uris" select="($href1, $href2)" as="xs:string*"/>
  
  <xsl:template name="xsl:initial-template">
    <ixsl:promise select="ixsl:all([ixsl:doc($href1), ixsl:doc($href2)])"
      on-completion="f:go#1"
      on-failure="f:fail#1"/>
    <root>
      <p>Fetching documents from: <xsl:value-of select="$doc-uris"/></p>
    </root>
  </xsl:template>
  
  <xsl:function name="f:go" ixsl:updating="true">
    <xsl:param name="docs" as="array(document-node())"/>
    <xsl:result-document href="{$output-file}">
      <out>
        <wrapper for="{$href1}"><xsl:sequence select="$docs(1)"/></wrapper>
        <wrapper for="{$href2}"><xsl:sequence select="$docs(2)"/></wrapper>
      </out>
    </xsl:result-document>
  </xsl:function>
  
  <xsl:function name="f:fail" ixsl:updating="true">
    <xsl:param name="err" as="map(*)"/>
    <xsl:result-document href="{$output-file}">
      <out>Document not available: {$err?message}</out>
    </xsl:result-document>
  </xsl:function>
  
</xsl:stylesheet>

Here ixsl:promise is used to initiate the promise-based processing for the asynchronous document fetches. As specified in the select attribute, the promise is only settled when both documents are fetched. The ixsl:doc() function is used to specify a promise to fetch a document. The ixsl:all() function is used to specify that all input promises (supplied as an array) must be fulfilled. For this concurrency function, the promise rejects if any of the documents cannot be fetched. Other promise concurrency functions are also available, which enable different handling depending on precisely what you want: for example, ixsl:all-settled() settles when all the input promises are settled (either fulfilled or rejected), which is useful if you would instead always like to know the result of each promise.

This example also demonstrates other benefits of ixsl:promise over ixsl:schedule-action:

The result from the resolved promise is more clearly passed to the subsequent processing via the argument for the f:go() function. With ixsl:schedule-action we rely on different methods to pass the result of the asynchronous action to the subsequent processing - for fetched documents you use doc(), while for HTTP requests the HTTP response is passed to the called template as the context item. With ixsl:promise, the result is always passed as the first argument to the on-completion handler function.
Handling errors during the asynchronous processing was not even part of the ixsl:schedule-action example solution above (as it stands, if any document could not be fetched, you would just get no secondary result document, because $docsAllAvailable would never be true). You could handle this case by adding further code in the "action" template, but it is not trivial. Meanwhile with ixsl:promise, it is very easy to add code for error handling, using the on-failure attribute.
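For readers less familiar with JavaScript Promises, the difference between the two concurrency functions mentioned above can be seen with the standard Promise.all() and Promise.allSettled() methods. This sketch assumes that ixsl:all() and ixsl:all-settled() behave like their JavaScript namesakes; fetchDoc and compare are hypothetical helpers, and no SaxonJS code is involved.

```javascript
// fetchDoc is a hypothetical stand-in for a document fetch: it rejects for any
// URI that is not a key of the `available` Map.
function fetchDoc(uri, available) {
  return available.has(uri)
    ? Promise.resolve(available.get(uri))
    : Promise.reject(new Error("Document not available: " + uri));
}

async function compare(uris, available) {
  const promises = () => uris.map((uri) => fetchDoc(uri, available));

  // Promise.all: fulfils with all results, or rejects if any input rejects.
  const all = await Promise.all(promises()).then(
    (docs) => ({ status: "fulfilled", docs }),
    (err) => ({ status: "rejected", message: err.message })
  );

  // Promise.allSettled: always fulfils, with one outcome record per input.
  const settled = await Promise.allSettled(promises());

  return { all, settled: settled.map((outcome) => outcome.status) };
}
```

When one of two documents is missing, the Promise.all() branch reports a single rejection, while the Promise.allSettled() branch reports one fulfilled and one rejected outcome.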

Example 2

Supplying HTTP request headers when fetching documents

How can I supply HTTP request headers for a document fetch, which is then accessed using doc()?

This example demonstrates a small gap in the capabilities of SaxonJS 2. In the end, the solution is a relatively simple update for SaxonJS 3. But it provides an opportunity to look at the options available for fetching documents, and the process of developing new features.

If the user knows the documents to be fetched up front, then they could use the SaxonJS.transform() option documentPool with documents preloaded using SaxonJS.getResource(). Headers can be set when asynchronously loading documents with SaxonJS.getResource(), and the documents in the documentPool can then be accessed using doc(). However this mechanism does not help in the case that the document URIs are only known dynamically.
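The preloading approach just described might look roughly like the following on the JavaScript side. This is a sketch under assumptions rather than a definitive recipe: the function name, the SEF location, and the header values are hypothetical, and the getResource option names (location, type, headers) should be checked against the SaxonJS documentation. The SaxonJS object is passed in as a parameter so that the sketch does not depend on how SaxonJS is loaded.

```javascript
// `saxon` is expected to be the SaxonJS global object. The option names used
// for getResource are assumptions to verify against the SaxonJS documentation.
async function transformWithPreloadedDocs(saxon, docUris, headers, sefLocation) {
  const documentPool = {};

  // Fetch every document up front, sending the same request headers each time.
  await Promise.all(
    docUris.map(async (uri) => {
      documentPool[uri] = await saxon.getResource({
        location: uri,
        type: "xml",
        headers,
      });
    })
  );

  // Documents in the pool can then be read in the stylesheet with doc().
  return saxon.transform(
    { stylesheetLocation: sefLocation, documentPool },
    "async"
  );
}
```

A call might look like transformWithPreloadedDocs(SaxonJS, [uri1, uri2], {Accept: "text/xml"}, "main.sef.json"). The keys of the pool presumably need to match the URIs that doc() resolves to in the stylesheet, so absolute URIs are the safer choice.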

SaxonJS 2 provides two methods for asynchronously loading documents with ixsl:schedule-action. The simplest method is to use the document attribute, supplying a document URI (or space separated list of URIs) as the value. The fetched document(s) can then be accessed in subsequent templates using the doc() function. In its simplest form this looks something like:

                  
<xsl:template name="fetch-document">
   <ixsl:schedule-action document="{$docURL}">
      <xsl:call-template name="render-document"/>
   </ixsl:schedule-action>
</xsl:template>

<xsl:template name="render-document">
   <xsl:result-document href="#target">
      <xsl:apply-templates select="doc($docURL)"/>
   </xsl:result-document>
</xsl:template>

If the user wants to do more than a straightforward document fetch, then they can use the http-request attribute to fully define their HTTP request (for example, to supply request headers). If the HTTP response body is a document, this can be accessed in the subsequent templates from the XDM map representation of the HTTP response which is passed as the context item, for example:

                  
<xsl:variable name="request" select="
   map{
      'method': 'GET',
      'href': $docURL, 
      'header': map{'Accept': 'text/xml'}
      } "/>
      
<xsl:template name="fetch-document">
   <ixsl:schedule-action http-request="$request">
      <xsl:call-template name="handle-response"/>
   </ixsl:schedule-action>
</xsl:template>

<xsl:template name="handle-response">
   <xsl:context-item as="map(*)" use="required"/>
   <xsl:for-each select="?body">
      <xsl:result-document href="#target">
         <xsl:apply-templates select="."/>
      </xsl:result-document>
   </xsl:for-each>
</xsl:template>

However, it is not possible to combine these methods with SaxonJS 2: there is no way to specify HTTP request headers, dynamically load documents asynchronously, and then pass them to templates that use doc().

The reason for wanting to be able to use doc() is for code reuse. As was the case for this example, users of SaxonJS will often want to use pre-existing XSLT stylesheets, and not have to make vast changes to run them with SaxonJS. Stylesheets do need to be adapted, in particular to use asynchronous document fetches, but making this easier for users is certainly desirable. So this seems like a reasonable feature to add: it would be useful to give users more control for their document fetches, when they want it.

So, how shall we do this for SaxonJS 3? We started off by asking ourselves: is the implementation not doing something it should? For instance, is it reasonable for users to expect any resources fetched using HTTP requests to be added to the document pool? Perhaps this is simply something that we should be doing, but we are not.

This is worth considering, but since not all HTTP requests are used to fetch resources, it is not clear exactly when we should be adding to the document pool. Also, following the distinction in XPath between accessing XML documents as document nodes and other resources as string representations (the difference between the doc() and unparsed-text() functions), in SaxonJS we have two distinct resource pool caches: one for XML documents, and one for other text resources. (So as well as the SaxonJS.transform() option documentPool, there is in fact also another option textResourcePool to supply other preloaded resources.) For HTTP responses, it will not always be obvious which pool we should be adding to. So we would like to give the user responsibility for specifying which pool they wish to use.

There are different ways we could have solved this for SaxonJS 3. The user suggested adding a new IXSL instruction which could be used to add resources to the local cache. But as noted above (see section “Introduction”), implementing IXSL syntax changes is somewhat awkward, since it involves implementing these changes in two compilers as well as the SaxonJS run-time, so alternate solutions are preferable. The solution we have settled on is to add an option 'pool' for HTTP requests, which specifies that the body content of a successful response should be added to the internal cache in one of the resource pools: 'xml' to add to the documentPool (which means it can be accessed using the doc() function) or 'text' to add to the textResourcePool. Although this is part of the IXSL syntax, the HTTP client is implemented using XDM maps to represent the request and response messages, and adding entries to the maps does not require any changes in the compile time implementations, making this a much simpler fix.

In SaxonJS 3 this is available for HTTP requests specified using either ixsl:schedule-action/@http-request, or with the promise-based function ixsl:http-request() used with ixsl:promise. For example:

                  
<!-- Add the 'pool' entry to the HTTP request map to direct SaxonJS 
to add the HTTP response body to its documentPool. -->
<xsl:variable name="request" select="
   map{
      'method': 'GET',
      'href': $docURL, 
      'header': map{'Accept': 'text/xml'},
      'pool': 'xml'
      } "/>
      
<xsl:template name="fetch-document">
   <ixsl:promise select="ixsl:http-request($request)"
      on-completion="f:go#1"/>
</xsl:template>

<xsl:function name="f:go" ixsl:updating="yes">
   <xsl:param name="response" as="map(*)"/>
   <xsl:result-document href="#target">
      <xsl:apply-templates select="doc($docURL)"/>
   </xsl:result-document>
</xsl:function>

Example 3

Supplying a JSON map as the argument to a JavaScript method

How do I supply a JSON map as the argument for a call to a method on a JavaScript object?

This example demonstrates a fundamental challenge for users of SaxonJS: passing objects across the boundary between XDM and JavaScript. Many difficulties arise because of the different object models in XML languages versus JavaScript. This has also been an on-going headache for implementors! In trivial cases, a SaxonJS user should not have to think too much about conversions from XDM to JavaScript, and vice versa. The conversions for strings, booleans, and numbers are straightforward. However, once you get to slightly more complex values, the difference in the data models becomes apparent, and it becomes more necessary for the user to be aware of how SaxonJS converts objects between the data models.

The standard conversions between XDM and JavaScript that SaxonJS uses have been designed to try to be a best fit for general use. For instance, if you are calling a JavaScript method from within XSLT that expects a string to be supplied, then it is intuitive and convenient to supply an XDM string. Internally, SaxonJS converts the XDM string to a JavaScript string. Similarly, if the method returns a JavaScript boolean, then it makes sense for SaxonJS to internally convert this to an XDM boolean as the returned item. Alternatively, if the method returns a general JavaScript object, which does not have a simple equivalent on the XDM side, then the SaxonJS internal conversion works by wrapping this JavaScript object in a wrapper which is "safe" on the XDM side. With SaxonJS, we use the JSValue wrapper, and internally treat this as a valid item type in XDM. In the other direction, if an XDM value is passed to the JavaScript side which does not have a simple equivalent, then the SaxonJS internal conversion works by wrapping this XDM value in an XDMValue wrapper. (Note that since SaxonJS is written in JavaScript, internally we actually represent all XDM values using JavaScript objects.)

In this example, the user wanted to be able to call element.scrollIntoView({behavior: "smooth", block: "start", inline: "nearest"});. They were able to use the ixsl:call() function for calling methods on JavaScript objects, e.g. ixsl:call($targetElement, 'scrollIntoView', []), but were not sure how to pass the ScrollIntoViewOptions in the array of arguments.

Intuitively, users may expect to be able to supply an XDM map, but this will not work as expected, because the XDM to JS conversion in SaxonJS does not automatically convert XDM maps to JSON maps. Instead, if an XDM map is supplied here, it will be converted to an XDMValue-wrapped XDM map object in JavaScript, which the scrollIntoView JavaScript method simply ignores.

With SaxonJS 2, no XDM value is converted to a JSON map. Workarounds are possible if you instead construct the JSON map from the JavaScript side: for instance, a JavaScript variable whose value is a JSON map can be passed to the XSLT as a stylesheet parameter, and from within the XSLT, this is treated as a JSValue-wrapped JavaScript object.

What is missing in SaxonJS 2 is a way to construct JSON objects directly from the XSLT side, which avoids the usual SaxonJS conversion from XDM to JS. In SaxonJS 3, the ixsl:json-parse() function is provided which plugs this gap. For example:

                  
<xsl:variable name="targetElement" select="ixsl:page()//div[@id eq 'target']"/>
<xsl:variable name="json-string" as="xs:string">
    {"behavior": "smooth", "block": "start", "inline": "nearest"}
</xsl:variable>
<xsl:sequence select="ixsl:call($targetElement, 'scrollIntoView', [ixsl:json-parse($json-string)])"/>

Note that here we provide the JSON text as a string in a variable to avoid needing to use any character escapes (e.g. for '"'). Users will also find this makes things easier!

The ixsl:json-parse() function parses JSON text supplied in an XDM string, and returns the resulting JavaScript object in a JSValue wrapper. In the example above, when this is passed as an argument for the scrollIntoView() function, SaxonJS removes the JSValue wrapper, and passes the JSON map to the JavaScript function as desired, avoiding any other internal XDM-JS conversion.
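To see why a wrapped map is ignored while a parsed JSON object works, it may help to look at how a JavaScript method typically reads an options argument. In this sketch, WrappedXdmMap and readScrollOptions are hypothetical stand-ins: the class is not the real XDMValue wrapper, and the function only imitates the way scrollIntoView reads its options, using that method's default values.

```javascript
// Hypothetical stand-in for a wrapped XDM map: the entries live inside the
// wrapper, not as plain properties on the object itself.
class WrappedXdmMap {
  constructor(entries) {
    this.value = new Map(entries);
  }
}

// Imitates how a method like scrollIntoView reads its options: it looks for
// plain properties and falls back to defaults when they are absent.
function readScrollOptions(options) {
  return {
    behavior: options?.behavior ?? "auto",
    block: options?.block ?? "start",
    inline: options?.inline ?? "nearest",
  };
}

const wrapped = new WrappedXdmMap([["behavior", "smooth"], ["block", "end"]]);
const parsed = JSON.parse('{"behavior": "smooth", "block": "end"}');

// The wrapper exposes no `behavior` or `block` property, so defaults apply.
const fromWrapped = readScrollOptions(wrapped);
// The parsed object is a plain JavaScript object, so its properties are seen.
const fromParsed = readScrollOptions(parsed);
```

Here fromWrapped ends up with only the default values, while fromParsed carries the requested "smooth" and "end" settings.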

As well as the ixsl:json-parse() function, for SaxonJS 3 we have made further changes to make it easier for users to pass the objects they want across the boundary between XDM and JavaScript. The design of the SaxonJS internal conversions between XDM and JavaScript is known to cause difficulties: it will not always do what you want. For instance, it can cause problems when dealing with JavaScript arrays, where converting to XDM and back again does not necessarily return the original object. Another new function provided in SaxonJS 3 is ixsl:new(), which can be used to construct new JavaScript objects directly from the XDM side. For example, ixsl:new('Array', []) constructs an empty JavaScript array in a JSValue wrapper. Furthermore, a number of existing IXSL functions which pass values from XDM to JavaScript and back now take a new argument. This argument accepts an XDM map which can have boolean entries convert-args and convert-result, allowing users to control the conversion of arguments and results across this boundary, and to avoid the internal conversion if they want.

Conclusions

In this paper we have presented a few of the new features in SaxonJS 3 that have been developed from SaxonJS 2 feedback. The new IXSL extensions provide useful additions, and allow for cleaner solutions with SaxonJS 3 for problems which were more challenging to solve with SaxonJS 2. As well as presenting how the new features can be used, we looked at the details behind the problems in SaxonJS 2, and the design and development of the new features. Hopefully this provides a useful introduction to these features and how they can be used, as well as a look "behind the scenes" at the work of designing and implementing SaxonJS.

References

[Delpratt 2013] O'Neil Delpratt and Michael Kay. Multi-user interaction using client-side XSLT. [online] XML Prague 2013 proceedings, pp1–22. https://archive.xmlprague.cz/2013/files/xmlprague-2013-proceedings.pdf

[Kay 2019] Michael Kay and John Lumley. An XSLT compiler written in XSLT: can it perform? [online] XML Prague 2019 proceedings, pp223-254. https://archive.xmlprague.cz/2019/files/xmlprague-2019-proceedings.pdf

[Kay 2020] Michael Kay. Asynchronous XSLT. Balisage: The Markup Conference 2020 proceedings. doi:https://doi.org/10.4242/BalisageVol25.Kay01

[Lockett 2016] Debbie Lockett and Michael Kay. Saxon-JS: XSLT 3.0 in the Browser. Balisage: The Markup Conference 2016 proceedings. doi:https://doi.org/10.4242/BalisageVol17.Lockett01

[Lockett 2023] Debbie Lockett. Asynchrony with Promises in SaxonJS. [online] Presented at Declarative Amsterdam 2023. https://declarative.amsterdam/resources/da/slides/da.2023.lockett.asynchrony.saxonjs.slides.pdf

[XSLT 3.0] Michael Kay, editor. XSL Transformations (XSLT) Version 3.0. World Wide Web Consortium, 19 November 2015. [online] https://www.w3.org/TR/xslt-30/

[SaxonJS 3] SaxonJS 3 documentation. [online] https://www.saxonica.com/saxonjs/documentation3/index.html

Author's keywords for this paper:

SaxonJS; XSLT

Debbie Lockett

Saxonica

`<debbie@saxonica.com>`

Debbie Lockett joined Saxonica back in early 2014 in the days of Saxon 9.6; when XPath 3.0 and XQuery 3.0 were brand new, and XSLT 3.0 was approaching "last call working draft" status. She had no idea what any of these things meant, and has learned everything she knows about software development and XML technologies while at Saxonica. Debbie previously worked as a post-doctoral researcher in Pure Mathematics at the University of Leeds, writing papers on symmetries of infinite relational structures. Debbie has worked on SaxonJS since its inception in 2016, and is now a lead developer.

Balisage: The Markup Conference

Balisage Paper: SaxonJS 3 coding improvements

Balisage Series on Markup Technologies