Lockett, Debbie, and Michael Kay. “Saxon-JS: XSLT 3.0 in the Browser.” Presented at Balisage: The Markup Conference 2016, Washington, DC, August 2 - 5, 2016. In Proceedings of Balisage: The Markup Conference 2016. Balisage Series on Markup Technologies, vol. 17 (2016). https://doi.org/10.4242/BalisageVol17.Lockett01.
Debbie Lockett joined the Saxonica development team in 2014 following
post-doctoral research in Pure Mathematics at the University of Leeds. Her Ph.D
and further research generally involved symmetries of infinite relational
structures. Since moving into the "real" world of software development at
Saxonica, Debbie has worked on performance benchmarking, developing the tools
for creating Saxonica's product documentation, and the implementation of XQuery
3.1 features, as well as the development of Saxon-JS.
Michael Kay has been developing the Saxon product since 1998, initially as a
spare-time activity at ICL and then Software AG, but since 2004 within the
Saxonica company which he founded. He holds a Ph.D from the University of
Cambridge where he studied databases under the late Maurice Wilkes, and spent 24
years with ICL, mainly working on the development of database software. He is
the editor of the W3C XSLT specification.
In this paper, we introduce Saxon-JS, an XSLT 3.0 run-time written in pure
JavaScript. We've effectively split the Saxon product into its compile-time and
run-time components. The compiler runs on the server, and generates an intermediate
representation of the compiled and optimized stylesheet in a custom XML format.
Saxon-JS, running on the browser, reads in the compiled stylesheet and executes it.
We describe some particular features of Saxon-JS: the event-handling extensions to
the XSLT language (as used for Saxon-CE), the way that XSLT and JavaScript can
interwork, conformance to the W3C XSLT and XPath specifications, and some details
of
the internal implementation.
XSLT in the browser always had promise. During the period that XSLT 1.0 was under
development, many people thought of it as primarily a client-side technology, and
for
some people, its subsequent success as a server-side technology was a surprise. (The
same thing can be said of Java.) The promise of XSLT in the browser has never been
fulfilled, but the potential benefits are still there. The objectives of separating
content from presentation, and of handling the presentation and user interface using
declarative technologies, remain as strong as they ever were.
XSLT 1.0 in the browser failed to take off largely because it required every browser
to support it. It did get to the point (around 2006) where all the main desktop browsers
had usable and interoperable XSLT 1.0 support, but at just about the same time mobile
browsers started their journey to stardom, and XSLT was one of the first things to
be
dropped in the interests of saving memory footprint. And at the same time, the web
had
changed. (Remember "Web 2.0" and AJAX?) The old model where a web page was something
that the software rendered and the user perused had gone: everything was interactive,
and XSLT had failed to ride the wave.
By the time XSLT 2.0 emerged in 2007, the browser market had fragmented. None of the
browser vendors wanted to upgrade their XSLT processors to 2.0 because there were
no
XSLT 2.0 applications that needed it, and no-one wanted to write XSLT 2.0 applications
until browser support was forthcoming - not just from one browser, but across the
whole
range.
Meanwhile JavaScript was maturing. Implementations were getting faster, the language
was getting richer, portability across browsers was improving, and frameworks like
jQuery were starting to emerge to take some of the pain out of low-level DOM
programming.
Around 2011 Saxonica decided to produce a client-side XSLT 2.0 engine to break this
logjam. It would compile to JavaScript so it could run in any browser with decent
JS
support, and it would have an event-based processing model so it could handle user
interaction as well as static page rendering. The result was Saxon-CE (CE for "client
edition") [Saxon-CE]. It was built by stripping down the Saxon Java
product to its bare essentials, compiling it to JavaScript using Google's GWT
cross-compiler, and adding extensions to the language to handle user events and other
aspects peculiar to the browser environment.
Saxon-CE generated a lot of interest, but there were some serious obstacles to
adoption. From a user point of view, the size of the JavaScript (nearly 1MB) meant that
loading up an application was always going to take a few seconds. From Saxonica's
perspective, the fact that we had to fork the Java product to cut the size down meant
that ongoing development was going to be expensive; in addition, we found that testing
new releases was a nightmare, because all the testing had to be done within the browser
(GWT code runs in the browser only). The total dependency on GWT left us exposed (when
something worked on one browser but failed on another there was absolutely nothing
we
could do about it). And commercially, we hit the same problem that so many other XSLT
vendors have struggled with: how do you justify continued investment in a technology
that is bringing in no revenue because users expect to get it for free?
So we decided to try again, but taking a different approach. The result is Saxon-JS,
which we describe in this paper.
The Architecture of Saxon-JS
Saxon-JS is an XSLT 3.0 run-time library written in pure JavaScript. We've effectively
split the Saxon product into its compile-time and run-time components (see Figure 1). The compiler runs on the server, and generates an
intermediate representation of the compiled and optimized stylesheet in a custom XML
format. We call this the "stylesheet export file" (or "SEF" for short). It's the same
compiler whether you want to execute in the browser or on the server. Saxon-JS, running
on the browser, reads in the stylesheet export file and executes it.
Because it only handles the run-time, Saxon-JS is much smaller than Saxon-CE (while
the Saxon-CE JavaScript file is around 900KB, Saxon-JS is less than 650KB, and minified
it is only 220KB), and so we've been able to add a lot of the useful XSLT 3.0 features
like support for maps, arrays, try/catch, and JSON. Being pure JavaScript, we can
target
non-browser environments like Node.js as well as the browser itself. We can modularize
the code so that large but not-always-used features like format-date() only
get pulled in if they are actually used. And we've got vastly more options for testing:
all the heavyweight testing (the 10,000 W3C test cases for XSLT 3.0, and the 20,000
test
cases for XPath 3.1) can be done on a server-side engine like Nashorn or Node.js,
leaving only the interactive capabilities to be tested in the browser. Finally, because
our code is now human-readable JavaScript rather than machine-generated, running it
under the excellent debugging tools found in modern browsers becomes feasible.
In the following sections we describe some particular features of Saxon-JS that may
be
of interest. First we will look at the event-handling extensions to the XSLT language.
Then we will examine the way that XSLT and JavaScript can interwork. We will then
describe how Saxon-JS stacks up against the W3C XSLT and XPath specifications; and
finally we'll point out a few interesting aspects of the internal implementation.
Event Handling in Saxon-JS
Saxon-JS has all the same event-handling machinery as Saxon-CE, so it can be used
to
write fully interactive applications using XSLT's declarative programming model.
The essential insight here is that the rule-based programming paradigm [Rule-Based Programming], which was introduced into XSLT because it's such an effective way
of handling semi-structured data, is essentially identical to the event-based processing
model that has become universal in writing interactive user interfaces. It's based
on
the idea of writing a program as a set of rules each containing a condition under
which
the rule fires, and an action to be performed when the rule is triggered; rules are
designed as far as possible to be independent of each other, and the order of execution
is determined by the order in which events occur, not by anything hard-coded by the
programmer.
In XSLT the "condition" part of a rule is in two parts: the match
attribute is a pattern that describes which XML elements are eligible for processing
by
this rule, and the mode attribute names a phase of processing during which
the rule is active. In the interactive XSLT processing model used by Saxon-CE and
now
Saxon-JS, we use these same two components: the mode attribute now
describes a user-interface event that has occurred (for example, a mouse click), and
the
match attribute identifies the element where it occurred. So a rule
defining what happens when a user clicks on a button might look like this:
The mode names (onclick in this example) reflect the event names in the
JavaScript event model, and the element names in the match attribute are the names of
HTML elements where the event occurred.
In XSLT 3.0 the fallback rules for what happens when there is no explicit rule that
matches an input event can vary from one mode to another. For interactive events such
as
onclick, the natural rule is to "bubble" the event up to containing
(ancestor) elements: if no onclick event has been defined for a particular
element, but an onclick event has been defined for its immediate container,
then we should pass the event to the parent element. This leads effectively to a default
template rule rather like:
In Saxon-CE this "bubbling" behaviour was hard-coded in the product, but in Saxon-JS
it has been achieved by generalizing the mode-based on-no-match behaviour
defined in the XSLT 3.0 specification [XSLT 3.0].
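The correspondence between template rules and event handlers, including the bubbling fallback, can be sketched in a few lines of JavaScript. This is an illustration only, not Saxon-JS code: the names makeElement, rules and dispatch are hypothetical, elements are plain objects so that no DOM is needed, and the first matching rule wins (real XSLT would choose by priority and import precedence).

```javascript
// Hypothetical sketch: a rule is a (mode, match, action) triple, and an
// event that matches no rule on an element bubbles up to its parent.
function makeElement(name, id, parent) {
  return { name: name, id: id, parent: parent || null };
}

const rules = [
  {
    mode: "onclick",
    match: (el) => el.name === "button" && el.id === "save",
    action: (el) => "saved via " + el.id,
  },
  {
    mode: "onclick",
    match: (el) => el.name === "div",
    action: (el) => "div " + el.id + " handled the click",
  },
];

// Look for a rule in the given mode whose pattern matches the element;
// if there is none, retry on the parent element (the "bubbling" fallback).
function dispatch(mode, element) {
  for (let el = element; el !== null; el = el.parent) {
    const rule = rules.find((r) => r.mode === mode && r.match(el));
    if (rule) {
      return rule.action(el);
    }
  }
  return null; // no rule matched anywhere on the ancestor axis
}

const panel = makeElement("div", "panel", null);
const save = makeElement("button", "save", panel);
const label = makeElement("span", "label", panel);

console.log(dispatch("onclick", save));
console.log(dispatch("onclick", label));
```

A click on the span has no rule of its own, so it is handled by the rule for the enclosing div.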
The processing of each input event is a separate transformation. XSLT 3.0 provides
a
much clearer processing model here. It distinguishes an initial stage of "priming"
a
stylesheet (during which, for example, global variables are evaluated and stylesheet
parameters are supplied) followed by multiple invocations of the stylesheet, each
of
which can supply a different initial node to be processed (the "initial match
selection") and a different initial mode. The initial node and mode correspond directly
to the information available to a JavaScript event processor when a user interaction
event occurs.
The result of such a transformation is typically to rewrite a portion of the HTML
page. This is achieved using the xsl:result-document instruction. The
href attribute of this instruction identifies the fragment of the HTML
page to be modified, typically by giving the ID attribute. If
xsl:result-document is not used, the result of the transformation
resets the entire body of the HTML page.
Additional instructions (in a Saxon-defined namespace) are available to allow
attributes and properties of HTML elements to be modified. These make it easy to
implement common use cases where the effect of clicking a button is to change the
CSS
style of an HTML element.
XSLT interoperability with JavaScript
In an ideal world, applications would be written entirely in XSLT, with no need ever
to write any JavaScript code. In practice, however, there will be functionality that
the
browser only offers via JavaScript interfaces. In addition, it is no longer possible
to
pretend that JavaScript is a second-rate programming language. Although it has its
weaknesses, it can be incredibly powerful, and for an interpreted language, its speed
is
astounding. There are an increasing number of JavaScript libraries offering capability
that is hard to resist.
While the type system of XSLT 1.0 was closely aligned with JavaScript, this ceased
to
be true in XSLT 2.0. Some XDM types such as xs:decimal have no equivalent
at all in JavaScript, and others such as xs:date have subtly different
semantics from the nearest JavaScript equivalent. This creates challenges both for
implementing an XSLT processor in JavaScript, and for designing interfaces that allow
XSLT code to call JavaScript and vice-versa.
Because it is impossible to obtain information about the expectations of a JavaScript
function with respect to the arguments it accepts (even the question of how many
arguments are expected has no answer), there can be no conversion of supplied arguments
to an expected type as occurs in XPath function calling. If you supply a node, the
called function will see a node, even if it was expecting a string. Compounding this problem, the
XSLT compiler running on the server has no advance knowledge of what functions exist
in
the target execution environment. One approach would be to require users to provide
function signatures in some kind of stylesheet declaration, but this would be very
constricting in the flexible world of the browser, where it is commonplace to test
dynamically whether a function exists before deciding whether to call it. (The
function-available() function in XSLT 1.0 reflects this tradition, but
has been undermined by the move in XSLT 2.0 towards a more statically typed language,
where the set of available functions is expected to be fixed in the static
context.)
Saxon-JS (like Saxon-CE before it) responds to these challenges by making the
interaction with JavaScript as dynamic as possible. For example, there is a namespace
http://saxonica.com/ns/globalJS for calling global JavaScript
functions: a call of the form js:foo(x,y,z) is always accepted by Saxon at
compile-time, and results in a run-time call of the global JavaScript function
foo, or a dynamic error if no such function is defined. A call on
function-available('js:foo') is never evaluated statically, but returns
true at run-time if the global function foo actually exists.
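The dynamic behaviour described here might be sketched as follows. The helper names functionAvailable and callGlobal are hypothetical and are not part of the Saxon-JS API; the sketch assumes only that global functions are reachable as properties of globalThis.

```javascript
// Hypothetical helper: is there a global JavaScript function of this name?
function functionAvailable(localName) {
  return typeof globalThis[localName] === "function";
}

// Hypothetical helper: call the global function, or fail at run-time.
function callGlobal(localName, args) {
  if (!functionAvailable(localName)) {
    // Plays the role of the dynamic error raised when no such function exists.
    throw new Error("Dynamic error: no global function named " + localName);
  }
  // Arguments are passed through as supplied: no conversion to an expected
  // type is possible, because JavaScript declares none.
  return globalThis[localName].apply(globalThis, args);
}

// A global function such as a script on the page might define.
globalThis.foo = function (x, y, z) {
  return x + y + z;
};

console.log(callGlobal("foo", [1, 2, 3]));
console.log(functionAvailable("bar"));
```

Nothing about foo needs to be known before the call is made, which is the property that lets the compiler on the server accept js:foo(x,y,z) unconditionally.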
Saxon-JS uses a third-party library, Big [Big.js], to
implement xs:decimal and large xs:integer values (smaller
xs:integer values are handled using the native JavaScript
Number type). XDM strings map to JavaScript strings,
xs:double to Number, date/time types to the JavaScript
Date type (with additional timezone information). Other types such as
durations and xs:QName are implemented entirely within Saxon-JS. In a few
cases conformance has been sacrificed: for example xs:float is implemented
as a JavaScript Number, which is technically non-conformant because the
precision of the result of numeric calculations is too high.
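The xs:float point can be seen with the standard Math.fround function, which rounds a Number to the nearest single-precision value. The sketch below only illustrates the size of the discrepancy; it is not a description of how Saxon-JS represents values.

```javascript
// A strictly conformant xs:float would hold single-precision values.
const asDouble = 0.1;
const asFloat = Math.fround(0.1);

// Arithmetic on plain Numbers keeps double precision, which is more
// than xs:float arithmetic should deliver.
const doubleSum = 0.1 + 0.2;
const floatSum = Math.fround(Math.fround(0.1) + Math.fround(0.2));

console.log(asDouble === asFloat);   // false: the two roundings of 0.1 differ
console.log(doubleSum === floatSum); // false: the sums differ too
```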
When atomic values of any data type are passed to a JavaScript function, the XDM value
is converted to the nearest JavaScript equivalent. For example xs:decimal
values are converted to JavaScript Number objects. This means of course
that there may be a loss of precision; but it's probably a better choice in most cases
than passing the Big object directly.
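The loss of precision is easy to demonstrate. In this sketch the xs:decimal value is stood in for by its lexical form (a string) rather than a Big object, and the function name decimalToNumber is hypothetical.

```javascript
// Hypothetical conversion of a decimal, given in lexical form, to a Number.
function decimalToNumber(lexical) {
  return Number(lexical);
}

// 2^53 + 1 is an exact xs:decimal but is not representable as a Number.
const exact = "9007199254740993";
const converted = decimalToNumber(exact);

console.log(String(converted)); // "9007199254740992"
```

The receiving JavaScript function gets an ordinary Number it can use directly, at the cost of the final digit.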
Saxon-JS has an advantage over Saxon-CE in that the XDM data model now includes maps.
JavaScript objects returned by an extension function, rather than being treated as
alien
objects that can only be accessed using further extension functions, can now be accessed
directly as maps. For example, it becomes possible to add an interactive extension
function ixsl:style() which returns all the style properties of an HTML
element, as a map. Then, for example, ixsl:style($node)?hidden could be
used to obtain the value of the 'hidden' style property.
Mapping XDM sequences and arrays to JavaScript arrays is not straightforward. Most
XPath constructs work on sequences rather than arrays, but the mapping of XDM sequences
to JavaScript arrays is imperfect, because of the equivalence in XDM of a singleton
(for
example the single xs:integer value 17) to a sequence of length 1. This
inevitably creates an asymmetry whereby a sequence of two, three, or four integers
is
passed to a JavaScript function as an array of integers, but a sequence of a single
integer is passed not as an array, but as a single number. The mapping of XDM arrays
to
JavaScript arrays is much closer, but this then creates a problem in deciding whether
an
array returned by a JavaScript function should be mapped to an XDM array or to an
XDM
sequence.
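The asymmetry for singletons can be sketched as below. An XDM sequence is modelled as a JavaScript array of items, and the function names are hypothetical. The treatment of the empty sequence (passed as an empty array) is an assumption made for the sketch; the text does not say how it is handled.

```javascript
// Hypothetical conversion of an XDM sequence to a JavaScript argument.
function sequenceToJS(sequence) {
  if (sequence.length === 1) {
    return sequence[0]; // a singleton is indistinguishable from its one item
  }
  return sequence.slice(); // assumption: empty and longer sequences become arrays
}

// A JavaScript function that wants to accept "one or more" values
// has to undo the asymmetry itself.
function asArray(value) {
  return Array.isArray(value) ? value : [value];
}

console.log(sequenceToJS([17]));
console.log(sequenceToJS([17, 18, 19]));
console.log(asArray(sequenceToJS([17])));
```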
Conformance with W3C Specifications
At the time of writing, XSLT 3.0 [XSLT 3.0] and XPath 3.1 [XPath 3.1] are very close to being finalized, and they offer a great deal
of functionality that is particularly attractive in the browser environment: notably
support for maps, arrays, and JSON. Support for these specifications has therefore
been
one of the project's objectives.
At the same time, a critical success factor is to keep the Saxon-JS executable as
small as possible, to minimize the time taken to download and parse the code when
an
HTML page is loaded. This means we have to be selective about some of the features
in
the specification that appear to have a high overhead in relation to their
usefulness.
The browser environment is not static, so it makes sense to defer implementation of
features that can exploit imminent advances in the browser platform. To take an example,
implementing the normalize-unicode() function within Saxon-JS would require
a quite disproportionate amount of code, which becomes completely unnecessary once
the
browsers uniformly implement the JavaScript 6 method String.prototype.normalize()
which does the same job.
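Delegating to the platform might look like the following sketch. The wrapper name normalizeUnicode is hypothetical, and raising an error when the built-in method is missing is an assumption made for the example.

```javascript
// Hypothetical wrapper: rely on the built-in method where it exists.
function normalizeUnicode(value, form) {
  if (typeof String.prototype.normalize !== "function") {
    throw new Error("normalize-unicode() is not supported on this platform");
  }
  return value.normalize(form || "NFC"); // NFC is the XPath default form
}

const decomposed = "A\u030A"; // LATIN CAPITAL LETTER A + COMBINING RING ABOVE
const composed = normalizeUnicode(decomposed, "NFC");

console.log(composed === "\u00C5"); // true: the single precomposed character
```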
These design constraints come to a head in the area of regular expression support.
XPath regular expressions and JavaScript regular expressions
have significant differences. Implementing a new regular expression engine to provide
the XPath syntax and semantics would require a lot of code, and would probably be
rather
slow. In any case, some users would probably prefer to use JS regular expressions
in
their XPath expressions and regard the question of W3C conformance as somewhat arcane.
But the lack of proper support for Unicode in traditional JavaScript regular expressions
is increasingly an embarrassment. JavaScript 6 promises a way forward on this [ECMAScript 6, Unicode-aware regex], but it's not yet available on all
browsers. So what we do is a compromise. We have a flag that users can set to request
pure JavaScript regular expressions. In the absence of this flag, we try to translate
the XPath regular expression into a JavaScript equivalent. For many cases this isn't
difficult; for example the character classes such as \p{Lu} or
\p{IsGreek} can be translated into long lists of individual characters.
A tougher challenge is that on browsers without support for the new "u" flag (which
enables Unicode support in regular expressions), non-BMP characters (those with
codepoints above 65535) are treated as two characters by the JavaScript engine, so
they
match ".." but not ".". For this we're simply going to wait for the new JavaScript
6
facilities to appear, at which point the problem goes away.
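Both halves of this compromise can be illustrated briefly. The translation table and the function translateRegex are hypothetical: they cover only two block escapes, use ranges rather than lists of individual characters, and ignore escapes that occur inside a character class. The second half shows the behaviour of "." with and without the "u" flag.

```javascript
// Hypothetical translation table covering two Unicode block escapes only.
const BLOCKS = {
  IsBasicLatin: "\\u0000-\\u007F",
  IsGreek: "\\u0370-\\u03FF",
};

// Rewrite \p{IsXxx} escapes into explicit JavaScript character ranges.
function translateRegex(xpathRegex) {
  return xpathRegex.replace(/\\p\{(\w+)\}/g, (whole, name) => {
    if (!Object.prototype.hasOwnProperty.call(BLOCKS, name)) {
      throw new Error("Unsupported character class: " + name);
    }
    return "[" + BLOCKS[name] + "]";
  });
}

const greekWord = new RegExp("^" + translateRegex("\\p{IsGreek}+") + "$");
console.log(greekWord.test("\u03BB\u03BF\u03B3\u03BF\u03C2")); // true
console.log(greekWord.test("logos"));                          // false

// A non-BMP character is stored as a surrogate pair of two code units.
const emoji = "\u{1F600}";
console.log(/^.$/.test(emoji));   // false: "." sees one code unit
console.log(/^..$/.test(emoji));  // true: the pair matches ".."
console.log(/^.$/u.test(emoji));  // true: with "u", "." sees one character
```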
With a few exceptions like those noted, Saxon-JS at its first release implements
almost all of XPath 3.1 (notably including support for maps, arrays, and JSON). Support
for XSLT 3.0 is more patchy: we've implemented the really useful features like
try/catch, and compile-time features like text value templates and shadow attributes,
but we have yet to tackle xsl:iterate or accumulators.
The optional xsl:evaluate instruction cannot be implemented in the core
Saxon-JS product because it is a run-time engine only; it does not include an XPath
parser. However, as a separate add-on, we are working on a solution to this: the REx
parser generator from Gunther Rademacher [REx] allows us to generate
an XPath parser written in XSLT or JavaScript, and with some post-processing of the
resulting parse tree we can generate the same XML data structure that the Saxon export
on the server produces (for the stylesheet export file), which of course we already
know
how to evaluate.
Support for optional features has been a low priority. In our first implementation
there is no schema-awareness, no streaming, no serializer, no support for higher-order
functions. Many of these restrictions will probably remain, in the interests of keeping
the product small. (Higher-order functions, however, are very tempting: they have
a very
good fit with the JavaScript world. We will keep this under review.)
Saxon-JS Implementation Notes
In this section we highlight a few points that we hope will be of interest concerning
the internal implementation of Saxon-JS.
First, it's useful to understand something about the stylesheet export file (SEF)
produced on the server by the XSLT compiler. This is essentially a decorated expression
tree. Its format is XML, though we have been considering JSON as an alternative since
this might be faster to load and navigate. The nature of this tree is probably best
illustrated by an example.
Here's a stylesheet function in source XSLT:
<xsl:function name="tour:place-knight" as="xs:integer*">
    <!-- This function places a knight on the board at a given square. The returned value is
         the supplied board, modified to indicate that the knight reached a given square at a given
         move -->
    <xsl:param name="move" as="xs:integer"/>
    <xsl:param name="board" as="xs:integer*"/>
    <xsl:param name="square" as="xs:integer"/>
    <!-- integer in range 0..63 -->
    <xsl:sequence
        select="
            for $i in 1 to 64 return
                if ($i = $square + 1) then $move else $board[$i]"/>
</xsl:function>
And here's the corresponding part of the SEF, slightly redacted for brevity:
Hopefully much of this is self-explanatory. The element names (let, arith, for, range,
vc [= value comparison]) represent different types of expression, in most cases fairly
directly related to expressions in the source. There is no distinction between XPath
expressions and XSLT instructions. The children of an expression in the tree are the
operands of the expression, distinguished either positionally, or by a role
attribute.
The additional attributes on the tree represent information determined by the compiler
and available to the run-time. This includes information for diagnostics when dynamic
errors occur (module, line), slot numbers allocated on the run-time stack to hold
local
variables, evaluation strategies (eval=7 represents eager evaluation of an
expression that returns a single item; but the run-time is free to ignore this), and
type information (calc="i+i" indicates addition of two integers).
In a more complex example, the expression tree will not always have such a close
relationship to the source. The compiler goes through two processes to generate the
tree: type-checking and optimization. Type-checking typically adds nodes to the tree
representing operations such as atomization, checking of item types and cardinalities,
and conversion of untyped atomic values to some target type.
To demonstrate type-checking in an expression tree, here's an example of the beginnings
of a template rule from an XSLT stylesheet (with most of the content removed):
Notice the content of the param element with attribute
name="path" halfway down the above sample. This shows an example of the
type-checking process. First the supplied value is obtained, from a slot, by the
'supplied' operation; this is then atomized by the 'data' operation. If this value
is
untyped, then it is converted to the target type xs:string by 'cvUntyped'.
Then the value is checked against the required type in the 'treat' operation, using
the
type test supplied in its jsTest attribute. The content of this attribute
has been generated at compile-time, to simplify the SEF when we know the run-time
will
be using JavaScript: "return SaxonJS.U.Atomic.string.matches(item);". It
supplies the content for a function with argument 'item' used as the type test, in
this
case checking it is an XDM string.
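One way a run-time could turn such attribute text into a callable test is the Function constructor. Whether Saxon-JS does exactly this is an assumption. In the sketch below, SaxonJSStub is a hypothetical stand-in that mimics only the one library call used in the attribute, and compileTest is a hypothetical helper.

```javascript
// Hypothetical stand-in for the run-time library object named in the test.
const SaxonJSStub = {
  U: { Atomic: { string: { matches: (item) => typeof item === "string" } } },
};

// The attribute value quoted in the text above.
const jsTest = "return SaxonJS.U.Atomic.string.matches(item);";

// Compile the text as the body of a function with argument 'item'.
// The library object is passed in explicitly, because bodies built with
// the Function constructor cannot see module-level variables.
function compileTest(body) {
  const fn = new Function("SaxonJS", "item", body);
  return (item) => fn(SaxonJSStub, item);
}

const isString = compileTest(jsTest);
console.log(isString("section"));
console.log(isString(42));
```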
Another example of a test generated at compile-time, and inserted into the SEF, can
be
seen in the jsTest attribute of the p.nodeTest element. This
provides the content for a function to be used as a node test: "var
q=SaxonJS.U.nameOfNode(item); return SaxonJS.U.isNode(item) &&
item.nodeType==1 && q.uri==''&&q.local=='section';". This
is used for the match pattern for the template rule.
Optimization may produce much more radical re-arrangements of the tree, for example
creation of new local or global variables bound to expressions that have been lifted
out
of loops to prevent repeated evaluation, or inlining of variables and functions.
All this work has been done on the server, by the compiler, before Saxon-JS springs
into action. All that Saxon-JS needs
to do is to interpret the expressions on the tree.
As one might expect, Saxon-JS therefore contains a big switch expression that does
different things depending on the expression type (that is, the element name). For
example, here's the branch that handles an "and" expression:
The code for and expressions is a function that takes as input the node
in the expression tree (expr) and the dynamic context
(context). It makes use of a number of internal functions:
ebv: computes the effective boolean value of an expression
evalChild1, evalChild2: evaluate the first or second operand (subexpression)
Iter.oneBoolean: converts the JavaScript boolean returned by ebv() to an XdmBoolean object representing an XDM atomic value, and constructs an iterator over this single item.
In every case the result of evaluating an expression is an iterator over the resulting
value. This enables short-circuit evaluation of expressions such as
following-sibling::x[1] where the result of the expression can be
evaluated without evaluating all its subexpressions to completion.
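The shape of such an interpreter can be sketched in miniature. This is not the Saxon-JS source. The helper names follow the description above, but every body is an illustrative stand-in: the dispatch is modelled as a lookup table of functions keyed by element name, expression nodes are plain objects, and ebv handles only the single-boolean case.

```javascript
// Hypothetical iterator factory: an iterator over one boolean item.
const Iter = {
  oneBoolean(value) {
    let done = false;
    return {
      next() {
        if (done) return null;
        done = true;
        return value;
      },
    };
  },
};

// Effective boolean value, reduced here to "empty is false, else the item".
function ebv(iterator) {
  const first = iterator.next();
  return first === null ? false : Boolean(first);
}

let evaluations = 0; // counts how many tree nodes were actually evaluated

// One function per expression type, keyed by element name.
const handlers = {
  true: () => Iter.oneBoolean(true),
  false: () => Iter.oneBoolean(false),
  and: (expr, context) =>
    Iter.oneBoolean(ebv(evalChild1(expr, context)) && ebv(evalChild2(expr, context))),
};

function evaluate(expr, context) {
  evaluations++;
  if (!Object.prototype.hasOwnProperty.call(handlers, expr.name)) {
    throw new Error("Unknown expression type: " + expr.name);
  }
  return handlers[expr.name](expr, context);
}

function evalChild1(expr, context) {
  return evaluate(expr.children[0], context);
}

function evalChild2(expr, context) {
  return evaluate(expr.children[1], context);
}

const falseAndTrue = {
  name: "and",
  children: [{ name: "false", children: [] }, { name: "true", children: [] }],
};
console.log(evaluate(falseAndTrue, {}).next()); // false
console.log(evaluations); // 2: the second operand was never evaluated
```

Here the second operand is skipped by JavaScript's own && operator. The early exit described for following-sibling::x[1] comes instead from consumers pulling only as many items from an iterator as they need.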
The conciseness of the implementation of an and expression is not at all
untypical. The coding style is generally terse, designed to keep the overall size
of the
product as small as possible. Some more complex expressions (such as
xsl:apply-templates) inevitably involve rather more code than this, but
anyone familiar with the source code of the Java Saxon product will be surprised how
much can be achieved in very few lines of code.
Conclusions
Saxon-JS is the latest attempt to meet the challenge of implementing XSLT in the
browser.
The first wave of processors, which were native implementations in the browser,
suffered from the "critical mass" problem whereby browser vendors were reluctant to
invest in the technology in the absence of applications, and web developers were
reluctant to use the technology until it was available (in interoperable form) in
all
browsers; this effect was particularly damaging when it came to investing in XSLT
2.0
support, and the lack of investment in turn meant that XSLT in the browser remained
entrenched in the "Web 1.0" era of static content.
The next attempt was Saxon-CE. This demonstrated the feasibility of creating a
cross-browser XSLT processor relying only on the browsers' support for JavaScript,
and
it also showed how XSLT could be extended into a "Web 2.0" environment with support
for
fully interactive applications. But it suffered because of the limitations of the
GWT
technology used to build it.
Saxon-JS can be seen as a re-implementation of Saxon-CE using more appropriate
technology. Many of its concepts have been pioneered by Saxon-CE and are much liked
by
the small but enthusiastic band of Saxon-CE users. The main innovation in Saxon-JS
is
the fact that the heavy work of stylesheet compilation is done in advance, on the
server, allowing the client-side code to be much more lightweight, and giving room
for
implementation of attractive new features in the W3C specifications such as maps and
arrays. The re-architecting has numerous spin-off benefits such as easier testing
and
debugging of the XSLT engine; in addition it is a much more integral part of the Saxon
product line, which will hopefully have commercial benefits and create a revenue stream
to provide for ongoing development, without which so many otherwise-excellent XSLT
implementations have foundered.
[XPath 3.1] Robie, Jonathan, Dyck, Michael and
Spiegel, Josh, Editors. XML Path Language (XPath) 3.1. World Wide Web
Consortium, 17 December 2015. [online] http://www.w3.org/TR/xpath-31/