Introduction
XSLT has gradually developed over the past decade from a 'browser-based' document
transformer/generator into a fully-fledged, industrial scale, functional processing
tool,
aimed at XML documents. As such its 3.0 version has introduced support for
streaming
processing of very large documents. To do this effectively, the
XSLT specification has had to outline a very extensive, very detailed, and ostensibly
very
complex, set of rules defining the conditions under which a given program can be guaranteed
to
be processable in a streaming manner.
This paper describes an interactive static analysis and display tool that can be used to examine the evaluation of these rules on (fragments of) a given XSLT stylesheet, so developers of processes intended for streaming may understand better whether their programs can indeed be processed in a streaming fashion, and if not, perhaps why not. This tool also includes active linking to appropriate sections of the specification to further increase comprehension.
As the tool was being developed while these rules were being developed, proven and changed by the XSLT Working Group, it was important that the tool itself could be flexible to changes in these rules. [See section “Type model” for a very good case of this happening.] Accordingly, the tool makes significant use of declarative descriptions of sections of the rules rather than direct code, some defined directly within the tool, some as external data files and some extracted from the specification itself, or other parts of the specification definitional framework.
This paper is organised as follows:
-
The model for streaming in XSLT 3.0 is presented and the specification-defined rules are discussed briefly.
-
The tool is illustrated in action on a simple example taken from the XSLT specification.
-
The basic overall design is discussed: analysing the streaming properties, displaying the results and delivering the tool as a web service.
-
XPath expressions embedded in XSLT programs need to be expanded into their parse trees for analysis – the model for doing this is detailed.
-
The model for evaluating the streamability rules is discussed in some detail, with all the streaming properties being attached to the stylesheet tree as attributive properties.
-
Interactive display of the results involves serialisation of the annotated result tree as an HTML page, with styling through CSS and interaction through a Saxon-CE supported XSLT2.0 stylesheet.
-
A final conclusion presents lessons for the development of this type of analysis tool within the XML world.
The entirety of the tool is written in XSLT3.0 and of course is about manipulating XSLT3.0. Thus a reasonable knowledge of XSLT3.0 (or at least 2.0) is assumed on the part of the reader.
Note
This paper discusses interpreting the rules for guaranteed streamability within XSLT, but through the imperfect eyes of the author and the potentially fallible medium of a program that ostensibly follows those rules. In particular the samples and examples in the paper date mainly from the Working Draft specification of December 2013 and several changes have occurred since, especially in a new static type model. The definitive guide is the specification itself, which is, and always will be, the ultimate arbiter. The author assumes no responsibility for errors (positive or negative), omissions and shortcomings. If in doubt read the 20,000 words yourself!
Major points
The development of this tool is based on three major points, which can be relevant to any similar system that needs to analyse properties of an XSLT program:
-
The entire coding can be, and perhaps is best, written in XSLT3.0
-
The best data structure to hold a lot of varied information about an XSLT program is the tree that defines the program itself.
-
Declarative structures (tables, trees, even parts of the specification) can be used effectively to either i) be interpreted to evaluate effect, or even ii) compiled to program sections, or an intermediate computational form such as maps.
Streaming in XSLT3.0
One of the design goals in updating XSLT from version 2.0 to version 3.0 was support
for
processing very large
documents – XML documents whose complete tree
descriptions could not fit in memory, and for which processing or generation of output
would
have to proceed before all input had been read. The full details of the design chosen
can be
found in the Streaming and
Streamability sections
of the XSLT3.0 specification. Two papers on streaming presented at XML Prague 2014
(Braaksma1, Kay) give much more detail.
The basic approach chosen is to declare that a given document should be processed
in a
streaming manner by using the <xsl:stream
href="
doc
">
instruction which
processes the (XML) data of the given document according to the instructions supplied
as
children of the xsl:stream
and returns the result. The essential issue is whether
the instructions requested as a set can process the document without having to either
i)
collect and store the whole document to produce the result or ii)
back-up to parts of the document before the
current context node.
The model used is to examine two contextual properties of the instructions:
posture and sweep, and determine whether the
sequence constructor (the sequence of contained instructions) of the xsl:stream
has a grounded
posture. If so then the processing of the xsl:stream
is guaranteed streamable and a compliant streaming XSLT3.0 processor will
process the document in a streaming manner. Such analysis is completely static and
can be
performed either at compile time (which is what a compiler will need to do) or in
a separate
phase of static analysis and display, which this paper is about.
Partially quoting Kay, these two properties are functions of the construct itself (an XSL instruction, an XPath expression term or a function call), the context posture (i.e. the posture generally of the 'parent') and sometimes the data type. They have the following meanings:
The sweep of a construct
The sweep indicates how much of the input document is needed to evaluate the construct. The values are
Motionless: the construct either doesn’t look at the input document at all, or it only needs to look at the place where the input document is currently positioned.
Consuming: the construct needs to read everything between the current start tag and the corresponding end tag
Free-ranging: the construct potentially needs to read outside the slice of the document represented by the current element and its ancestors.
The posture of a construct
The posture is concerned with determining whether an expression returns nodes from the streamed input document, and if so, where these nodes come from. There are five values:
Grounded: this means that the expression doesn't return nodes from the streamed input. It either returns atomic values (or function items), or it returns nodes from non-streamed documents only.
Striding: this means that the expression returns a set of nodes from the streamed input document, in document order, and that none of these nodes will contain another node in the result (none is an ancestor or descendant of another).
Crawling: again, the expression returns a set of nodes from the streamed input document, in document order, but this time some of the nodes may be ancestors or descendants of others.
Climbing: The specification assumes that when an input document is streamed, a stack of information is retained containing details of the names and attributes of all ancestor elements of the element at which the stream is currently positioned. Any expression that accesses ancestor nodes or their attributes from this stack has a posture of climbing.
Roaming : This indicates that an expression navigates off to parts of the document that aren't accessible when streaming, such as preceding or following siblings.
Streamability rules
The specification provides a very detailed and very large set of rules for determining these properties for a given construct in a given situation within an XSLT stylesheet. (To give a sense of the size, the streamability rules take approximately 20,000 words of the 160,000 in the specification's main body, and the section on streaming itself another 6000 words.) The detail is necessary to ensure that simple constructs, which at first glance should be streamable, actually are – a highly conservative simpler set of rules would exclude many common cases.
Note
Whilst these rules are intended to be complete, Braaksma2, presented at XML London , gives a more informal set of guidelines, intended to support designers creating or refactoring their code for streaming.
These rules split into four general categories: i) a set of General Streamability Rules (usually abbreviated to GSR), ii) a set of specific rules for every XSL instruction, iii) rules for each XPath expression term and iv) rules for all built-in XPath functions. Examples of these rules will be given in later sections.
To analyse the streamability of a given xsl:stream
instruction it is
(usually) necessary to recursively apply these rules to every construct
contained within (every XSL instruction, every XPath term), and in addition any 'external'
resources, such as xsl:template
or xsl:function
(and their
definitions) that may be invoked.[1]
It is anticipated that developers who are designing streamable transformations, will acquire a sense of the spirit of these rules, but to start may have to work through the rules in detail on a given problem. Whilst these calculations can be performed 'by hand', it can be somewhat tortuous, and slow, involving very deep recursions and much scrolling back and forth through the specification[2].
To assist in such early study of streamability, the author has built a tool, funded by Saxonica, to perform such analysis on a given stylesheet and display the results in a form that the intermediate properties and the relevant rules can be explored interactively. The rest of this paper is about the structure of this tool.
Note
The tool is at the time of writing available at Saxonica Community: Streaming Analysis which is intended only to analyse small single stylesheets (no support for inclusion), or some of the specification and W3C test-case examples. Saxonica Ltd reserves the right to withdraw this service without notice and makes no guarantees as to the veracity of the results.
A Quick Tour
The tool is controlled by and presents its results as an XHTML web page, which is usually connected to a server providing the analysis operation. It's best to start off with a quick picture of what the tool provides, operating in this case on one of the examples from the specification:
A stylesheet can be uploaded to the analysis tool, or as in this case, a pre-loaded
example taken from the specification is selected. A serialised version of the source
is
displayed, with styling, line-numbering and fold/unfold controls. The
xsl:template
and xsl:stream
have green backgrounds as the analysis
has concluded that they are guaranteed streamable – if this were not the
case they would have red backgrounds. The implicit sequence constructors have been
displayed
explicitly. Many of the elements of the serialisation are sensitive to mouse-click...
By clicking on the two XPath-containing attributes (@match
and
@select
of lines 2 and 5 respectively) we reveal the full trees representing
the parsing of these expressions, whose properties will become crucial in determining
streamability. By selecting amongst the check-boxes we can then show some of the streaming
properties that have been calculated:
Here we have chosen to display both the role
of each of the XPath
expressions (whose importance will be explained later), and the calculated static
type and posture for each element in both XPath expression
and XSLT instruction constructs. These are displayed in distinctly shortened and styled
forms,
as if they were attributive properties of each element.
Not only is it useful to display the calculated property, but it is also exceptionally
helpful to understand why it has that value. In this case we have displayed the sections
of
the General Streamability Rules
that were triggered, if those rules were used
on that particular construct. If we click on one of these decorations a subsidiary
browser
window or tab shows the first section of the rules which were applied:.
Here case 1.b.iii.A.II
was appropriate (the usage
wasn't
modified). Other relevant portions of the specification can be displayed in a similar
manner –
clicking on the AxisStep
element brings up the specification-defined process for
determination of the streamability of such an expression:
The rest of this paper looks at the details of the design of the tools to achieve these effects.
Basic Design
The tool splits broadly into three sections: i) determining the appropriate streaming properties for all nodes on a stylesheet 'tree', ii) preparing an interactive display of the stylesheet where these properties can be examined and iii) combining these in a web-server such that stylesheets can be uploaded and interactive result web pages returned. With the exception of the web-server package deployment, the analysis tool is built entirely in XSLT3.0, generating an interactive display result which is a combination of (X)HTML, CSS and XSLT2.0 delivered using Saxon-CE.
Determining streaming properties
To analyse the streaming properties for a given stylesheet, we at least need to
recursively descend the stylesheet tree from any xsl:stream
instructions, or
xsl:template
nodes that can be invoked in a streamable mode, and calculate
these properties based on a contextual state, the specific construct and most likely
the
properties of its children, hence the deep recursion. Not only does this process have
to
involve XSLT instructions, it must also involve XPath expressions contained within
various
attributes of those instructions, as they are the mechanisms whereby XSLT selects
data nodes
of interest, and their behaviour in 'moving around' the data tree is critical to
streamability. In effect, for purposes of streamability, the XPath expressions (which
can be
described from their parse trees) are tree-extensions of the main stylesheet, albeit
technically anchored through attributes rather than elemental children.
This process starts by producing a modified version of the XSLT tree, in which additional sub-trees describe these XPath expressions and which has explicit sequence constructors. For example the template:
is transformed into another tree with additional children and attributes:
The sequence constructors have been made explicit as s:sequence-constructor
children[3]. If, for example, the xsl:for-each select="book"
had contained an
xsl:sort
directive, this would not be contained within
the sequence constructor pseudo-child. The XPath expressions contained within attributes
have been placed as children in the xp:
namespace, each having an attribute
(@s:role
) describing the role it takes within the containing instruction,
using a simple naming scheme for identifying expressions contained within attribute
value
templates. (The @l:no
attributes have been added to denote original source line
numbers, for use in eventual display.)
With the expanded XSLT tree, a top-down analysis of the streamability is performed,
by
evaluating the extensive rules outlined in the specification over tree nodes and their
children. The results are returned as a copy of the input tree decorated with a series
of
attributes describing the streaming properties of each node. The inner
xsl:result-document
sub-tree of our example becomes:
(The details of these annotations will be described later.)
Displaying the results
At this point, technically the analysis of the streamability is complete – examining
these streaming properties can assess whether an xsl:stream
instruction is
indeed guaranteed streamable. However to be useful these results should
be displayed in a meaningful way, and with some interaction to restrict the almost
certain
information overload. The basic approach is to convert the result tree into a serialised
HTML pre
, which is styled through CSS, and where the visibility of various
sections can be controlled interactively, in this case using Saxon-CE transforms attached
to
callbacks. The xp:FunctionCall[@name='position']
shown highlighted in Figure 9 is displayed as a line:
which is actually represented as a structure within pre
as[4]:
The stream
span contains all the streaming properties, each being styled
and differentiated via CSS through the @class
attribute. Interaction through
the tool check-boxes toggles the @style
between display:none
and
display:inline
, thus revealing or concealing the properties. The
generalOperand
class contains a hyperlink to the section of the General
Streamability Rules that was used when treating this construct as an operand of its
parent.
How these decorations are added is described later.
Delivering the service
The analysis tool is an XSLT transform that delivers a modified XSLT tree. The display generator is another transform that takes that tree and delivers an (interactive) HTML page. These could be combined into a single package, but we have chosen to implement this analysis as a web service, permitting stylesheets to be uploaded for examination. To do this we've used the Servlex webapp package tool . The general architecture of the delivery platform is:
Apart from delivery of resource classes (e.g. *.html
,*.css
and
*.js
, which includes the Saxon-CE 'compiler'...), three principal messages
are processed by the web package:
analyze.html |
Executes an XSLT transform that generates the main tool page, including collecting all the preloaded examples and forming drop-down selectors to choose them. |
analyzeStream |
Is accompanied by the upload of the source stylesheet[5]which is processed for analysis followed by serialisation of the result into an interactive web-page, which displays in a frame in the main tool. |
XSLT3.0-Spec |
Generates an annotated version of the current W3C specification, mostly in terms
of labelling all the cases in the |
This delivery mechanism will not be described further in this paper, save that we found Servlex to be an excellent vehicle for constructing such a service.
Limitations and assumptions
The tool assumes of course that the stylesheets are well formed XML, and syntactically correct XSLT, or more correctly only analyses them on the basis that they are syntactically correct. Little error checking is performed.
Expanding XPath Expressions
To analyse XSLT streamability it is necessary to examine from where in the XML input
tree
data is being collected by stylesheet instructions. These operations are obviously
described
as XPath expressions, which can be highly compound in nature, such as mixing searches
along
different axes (child, ancestor, following etc.), predicates and a number of built-in
functions. Analysis of streamability has to examine the structures of these expressions
- the
rules are described with reference to the EBNF grammar for XPath, defined in XPath 3.0 Grammar. Thus for our
purposes it is most convenient to generate parse result trees corresponding to the
XPath
expressions contained in attribute values (e.g. @select
) or attribute value
templates (e.g. href="example{position()}.xml"
).
As the analysis is being performed in an XSLT environment, such parsing can be performed most conveniently with a parser itself written in XSLT. Luckily the REx parser generator can be configured to generate parsers in several languages, including XSLT, which can both test against a grammar and built a result XML parse tree. REx can indeed generate a working XPath 3.0 parser to run in XSLT. The nub of the expansion is shown in Figure 13 :
xpath3.0-parse.xslt
links to the transform that has been generated by REx.
xp:parse.xpath()
performs some tactical rewriting of an XPath string (see section “Preprocessing and rewriting”), calls the REx-generated parser and does some post-processing
(namespace remapping, collapsing of singleton leaf sub-trees, etc...) before returning
the
completed parse-tree. The operand
mode generates a child element containing that
XPath parse tree, identified with the role of the expression (in this case the attribute
name)
- this role will be used in later operations to identify different instruction-specific
treatments as far as effect on streaming is concerned. These operands trees are generated
from
a main template shown in Figure 14, where not only are common XPath
carriers (e.g. @select
) processed, but also attributes that are identified as
containing attribute value templates, using the predicate test function
xp:is.AVT()
.
As well as generating the operand trees, this template also:
-
processes instructions that have an implicit selection role, such as
xsl:apply-templates[empty(@select)]
orxsl:next-match
to add the implicit context XPath expression tree, -
assigns a line-number-recording attribute, and
-
collects all the contiguous elements and text nodes of the sequence constructor(s) together under
s:sequence-constructor
elements. The functionxp:is.sequence-constructor()
provides a suitable test – elements which are configurations or parameters of the instruction, such asxsl:param
orxsl:sort
, returnfalse()
.
Inclusions
Stylesheets often include resources from other stylesheets, using
xsl:include
and xsl:import
redirection instructions. For
purposes of streamability analysis they can both be treated similarly (implicit match
priorities are immaterial) and their document bodies are expanded as children of the
instruction. As far as this analysis is concerned, templates, functions and variable
directly within such inclusions are considered top-level
to the outer
stylesheet. (The web-delivered service cannot process relative inclusions from uploaded
stylesheets.)
Applying Streamability Rules
With the complete expanded stylesheet we have all the necessary program information
to
commence the streamability analysis. Whilst the rules are written recursively
top-down
, the author found it helpful to split the process into three
sequential phases during which the tree is modified: required functionally equivalent
rewrites
of some expressions to ensure possible streamability, determination of context focus
and
construct type, followed by assessment of posture and sweep.
Preprocessing and rewriting
There are a number of (equivalence) rewrites defined in the specification that are
required to either i) generate a canonical form or ii) make common constructs streamable.
Some of these are most conveniently applied as textual rewrites to the original string
(e.g.
//
-> /descendant-or-self::node()/
). Others are best applied as
rewrites on the tree, such as Figure 15 where the treat as
expression, forcing document-node()
type, has been parsed to a tree.
Declarative tables
While the tree modifications described in this section are actually carried out by sets of XSLT templates and functions, as much use as possible has been made of declarative tables that define appropriate properties, that the XSLT can interpret to process sections correctly. Using such tables increases flexibility and coherence extensively, collecting all relevant properties together in one place and often making some changes merely altering the value of an attribute.
These descriptions for some XSLT instructions describe i) if they are focus-changing
and
if so, which of their operands control and are controlled by the change, using an
order or
simple proforma, ii) a static type for the instruction, if it is independent of that
determined from context or children and iii) a hyperlink to the relevant streamability
specification section if it is not in the canonical form (e.g.
#streambility-of-xsl-copy
). This description is used to produce a series of
maps relating instruction name to property such as $spec-ref()
,
$staticTypes()
that are used within XSLT processes described later.
For XPath expression constructs we also describe the usage using a
proforma derived from a table (Classifying Expressions) within the specification [6], and membership of a choice group of some of the operands.
For example, the IfExpr
usage is defined to be that the operand having the
if
role has an inspection usage, and both the
then
and else
operands have transmission
usage, as
well defining that the then
and else
operands constitute a
choice-group
(which effectively means that only one of
them, not both, must read the input stream).
Type model
For some constructs the static type is needed to assess streamability properties. [The most common case is assessing the value of a node which is known to be childless, such as an attribute or a text node - in this case no subtree has to be traversed to derive the complete string value.]
Initially the analysis model used a type hierarchy, which for streaming could be
somewhat coarser than can be strictly assessed - all XSL instructions were assessed
as
having static type item()*
, whereas a finer granularity was available, but
not needed, for the streamability analysis. The type hierarchy was defined for the
analysis tool by a tree:
The most common operation required using this type hierarchy was to assess a composite
type for a sequence of operands, calculated as the narrowest type in the type hierarchy
which is the type or super-type of all members of the sequence. This was most readily
assessed using a precomputed map of maps
xp:least-common-supertypes($type0)($type1)
derived from this tree of types.
Subsequent detailed study (see note in the next section) revealed that a more general
model involving union of types was needed. Thus between the first submission of this
paper
and the final publication the type model migrated to a U-type where
types were classified as a partial union of 28 fundamental types (7 nodal, e.g.
element()
; 19 primitive atomic, e.g. xs:string
;
function()
and xs:untypedAtomic
) A sequence is an instance of
a U-type U if every item in the sequence is an instance of one of the
fundamental types in U, considered as a set. For example, the
sequence (23, "Paris")
is an instance of the U-type U{xs:string,
xs:decimal, xs:date}
because both items in the sequence belong to item types in
this U-type. Shorthand forms for common groupings were defined, e.g.
U{N} denotes the union of all the node types.
Luckily the tool could migrate relatively smoothly, by representing a U-type as an
order-insensitive sequence of the constituent fundamental types as xs:string*
(and stored as an attribute value as a whitespace-separated string, that can easily
be
tokenised back to a sequence), with a small number of additional helper maps and
functions, such as $uTypes('N')
and xp:union-type($types as
xs:string*)
. Some of the special cases for expressions, instructions and
functions had to be altered to use these type-determination functions rather than
those
using the type-hierarchy tree.
Context focus and type
Whilst posture and sweep are the main properties to be analysed, two other subsidiary properties need to be assessed: static type and control focus. Whilst this could be achieved contemporary with the posture/sweep analysis, it is somewhat clearer, and certainly easier to debug, to carry this out as a recursive descent/ascent pre-pass.
Certain instructions and expressions change the context focus for evaluation of their
children. For a simple example, xsl:for-each
obviously can (and almost
invariably does) change the sequence of context nodes for evaluation of its descendant
instructions. An xsl:for-each
is said to be focus
changing, its @select
expression (which of course is represented as
an expression tree identified @s:role="select"
) is said to be
focus-controlling and its sequence constructor is
focus-controlled. These are identified on the tree through attributes
s:focus="change|controlling|controlled"
respectively.
During this pass it is also possibly to analyse static type, propagating a
context item type downwards (as a tunneled variable), changing it
through focus-changing instructions, where generally the context item type for the
controlled children is that of the assessed static type of the controlling (child)
operand.
For example the sequence constructor of xsl:for-each select="amount"
will have
a context item type of element(amount)*
as that is the assessed static type of
the XPath expression tree. The context type is recorded for subsequent display as
a
@s:contextType
attribute.
When leaves are reached, either it defines its own type (e.g. FunctionCall
name="position"
has type xs:integer
, which can be inferred from the
function signature; StringLiteral value="foo"
has type xs:string
)
or its static type is the context type (e.g. ContextItemExpr
, aka '.'). On the
way back either there are definitive rules provided (e.g. QuantifiedExpr
has
type xs:boolean
and PostfixExpr A[B]
has a type which is the type
of A
), or it has a sequence composite type, or appropriate union type
calculated as described above.
For expressions the specification gives a table of type determination
formulae
(Determining the Static
Type of a Construct). Whilst the static types defined (e.g. AndExpr
has type xs:boolean
) are determined from entries in the declarative table of
Figure 17 , currently most of these cases are defined by
pattern-matching templates.
Note
The tool proved to be of some worth in this area when I discovered a test-case that
was failing to be streamable, involving the expression xsl:value-of
select="head(/BOOKLIST/BOOKS/ITEM[1]/PRICE/ancestor::*/@*)"
. The issue was that
whilst the static type of head()
was item()?
the instruction
failed streamability (a potentially overlapping sub-tree would have to be traversed
to
determine the textual value, due to the ancestor::*
step.) However if the
type of head()
is inferred to be the same type as its principal argument, in
this case attribute()?
, then that is technically a
childless-node, whose text value can be
retrieved without further movement across the tree. A set of about a dozen functions
(e.g.
subsequence()
) needed such specialist treatment. The XSLT Working Group had
to change the type model to encompass unions of fundamental types (see above) and
classify
functions that used their principal arguments in a transmission usage
(such as head()
) to use such unions.
Resource references
Whilst most of the assessment is carried out in a recursive tree descent/ascent manner,
XSLT (and XPath) constructs can reference non-child resources in three specific ways:
variable / param references, function / named
template calls and template applications. To complete
streaming analysis these references must be examined and require off-tree
mechanisms. We'll discuss each in term:
Variables
Variables can be declared both in XSLT (xsl:variable
and
xsl:param
) and in XPath (let $v :=..
, for $v in
...
) and in both cases the scoping of reference to their
value follows the
following-sibling::*/descendant-or-self::*
compound axis[7]. Processing such references is most simply achieved by
iterating across construct bodies, accumulating maps of processed
variables which are tunneled down through to following-siblings and their descendants.
For
example in assessing static type, sections of the code relating to variables are
approximately:
In the first template, the iteration across the children using
xsl:iterate
accumulates a parameter $variables
as a map which
relates variable name to the in-scope processed variable tree for that name. The tree
value will i) be fully decorated with its properties (in this case including
@staticType
and ii) have had all variables it refers
to in its definition interpolated fully as regards streaming properties. Each child
in
turn is processed with a full current binding of variables passed as a tunneled parameter
$variables
, which can in turn be updated in scope[8].
The second template assesses the static type of a variable as the composite type of
its children, which actually should be null or a singleton - either
the @select
operand, or a single sequence constructor. The final template
shows how the static type is evaluated for a VarRef
construct (the only
construct within the XPath grammar which actually interpolates variable name references),
by lookup in the supplied map.
Functions
Built-in and stylesheet functions are global entities, which may be referenced
(almost) anywhere within the stylesheet tree. Fortunately the streamability rules
only
require knowledge of the type of the result and the required type of the arguments
to
assess the streamability of a call. For built-in functions, the XPath and XQuery Functions and
Operators 3.0 specification uses a definitional XML file
function-catalog.xml
that contains all the data defining each function,
such as signatures, and from which the specification is constructed. The specification
contains a table in Classifying
Calls to Built-In Functions that defines further (usage) properties. By taking a
copy of function-catalog.xml
and adding some minor annotations, we can
construct maps that will both identify type and usage for function calls and their
arguments:
In Figure 20 we show the entry for head()
in
which we have added a transmission
usage property to its sole argument. The
variable $functions
has been constructed from that file as a map keyed by the
function name, each entry containing a map of some properties of that function,
viz. the returnType
and a further map containing
entries for each argument of the last definition (which is usually the most complete),
keyed by position. This map, and others like it are used extensively within the analysis[9]: expanding implicit '.'
arguments for built-in functions (e.g.
name()
being equivalent to name(.)
) is supported by computing
the set of function names for which that is the case as shown
$default-dot-functions
.
Stylesheet functions can be analysed as a global set at top level: type signatures
can
then be recorded as a similar map to that used for built-in functions. Fortunately,
and
certainly avoiding issues of analysing recursive functions, the return type is generalised
as item()*
. Similar mechanisms can be used for named templates.
Applied templates
xsl:apply-templates
invokes pattern-matching processing on each of the
members of their selected sequence. As such complete assessment of their return would
require some indirect assessment. Fortunately as far as streamability is concerned
this is
much simpler – the instruction is assumed to generate results of item()*
type, and the posture/sweep streamability properties can be
determined mostly locally within the xsl:apply-templates
instruction itself:
other templates that may be triggered only have to be assessed as being in a totally
streamable mode-set
.
Assessing sweep and posture
A similar, though much more complex, recursive descent/ascent process is used to determine the posture and sweep. A context posture is propagated downwards, and alters through focus-changing instructions, with generally the context posture for controlled sub-trees being the assessed posture of the controlling operand. The sweep of each of the construct operands is assessed and the composite sweep and posture is then calculated for the ensemble and becomes the sweep and posture for the construct. As much of the analysis as possible is calculated from definition tables described earlier.
The properties are represented on the tree as attributes (e.g.
s:posture="striding"
), so they can be extracted from result through XPath.
But to reduce errors through mistyping, a defined set of global variables, each having
an
attribute()
type and suitable name/value can be defined, such as
$p.grounded
whose value is @s:posture="grounded"
. Moreover, the
most common 'stream failure' results from roaming posture and
free-ranging sweep, so this is abbreviated: $RFR =
($p.roaming,$s.free-ranging)
. Using these variables when setting properties reduces
typing errors extensively, as the complier will of course complain about undefined
variables.
The generic form of processing is a template of the following canonical structure:
Usually all children are evaluated, sometimes when focus changes, altering the
context posture for their evaluation. Then the result is constructed
as a copy of the original node, a series of initial attributes (all existing attributes,
a
hyperlink to the appropriate section of the specification, the context posture etc.)
are
written on by the template expr-init
, followed by the calculated posture and
sweep, again as attributes, and finally the evaluated children are added.
Currently there are 12 primary templates of for assessing xsl:*
XSLT
instructions and 18 for processing xp:*
XPath expression constructs. Many of
these make calls on the General Streamability Rules.
A subsidiary property needed for analysis, usage, described below,
is written on to the tree by push-processing in mode addUsage
, before the main
evaluating is performed, again through mostly push-processing in mode
addPosture
.
Usage
Constructs act as operands for their parents (or sometimes ancestors) and as such the parent can use the information from the operand in several ways, described as the usage property, which again partially quoting Kay, can take the following values:
-
Absorption: the parent expression makes use of information from the entire sub-tree rooted at nodes returned by the operand expression.
-
Inspection: the parent expression makes use of properties of the nodes returned by the operand expression that can be established while positioned at a node's start tag.
-
Transmission: the parent expression returns nodes delivered by the operand expression.
-
Navigation: the parent expression performs arbitrary reordering of the returned nodes, or navigates away from them in arbitrary ways.
As these properties are used quite extensively, the constructs in the tree are
decorated with an @s:usage
attribute in a single pass before posture and
sweep is assessed, by consulting declarations and suitable maps.
General Streamability Rules
Rules for assessing posture and sweep for many of the constructs devolve to some assessment of a set of more general rules with different configured treatments for the individual operands of the construct. For example:
19.8.4.37 Streamability of xsl:value-of
The posture and sweep of
xsl:value-of
follow the general streamability rules. The operand roles and their usages are as follows:
The
select
expression (usage absorption)The
separator
attribute value template (usage absorption)The contained sequence constructor (usage absorption).
The General Streamability Rules, part of which is shown in Figure 5
are a nested tree of calculations and decisions, up to six levels deep, with 25 separate
steps and cases (many of which are applied iteratively to each operand) and an embedded
decision table. To apply these for a given construct we need to i) identify which
of the
child (or sometimes descendant) constructs have influence and what are their mode
of
usage
.
While expanding XPath expressions the role of a construct was attached to the parse
tree as an @s:role
attribute, so simple XPath search can extract necessary
operands. In this case we have already given every node a unique id
so simple
maps of id
and associated property can be used.
The GSR are implemented as a single named XSLT template, which has the following features:
-
The context item is assumed to be the construct element to be assessed.
-
The following parameters can be supplied, or may be derived:
contextPosture
The context posture from the parent
operands
a set of descendant items that should be considered as operands. These can be searched for through a push mode
GS-find-operands
.context-postures
A map of operand
id
s and the associated or required posture for the operand having thatid
. (This becomes useful for complex instructions such asxsl:apply-templates
orxsl:iterate
where the operands can include sections of attribute value templates within child elements, and postures can vary across the set of operands.)children
Possible children of the construct, which will otherwise be evaluated recursively.
The first action in this template is to assess the sweeps of all the operands,
corresponding to step 1 of GSR,
consisting of some dozen tests and a table lookup for each operand. After such processing
each operand element will be decorated with attributes @s:sweep
(a possibly
adjusted sweep), @s:sweepOriginal
(when the sweep was changed by the GSR
rules), @s:usageOriginal
(when the usage has been altered likewise) and
@s:potentialConsumer
if the operand is so judged from rule
1.c
. Parts of the code to do this are:
What Figure 22 is showing is that the code generally tries to
reflect the structure of the rules as laid out in the specification, even using a
map to
represent table 1.b.iii.B
, and with the same abbreviations[10]. Note that the intermediate variable $results
computed for each
operand is typed as a sequence of item()
but is actually a sequence of
map()*
, which can be used to transmit heterogeneous information, in this
case both the properties of direct interest (e.g. sweep) but also the
rule-invoked indicators. Using the XPath simple map operator, we can extract multiple
values through an expression such as $results!.('rule')
.
Note
It would be possible to use the same technique here as is used for posture and sweep
attributes described above, i.e. defining variables which are map entries, e.g.
$map:p.grounded := map{'posture':='grounded'}
, which would reduce the
effect of typing errors.
Assessing the posture is handled similarly, checking the conditions of section 2 over the sweep-assessed operands. Finally the posture and sweep of the construct is determined, written onto a copy of the construct element (together with GSR provenance from section 2) and the children are placed in the new parent.
Evaluating expressions and instructions
Most XPath expression terms can be evaluated with the General Streamability Rules,
suitable usage having been written onto operands from the maps described in section “Declarative tables”. Others require more specialist treatment, such as the
ForExpr
for which the test can be two-stage:
where if the return (the second child) is grounded, the General Streamability Rules
apply, otherwise the construct is roaming
and
free-ranging. The most interesting and perhaps important, are the
AxisStep
terms, which really define movement around the input tree. The
tests (19.8.7.7 Streamability of Axis Steps) are a little more complex and involve six
cases and a tabular form, relating context posture and the axis of travel. Whilst
this
could perhaps be interpreted, in the end an extended xsl:choose
was the
simplest form.
XSLT instructions tend to be more complex in their streamability than XPath
expressions, so while some (e.g. xsl:copy
) are analysed completely with the
General Streamability Rules, many require specialist templates, especially to handle
issues such as instruction configuration elements (e.g. xsl:sort
) and
determining what are the active operands for streamability. To make these somewhat
more
coherent, a number of helper functions are used:
xp:AVT() |
Returns all the nodes of the input that represent attribute value templates in the instruction. |
xp:active() |
Returns all operands of input nodes that are active in
creating a sequence ( |
Similarly, some of the common instruction-varying actions can be cast as templates
in
a specific mode, such as GS-find-operands
, which finds the appropriate
operands for an instruction for application of General Streamability Rules. For example
where for xsl:apply-templates
operands could appear i) as active parts of
the instruction itself (in this case @select
), ii) within active sections of
xsl:param
and xsl:sort
options (either as @select
or sequence constructors) or iii) within attribute value templates
within the xsl:sort
declarations (typically in attributes such as
@order
).
Evaluating built-in functions
Calls to built-in functions (mostly from XPath, but a few, such as key()
,
that are specialist for XSLT) have to be examined in terms of both the streamability
properties of their arguments and the use the function makes of the results of those
arguments. For most functions this can be expressed as an evaluation of the General
Streamability Rules, with suitable usage. In the specification this
is described as a list with an entry of each function using a proforma representation,
e.g.
which indicates that the arguments
have usage navigation, absorption and
inspection respectively. Currently these usages are written onto
the arguments of the function definitions in fn:fold-left(N,A,I)
function-catalog.xml
whence they
are converted into maps as described in section “Functions”, but in theory these
properties could be read from the specification itself. For some dozen functions (such
as
last()
) there are specialist rules - these are handled by simple templates:
where $RFR
denotes roaming and
free-ranging as described earlier.
Determining guaranteed streamabilty
Finally all the components of a top-level xsl:stream
or
xsl:template
have been evaluated against the rules and any sequence
constructor can be examined for a grounded posture and a template
@match
checked for a motionless sweep. The element is
marked @s:streamable
with the boolean satisfaction of these conditions,
whence the analysis of that sub-tree is complete. Later display can give visual
indication. For templates of course they form a modal group which might be invoked
by
other templates - the mode is marked streamable (in a map) only if i) the mode is
declared
streamable and ii) all templates within that mode are themselves proven guaranteed
streamable[11].
Interactive Display
Once the analysis has been completed, the result needs to be displayed. As all the information is attached to the expanded XLST/expression tree as namespaced attributes, we could either i) display the original program in some serialised form and arrange some linkage from that serialisation to appropriate points in the 'shadow tree' or ii) display the whole expanded tree including properties as a serialisation and selectively display desired sections. The second appeared to be the simplest route, albeit at the cost of a very large serialised form even for modest programs.
Serialising to HTML
The tree is serialised to be displayed within a pre
section of the web
page, by a specialist XSLT-coded serialiser with the following features:
-
As the tree is well-formed XML, and space is at a premium, indentation is strict and closing tags are omitted.
-
Line numbering follows the document order of elements in the original XSLT and is displayed at the start of each corresponding line in the serialisation.
-
A fold/unfold group (as a
span
containing two span-contained images, only one of which should be visible) follows for any element in the result tree that has children. -
The element name and primary attributes are written surrounded by classifying
span
s - some note of line length is considered and line-breaks can be interpolated. Some names are shortened and the information is attached to the span class, e.g. thexp:
prefix dropped from XPath constructs. -
The span of elements decorated
@s:streamable
(i.e.xsl:template
,xsl:stream
) are classified asxslstreamyes
orxslstreamno
as appropriate. -
All the streaming properties follow, each with an enveloping
span
and differentiating classes: e.g.class="streamProperty posture"
. Name/values for these properties are simplified, e.g.s:posture="grounded"
displays asp:grounded
. -
Specification hyperlinks are cast as
a[@href]
elements around the appropriate display text. -
All the children of an element exist within a
span class="XMLBody|XSLBody"
on the line following the element head. Thus when folding an element thisspan
is set tostyle="display:none"
. This provides a consistent model for fold/unfold, albeit at the cost of an additional nestedspan
for every element.
The styling is defined by a CSS stylesheet that exploits these classes, starting with
the display
styling all XPath expansions and streaming properties set to
none
[12].
Interactivity
The selection of stylesheets to analyse and examine is a simple use of forms and server response, piping the serialised response HTML into a given target frame. More interesting is the interaction within the analysed stylesheet. There are four types of interaction, three of which are implemented by triggered templates in a simple Saxon-CE executed XSLT 2.0 transform.
Fold/unfold |
The fold/unfold buttons are intercepted by |
Expanding XPath expressions |
All displays of attributes that contain XPath expressions (and
attribute value templates as well) are classed as
|
Displaying streaming properties |
Changes in the state of the check-boxes (which are computed as part of the
returned HTML for an analysed stylesheet) are recognised by |
Specification hyper-linking |
As the result HTML has already been decorated with |
The point to note here is that all the decisions of display
this line?, wrap onto a new line?
are performed by the browser, requiring a
minimalist approach from the analysis tool itself.
Conclusion
This paper has described a tool that performs a very detailed and exhaustive analysis of the streaming properties of an XSLT program, and displays the results in a form where a human designer might be able to examine these properties to either i) understand why a program cannot stream or ii) get a better feel for the interaction between XSLT instructions and XPath expressions and streaming behaviour.
But it could also be considered as an example of analysing a program (or other data structure) for certain properties within an XML-based framework, using a tree-based 'parsing' of the program as the main data structure, adding properties as attributes and processing the tree in a generally top-down recursive manner[14]. Thus some of the lessons from this tool might be pertinent to other situations, which are normally the province of opaque compilers, such as reachability analysis.
Doing what the compiler (usually) doesn't
A comformant XSLT compiler supporting streaming of course has to apply these rules
to
check streamability[15] but does not have to explain why a construct cannot be streamed. This tool
effectively animates the analysis leaving a trail both of its conclusions, in results
and
intermediate data, and pointers to the relevant rules that were applied, making it
less of a
hit-and-miss affair for the designer to acheive his streamability goals. Of course
a
compiler could do similar (e.g. Saxon has an -explain
option that displays the
optimised execution plan) but this tool does this independently of any implementation,
as
the rules are strictly part of the specification.
Controlling the volume of displayed data
One of the problems was the sheer amount of data to be viewed/displayed – some 8-10
additional properties, stored as attributes for every construction element, both instruction
and XPath expression. One option was to display all the properties for a single given
element at a time, perhaps on a status bar, or a popup, but much of the understanding
of the
streaming rules in action comes from examining the properties of all the operands
of a
construct as an ensemble, together with the relevant rules. I chose to enable entire
classes
to be viewed selectively – it certainly permits one to view reverse-cascades
of usually roaming
posture propagating from some errant action.
Alternatives, that could be programmed relatively easily using more detailed Saxon-CE interaction, could show properties for a small portion of the tree at a time (e.g. selected element, direct children and a limited number of ancestors), or even explore graphical symbology and other shorthand forms.
Who & how best to process XSLT?
The problem involved analysis of a program most of whose components are written in XML. Thus an XSLT enthusiast, such as the author, would reach for that tool as the primary instrument. The fact that the program to be analysed was itself XSLT caused very few problems, and made several areas easier.
The first issue was how to analyse the XPath expressions. Initially I chose to add
an
expansion to Saxon to exploit its Expression.explain()
method to generate a
parse tree, but after some success it became clear that a lot of rewrites that Saxon
was
doing internally needed to be undone
or otherwise modified to get to the
constructs that streamability required. Then a switch to a parser, written in XSLT
and
generated by REx, made the situation much clearer and the whole analysis
solution could be written entirely in non-extended XLST 3.0[16].
Probably the most useful lesson is that a simple variant of the source XSLT program, held as an XML tree, can act as its own parse tree, which can be traversed, read and decorated entirely by XSLT programs. Of course properties have to be capable of being grounded to effective strings to attach to tree element nodes, which was possible even at a stretch when the type model moved to a union type. With with a suitable system of indexed map structures held within the analyser, and keyed through unique ids, even this restriction might be overcome.
A syntactically coherent specification
Early on in the development it became imperative that there should be some means of
finding the correct place in the XSLT specification to examine constructs that were
being
evaluated, if only for debugging the tool itself. Whilst much of the specification
is very
richly hyper-linked internally, there was still an enormous amount of scrolling around,
losing a place, having to revert to and search through the table of contents (which
on the
author's browser occupies some 14 pages
) all the while trying to retain a
mental (stack) model in one's head. Could we build a hyperlink from say an
xsl:value-of
element to the relevant section of the specification?
Most fortunately, but certainly by design, many of the sections (in this case
19.8.4.37
of the December 2013 Working Draft) had hyperlink anchors which
were extremely coherent, quite fine-grained and followed the scheme
#streamability-xsl-
{local-name(.)}
, in this
case #streamability-xsl-value-of
. The exceptions (which were often to a parent
category) could be handled by attributive declarations in a table, such as shown in
Figure 17. In the case of the General Streamability Rules, the
combination of nested lists and a table required preprocessing to bury a series of
anchor
points (and div
groupings) to support display.
Equally, the existence of and access to, even more definitive documents behind the
specification, such as function-catalog.xml
, meant we could latch onto and use
definitive information, avoiding transcription errors and making it possible for the
tool to
track eventual changes in some cases automatically.
Problems
Apart from the size and apparent complexity of the streamability rules themselves, a small number of other problems appeared including
-
the size of the resultant fully-serialised output
-
debugging the tool itself
Whilst the analysis of streamability for a modest stylesheet is relatively quick (seconds or less), the serialisation, and particularly the browser display of that serialisation can be lengthy. A glance at Figure 11 shows that each streaming property takes about 100 characters to define for display in the current serialised form, which means each construct (instruction, expression term) takes about 1kB to display. (The serialisation of the example in Figure 1 is just under 14kB long.) Clearly there could be economies in the terms used (class names are over-generous for example) and some server-performed compression of the results and associated CSS might improve matters by perhaps a factor of three. With some more active participation from the Saxon-CE stylesheet, such as encoding/decoding property values perhaps an order of magnitude could be gained[17].
The simplicity of keeping all the analysed data on the XSLT tree and then serialising
the whole was very effective in getting the tool developed, but an alternative would
be to
split off the data just before serialisation and provide that as a separate
id
-mapped data structure that the Saxon-CE stylesheet would use.
Unsurprisingly early debugging of the tool wasn't extremely easy, as the sheer volume
of
data and especially the O(100) different types of instructions and expressions made
focussing on a single problem difficult. Building the tools as a number of phases
helped
(e.g. expanding all the XPath trees, marking the focus control, calculating the static
type)
so that successive phases could be checked to mostly
work[18]. However once a certain stage was reached a lot of the further development could
be carried out merely by altering or expanding tables.
Acknowledgements
Apart from funding the venture, Mike Kay set me a very interesting and what turned out to be a surprisingly complex challenge, but one of XSLT working on XSLT that I very much enjoy. Florent Georges deserves thanks for the excellence of, and excellent support for, the Servlex webapp delivery mechanism which made mounting the tool as a web service comparatively plain (all XML) sailing, with no PHP or Java required! O'Neil Delpratt was his usual cheerful self in helping get the tool installed on Saxonica servers.
Quo vadis?
During the writing of this paper it became clear that more of the tool's behaviour could be derived from direct interpretation of some sections of the specification, thanks mainly to the coherence in its logical structure. This would increase the tolerance of the tool to (modest) changes in the specification. Changes to the serialisation and storage outlined above could be helpful also.
Whilst the original intention of the tool was to enable a given stylesheet to be
analysed, perhaps its future value is really in being able to increase understanding
of what
the essence of the streamability rules is, by providing a highly detailed
walkthrough
of their application on fragments of XSLT of interest. It might
also be interesting to see whether a rudimentary expert system
, in the form
of patterns matching some design metaphors (e.g. adding a copy()
action to
avoid returning nodes from the input) might be able to suggest alterations that might
enable
streamability where originially this cannot be guaranteed.
References
[Braaksma1] Braaksma, Abel: XSLT 3.0 Streaming
for the masses
[online] XML Prague 2014 proceedings, pp29–80,
http://archive.xmlprague.cz/2014/files/xmlprague-2014-proceedings.pdf
[Braaksma2] Braaksma, Abel: Streaming Design
Patterns or: How I Learned to Stop Worrying and Love the Stream
[online] XML London
2014 proceedings, pp24–52, doi:https://doi.org/10.14337/XMLLondon14.Braaksma01,
http://xmllondon.com/2014/xmllondon-2014-proceedings.pdf
[Kay] Kay, Michael: Streaming in the Saxon XSLT
Processor
[online] XML Prague 2014 proceedings, pp81–102
http://archive.xmlprague.cz/2014/files/xmlprague-2014-proceedings.pdf
[REx] Rademacher, Gunther: REx Parser
Generator
[online] http://www.bottlecaps.de/rex/
[Servlex] Georges, Florent: Servlex: (: Web
Applications and REST Services Framework for XQuery, XProc and XSLT. :)
[online]
http://serlvex.net
[1] Applicable templates must be invoked in a mode that has been declared to be
streamable (xsl:mode name=".." streamable="yes"
) so the set of templates to
be examined is restricted.
[2] The author was present when the XSLT Working Group analysed 'by hand' (and conference call) the streamability of a 5 instruction stylesheet, with XPath expressions perhaps 4-5 terms deep. They almost managed to complete the process in about 50 minutes.
[3] We could have used xsl:sequence
which in XSLT3.0 can contain a sequence
constructor (!) but placing it in a separate namespace makes the implementation
tidier.
[4] Multi-line expansion and indentation is shown for clarity, but the single line is
actually flat
.
[5] At present relative indirect stylesheets (e.g. xsl:include
href="more.xsl"
) are not supported, as of course the server cannot
request from the client file system, though 'web-accessible' links could be
followed. A system where all the stylesheets were web-accessible could be
developed easily.
[6] It didn't quite appear that the table was itself regular enough to derive this data from it automatically, but perhaps I should have persisted.
[7] Within our expanded XSLT trees all references will be through elements such as
VarRef
- even text-value-templates will have been
expanded into element trees.
[8] Whilst this of course can be processed using a recursive template, using
xsl:iterate
introduces much more coherence in what is essentially a
contained tail-recursive iteration. Using a high-order function such as
fold()
isn't terribly practical when XSLT instructions predominate.
Equally the immutable map()
of XSLT3.0 makes tracking variable scoping
vastly easier than alternative methods.
[9] It is tempting to see whether the table Classifying Calls to Built-In Functions in the specification is regular enough that the usage can be extracted automatically. On the other hand one can argue that fundamental properties of the function, such as usage, belong in the definitive catalog.
[10] It would be comparatively trivial to parse that map from an even simpler representation, possibly even from the specification itself.
[11] Mutually interacting streamable modes are not supported in this tool.
[12] For large stylesheets 'top-level' constructs (e.g. xsl:template
) could
of course be defined to display in a folded state. The streamability of such entities
would still be visible as red/green backgrounds of course.
[13] Under some arrangements with all being displayed in a single tab/window, it is
possible to enrich the explanation of application of the General Streamability
Rules, by highlighting all the rules that were relevant to a particular case. This
requires i) div
grouping of sections of the specification GSR (which
can be done automatically) and ii) altering the display properties of these
div
sections. Unfortunately this appears not to be possible (as one
would wish) between different tabs or windows within a browser....
[14] A critical requirement might be that referential mechanisms and dependencies (e.g. variables) follow descendant or following-sibling scoping.
[15] Actually they are permitted to extend the cases in which they can stream, but they must support cases which are guaranteed streamable according to the specification rules.
[16] To be fair, Mike Kay had suggested REx as a possibility in the
initial project outline, but the author had had some experience, before working with
Saxonica, on using Expression.explain()
, so that was in the first design.
[17] Technically we might be able to implement the analysis tool in XSLT2.0 and hence consider a total Saxon-CE in-browser solution, but the use of XSLT3.0 facilities (especially mode declarations and maps, as curiously HOFs were confined to a very small number of cases) makes the development very much more straighforward. Of course anyone using streaming must be using XSLT3.0 anyway.
[18] The main phases expand / focus control / static type / usage / posture & sweep
were pretty separable until the new U-type model introduced usage-dependent
type
to complement the existing type-dependent usage
. At this
point some sections of the type and usage phases had to be merged.