Lumley, John. “Analysing XSLT Streamability.” Presented at Balisage: The Markup Conference 2014, Washington, DC, August 5 - 8, 2014. In Proceedings of Balisage: The Markup Conference 2014. Balisage Series on Markup Technologies, vol. 13 (2014). https://doi.org/10.4242/BalisageVol13.Lumley01.
Balisage: The Markup Conference 2014 August 5 - 8, 2014
A Cambridge engineer by background, John Lumley created the AI group at Cambridge
Consultants in the early 1980s and then joined HPLabs Bristol as one of its founding
members. He worked there for 25 years, managing and contributing in a variety of
software/systems fields, latterly specialising in XSLT-based document engineering,
in
which he subsequently gained a PhD. He is currently helping develop the Saxon XSLT
processor for Saxonica.
Determining streamability of constructs in XSLT3.0 involves application of a set of
rules that appear to be complex. A tool that analyses these rules on a given stylesheet
has
been developed to help developers understand why sections which were designed with
streaming
might fail the required conditions. This paper discusses the structure of this analysis
tool. The development was funded by Saxonica.
XSLT has gradually developed over the past decade from a 'browser-based' document
transformer/generator into a fully-fledged, industrial scale, functional processing
tool,
aimed at XML documents. As such its 3.0 version has introduced support for
streaming processing of very large documents. To do this effectively, the
XSLT specification has had to outline a very extensive, very detailed, and ostensibly
very
complex, set of rules defining the conditions under which a given program can be guaranteed
to
be processable in a streaming manner.
This paper describes an interactive static analysis and display tool that can be used
to
examine the evaluation of these rules on (fragments of) a given XSLT stylesheet, so
developers
of processes intended for streaming may understand better whether their programs can
indeed be
processed in a streaming fashion, and if not, perhaps why not. This tool also includes
active
linking to appropriate sections of the specification to further increase comprehension.
As the tool was being developed while these rules were being developed, proven and
changed
by the XSLT Working Group, it was important that the tool itself could be flexible
to changes
in these rules. [See section “Type model” for a very good case of this happening.]
Accordingly, the tool makes significant use of declarative descriptions of sections
of the
rules rather than direct code, some defined directly within the tool, some as external
data
files and some extracted from the specification itself, or other parts of the specification
definitional framework.
This paper is organised as follows:
The model for streaming in XSLT 3.0 is presented and the specification-defined rules
are discussed briefly.
The tool is illustrated in action on a simple example taken from the XSLT
specification.
The basic overall design is discussed: analysing the streaming properties, displaying
the results and delivering the tool as a web service.
XPath expressions embedded in XSLT programs need to be expanded into their parse trees
for analysis – the model for doing this is detailed.
The model for evaluating the streamability rules is discussed in some detail, with
all
the streaming properties being attached to the stylesheet tree as attributive
properties.
Interactive display of the results involves serialisation of the annotated result
tree
as an HTML page, with styling through CSS and interaction through a Saxon-CE supported
XSLT2.0 stylesheet.
A final conclusion presents lessons for the development of this type of analysis tool
within the XML world.
The entirety of the tool is written in XSLT3.0 and of course is about manipulating
XSLT3.0. Thus a reasonable knowledge of XSLT3.0 (or at least 2.0) is assumed on the
part of
the reader.
Note
This paper discusses interpreting the rules for guaranteed streamability within XSLT,
but through the imperfect eyes of the author and the potentially fallible medium of
a
program that ostensibly follows those rules. In particular the samples and examples
in the
paper date mainly from the Working Draft specification of December 2013 and several
changes
have occurred since, especially in a new static type model. The definitive guide
is the specification itself, which is, and always will be, the ultimate
arbiter. The author assumes no responsibility for errors (positive or negative),
omissions and shortcomings. If in doubt read the 20,000 words yourself!
Major points
The development of this tool is based on three major points, which can be relevant
to
any similar system that needs to analyse properties of an XSLT program:
The entire coding can be, and perhaps is best, written in XSLT3.0
The best data structure to hold a lot of varied information about an XSLT program
is
the tree that defines the program itself.
Declarative structures (tables, trees, even parts of the specification) can be used
effectively to either i) be interpreted to evaluate effect, or even ii) compiled to
program sections, or an intermediate computational form such as maps.
Streaming in XSLT3.0
One of the design goals in updating XSLT from version 2.0 to version 3.0 was support
for
processing very large documents – XML documents whose complete tree
descriptions could not fit in memory, and for which processing or generation of output
would
have to proceed before all input had been read. The full details of the design chosen
can be
found in the Streaming and
Streamability sections
of the XSLT3.0 specification. Two papers on streaming presented at XML Prague 2014
(Braaksma1, Kay) give much more detail.
The basic approach chosen is to declare that a given document should be processed
in a
streaming manner by using the <xsl:stream
href="doc"> instruction which
processes the (XML) data of the given document according to the instructions supplied
as
children of the xsl:stream and returns the result. The essential issue is whether
the instructions requested as a set can process the document without having to either
i)
collect and store the whole document to produce the result or ii)
back-up to parts of the document before the
current context node.
The model used is to examine two contextual properties of the instructions:
posture and sweep, and determine whether the
sequence constructor (the sequence of contained instructions) of the xsl:stream
has a grounded posture. If so then the processing of the xsl:stream
is guaranteed streamable and a compliant streaming XSLT3.0 processor will
process the document in a streaming manner. Such analysis is completely static and
can be
performed either at compile time (which is what a compiler will need to do) or in
a separate
phase of static analysis and display, which this paper is about.
Partially quoting Kay, these two properties are functions of the
construct itself (an XSL instruction, an XPath expression term or a function call),
the
context posture (i.e. the posture generally of the 'parent') and sometimes the data
type. They
have the following meanings:
The sweep of a construct
The sweep indicates how much of the input document is needed to
evaluate the construct. The values are
Motionless: the construct either doesn’t look at the input
document at all, or it only needs to look at the place where the input document is
currently positioned.
Consuming: the construct needs to read everything between the
current start tag and the corresponding end tag
Free-ranging: the construct potentially needs to read outside
the slice of the document represented by the current element and its ancestors.
The posture of a construct
The posture is concerned with determining whether an expression
returns nodes from the streamed input document, and if so, where these nodes come
from.
There are five values:
Grounded: this means that the expression doesn't return nodes
from the streamed input. It either returns atomic values (or function items), or it
returns nodes from non-streamed documents only.
Striding: this means that the expression returns a set of nodes
from the streamed input document, in document order, and that none of these nodes
will
contain another node in the result (none is an ancestor or descendant of
another).
Crawling: again, the expression returns a set of nodes from the
streamed input document, in document order, but this time some of the nodes may be
ancestors or descendants of others.
Climbing: The specification assumes that when an input document
is streamed, a stack of information is retained containing details of the names and
attributes of all ancestor elements of the element at which the stream is currently
positioned. Any expression that accesses ancestor nodes or their attributes from this
stack has a posture of climbing.
Roaming : This indicates that an expression navigates off to
parts of the document that aren't accessible when streaming, such as preceding or
following siblings.
Streamability rules
The specification provides a very detailed and very large set of
rules for determining these properties for a given construct in a given situation
within an
XSLT stylesheet. (To give a sense of the size, the streamability rules take approximately
20,000 words of the 160,000 in the specification's main body, and the section on streaming
itself another 6000 words.) The detail is necessary to ensure that simple constructs,
which
at first glance should be streamable, actually are – a highly conservative simpler
set of
rules would exclude many common cases.
Note
Whilst these rules are intended to be complete, Braaksma2, presented
at XML London , gives a more informal set of guidelines, intended to support designers
creating or refactoring their code for streaming.
These rules split into four general categories: i) a set of General
Streamability Rules (usually abbreviated to GSR), ii) a set of specific rules
for every XSL instruction, iii) rules for each XPath expression term and iv) rules
for all
built-in XPath functions. Examples of these rules will be given in later sections.
To analyse the streamability of a given xsl:stream instruction it is
(usually) necessary to recursively apply these rules to every construct
contained within (every XSL instruction, every XPath term), and in addition any 'external'
resources, such as xsl:template or xsl:function (and their
definitions) that may be invoked.[1]
It is anticipated that developers who are designing streamable transformations, will
acquire a sense of the spirit of these rules, but to start may have to work through
the
rules in detail on a given problem. Whilst these calculations can be performed 'by
hand', it
can be somewhat tortuous, and slow, involving very deep recursions and much scrolling
back
and forth through the specification[2].
To assist in such early study of streamability, the author has built a tool, funded
by
Saxonica, to perform such analysis on a given stylesheet and display the results in
a form
that the intermediate properties and the relevant rules can be explored interactively.
The
rest of this paper is about the structure of this tool.
Note
The tool is at the time of writing available at Saxonica Community: Streaming
Analysis which is intended only to analyse small single stylesheets (no support
for inclusion), or some of the specification and W3C test-case examples. Saxonica
Ltd
reserves the right to withdraw this service without notice and makes no guarantees
as to
the veracity of the results.
A Quick Tour
The tool is controlled by and presents its results as an XHTML web page, which is
usually
connected to a server providing the analysis operation. It's best to start off with
a quick
picture of what the tool provides, operating in this case on one of the examples from
the
specification:
A stylesheet can be uploaded to the analysis tool, or as in this case, a pre-loaded
example taken from the specification is selected. A serialised version of the source
is
displayed, with styling, line-numbering and fold/unfold controls. The
xsl:template and xsl:stream have green backgrounds as the analysis
has concluded that they are guaranteed streamable – if this were not the
case they would have red backgrounds. The implicit sequence constructors have been
displayed
explicitly. Many of the elements of the serialisation are sensitive to mouse-click...
By clicking on the two XPath-containing attributes (@match and
@select of lines 2 and 5 respectively) we reveal the full trees representing
the parsing of these expressions, whose properties will become crucial in determining
streamability. By selecting amongst the check-boxes we can then show some of the streaming
properties that have been calculated:
Here we have chosen to display both the role of each of the XPath
expressions (whose importance will be explained later), and the calculated static
type and posture for each element in both XPath expression
and XSLT instruction constructs. These are displayed in distinctly shortened and styled
forms,
as if they were attributive properties of each element.
Not only is it useful to display the calculated property, but it is also exceptionally
helpful to understand why it has that value. In this case we have displayed the sections
of
the General Streamability Rules that were triggered, if those rules were used
on that particular construct. If we click on one of these decorations a subsidiary
browser
window or tab shows the first section of the rules which were applied:.
Here case 1.b.iii.A.II was appropriate (the usage wasn't
modified). Other relevant portions of the specification can be displayed in a similar
manner –
clicking on the AxisStep element brings up the specification-defined process for
determination of the streamability of such an expression:
The rest of this paper looks at the details of the design of the tools to achieve
these
effects.
Basic Design
The tool splits broadly into three sections: i) determining the appropriate streaming
properties for all nodes on a stylesheet 'tree', ii) preparing an interactive display
of the
stylesheet where these properties can be examined and iii) combining these in a web-server
such that stylesheets can be uploaded and interactive result web pages returned. With
the
exception of the web-server package deployment, the analysis tool is built entirely
in
XSLT3.0, generating an interactive display result which is a combination of (X)HTML,
CSS and
XSLT2.0 delivered using Saxon-CE.
Determining streaming properties
To analyse the streaming properties for a given stylesheet, we at least need to
recursively descend the stylesheet tree from any xsl:stream instructions, or
xsl:template nodes that can be invoked in a streamable mode, and calculate
these properties based on a contextual state, the specific construct and most likely
the
properties of its children, hence the deep recursion. Not only does this process have
to
involve XSLT instructions, it must also involve XPath expressions contained within
various
attributes of those instructions, as they are the mechanisms whereby XSLT selects
data nodes
of interest, and their behaviour in 'moving around' the data tree is critical to
streamability. In effect, for purposes of streamability, the XPath expressions (which
can be
described from their parse trees) are tree-extensions of the main stylesheet, albeit
technically anchored through attributes rather than elemental children.
This process starts by producing a modified version of the XSLT tree, in which
additional sub-trees describe these XPath expressions and which has explicit sequence
constructors. For example the template:
is transformed into another tree with additional children and attributes:
The sequence constructors have been made explicit as s:sequence-constructor children[3]. If, for example, the xsl:for-each select="book" had contained an
xsl:sort directive, this would not be contained within
the sequence constructor pseudo-child. The XPath expressions contained within attributes
have been placed as children in the xp: namespace, each having an attribute
(@s:role) describing the role it takes within the containing instruction,
using a simple naming scheme for identifying expressions contained within attribute
value
templates. (The @l:no attributes have been added to denote original source line
numbers, for use in eventual display.)
With the expanded XSLT tree, a top-down analysis of the streamability is performed,
by
evaluating the extensive rules outlined in the specification over tree nodes and their
children. The results are returned as a copy of the input tree decorated with a series
of
attributes describing the streaming properties of each node. The inner
xsl:result-document sub-tree of our example becomes:
(The details of these annotations will be described later.)
Displaying the results
At this point, technically the analysis of the streamability is complete – examining
these streaming properties can assess whether an xsl:stream instruction is
indeed guaranteed streamable. However to be useful these results should
be displayed in a meaningful way, and with some interaction to restrict the almost
certain
information overload. The basic approach is to convert the result tree into a serialised
HTML pre, which is styled through CSS, and where the visibility of various
sections can be controlled interactively, in this case using Saxon-CE transforms attached
to
callbacks. The xp:FunctionCall[@name='position'] shown highlighted in Figure 9 is displayed as a line:
which is actually represented as a structure within pre as[4]:
The stream span contains all the streaming properties, each being styled
and differentiated via CSS through the @class attribute. Interaction through
the tool check-boxes toggles the @style between display:none and
display:inline, thus revealing or concealing the properties. The
generalOperand class contains a hyperlink to the section of the General
Streamability Rules that was used when treating this construct as an operand of its
parent.
How these decorations are added is described later.
Delivering the service
The analysis tool is an XSLT transform that delivers a modified XSLT tree. The display
generator is another transform that takes that tree and delivers an (interactive)
HTML page.
These could be combined into a single package, but we have chosen to implement this
analysis
as a web service, permitting stylesheets to be uploaded for examination. To do this
we've
used the Servlex webapp package tool . The general architecture of the
delivery platform is:
Apart from delivery of resource classes (e.g. *.html,*.css and
*.js, which includes the Saxon-CE 'compiler'...), three principal messages
are processed by the web package:
analyze.html
Executes an XSLT transform that generates the main tool page, including collecting
all the preloaded examples and forming drop-down selectors to choose them.
analyzeStream
Is accompanied by the upload of the source stylesheet[5]which is processed for analysis followed by serialisation of the result
into an interactive web-page, which displays in a frame in the main tool.
XSLT3.0-Spec
Generates an annotated version of the current W3C specification, mostly in terms
of labelling all the cases in the General Streamability Rules so they
can linked to by fragment identifiers (e.g. #gsr-1.b.iii.A.II). The
specification is displayed in a separate tab or window for ease of use. (The annotated
specification should be stored in the Servlex web cache, so this regeneration should
be infrequent.)
This delivery mechanism will not be described further in this paper, save that we
found
Servlex to be an excellent vehicle for constructing such a service.
Limitations and assumptions
The tool assumes of course that the stylesheets are well formed XML, and syntactically
correct XSLT, or more correctly only analyses them on the basis that they are syntactically
correct. Little error checking is performed.
Expanding XPath Expressions
To analyse XSLT streamability it is necessary to examine from where in the XML input
tree
data is being collected by stylesheet instructions. These operations are obviously
described
as XPath expressions, which can be highly compound in nature, such as mixing searches
along
different axes (child, ancestor, following etc.), predicates and a number of built-in
functions. Analysis of streamability has to examine the structures of these expressions
- the
rules are described with reference to the EBNF grammar for XPath, defined in XPath 3.0 Grammar. Thus for our
purposes it is most convenient to generate parse result trees corresponding to the
XPath
expressions contained in attribute values (e.g. @select) or attribute value
templates (e.g. href="example{position()}.xml").
As the analysis is being performed in an XSLT environment, such parsing can be performed
most conveniently with a parser itself written in XSLT. Luckily the REx
parser generator can be configured to generate parsers in several languages, including
XSLT,
which can both test against a grammar and built a result XML parse tree. REx can indeed
generate a working XPath 3.0 parser to run in XSLT. The nub of the expansion is shown
in Figure 13 :
xpath3.0-parse.xslt links to the transform that has been generated by REx.
xp:parse.xpath() performs some tactical rewriting of an XPath string (see section “Preprocessing and rewriting”), calls the REx-generated parser and does some post-processing
(namespace remapping, collapsing of singleton leaf sub-trees, etc...) before returning
the
completed parse-tree. The operand mode generates a child element containing that
XPath parse tree, identified with the role of the expression (in this case the attribute
name)
- this role will be used in later operations to identify different instruction-specific
treatments as far as effect on streaming is concerned. These operands trees are generated
from
a main template shown in Figure 14, where not only are common XPath
carriers (e.g. @select) processed, but also attributes that are identified as
containing attribute value templates, using the predicate test function
xp:is.AVT().
As well as generating the operand trees, this template also:
processes instructions that have an implicit selection role, such as
xsl:apply-templates[empty(@select)] or xsl:next-match to add
the implicit context XPath expression tree,
assigns a line-number-recording attribute, and
collects all the contiguous elements and text nodes of the sequence constructor(s)
together under s:sequence-constructor elements. The function
xp:is.sequence-constructor() provides a suitable test – elements which are
configurations or parameters of the instruction, such as xsl:param or
xsl:sort, return false().
Inclusions
Stylesheets often include resources from other stylesheets, using
xsl:include and xsl:import redirection instructions. For
purposes of streamability analysis they can both be treated similarly (implicit match
priorities are immaterial) and their document bodies are expanded as children of the
instruction. As far as this analysis is concerned, templates, functions and variable
directly within such inclusions are considered top-level to the outer
stylesheet. (The web-delivered service cannot process relative inclusions from uploaded
stylesheets.)
Applying Streamability Rules
With the complete expanded stylesheet we have all the necessary program information
to
commence the streamability analysis. Whilst the rules are written recursively
top-down, the author found it helpful to split the process into three
sequential phases during which the tree is modified: required functionally equivalent
rewrites
of some expressions to ensure possible streamability, determination of context focus
and
construct type, followed by assessment of posture and sweep.
Preprocessing and rewriting
There are a number of (equivalence) rewrites defined in the specification that are
required to either i) generate a canonical form or ii) make common constructs streamable.
Some of these are most conveniently applied as textual rewrites to the original string
(e.g.
// -> /descendant-or-self::node()/ ). Others are best applied as
rewrites on the tree, such as Figure 15 where the treat as
expression, forcing document-node() type, has been parsed to a tree.
Declarative tables
While the tree modifications described in this section are actually carried out by
sets
of XSLT templates and functions, as much use as possible has been made of declarative
tables
that define appropriate properties, that the XSLT can interpret to process sections
correctly. Using such tables increases flexibility and coherence extensively, collecting
all
relevant properties together in one place and often making some changes merely altering
the
value of an attribute.
These descriptions for some XSLT instructions describe i) if they are focus-changing
and
if so, which of their operands control and are controlled by the change, using an
order or
simple proforma, ii) a static type for the instruction, if it is independent of that
determined from context or children and iii) a hyperlink to the relevant streamability
specification section if it is not in the canonical form (e.g.
#streambility-of-xsl-copy). This description is used to produce a series of
maps relating instruction name to property such as $spec-ref(),
$staticTypes() that are used within XSLT processes described later.
For XPath expression constructs we also describe the usage using a
proforma derived from a table (Classifying Expressions) within the specification [6], and membership of a choice group of some of the operands.
For example, the IfExpr usage is defined to be that the operand having the
if role has an inspection usage, and both the
then and else operands have transmission usage, as
well defining that the then and else operands constitute a
choice-group (which effectively means that only one of
them, not both, must read the input stream).
Type model
For some constructs the static type is needed to assess
streamability properties. [The most common case is assessing the value of a node which
is
known to be childless, such as an attribute or a text node - in this case no subtree
has
to be traversed to derive the complete string value.]
Initially the analysis model used a type hierarchy, which for streaming could be
somewhat coarser than can be strictly assessed - all XSL instructions were assessed
as
having static type item()*, whereas a finer granularity was available, but
not needed, for the streamability analysis. The type hierarchy was defined for the
analysis tool by a tree:
The most common operation required using this type hierarchy was to assess a composite
type for a sequence of operands, calculated as the narrowest type in the type hierarchy
which is the type or super-type of all members of the sequence. This was most readily
assessed using a precomputed map of maps
xp:least-common-supertypes($type0)($type1) derived from this tree of types.
Subsequent detailed study (see note in the next section) revealed that a more general
model involving union of types was needed. Thus between the first submission of this
paper
and the final publication the type model migrated to a U-type where
types were classified as a partial union of 28 fundamental types (7 nodal, e.g.
element(); 19 primitive atomic, e.g. xs:string;
function() and xs:untypedAtomic) A sequence is an instance of
a U-type U if every item in the sequence is an instance of one of the
fundamental types in U, considered as a set. For example, the
sequence (23, "Paris") is an instance of the U-type U{xs:string,
xs:decimal, xs:date} because both items in the sequence belong to item types in
this U-type. Shorthand forms for common groupings were defined, e.g.
U{N} denotes the union of all the node types.
Luckily the tool could migrate relatively smoothly, by representing a U-type as an
order-insensitive sequence of the constituent fundamental types as xs:string*
(and stored as an attribute value as a whitespace-separated string, that can easily
be
tokenised back to a sequence), with a small number of additional helper maps and
functions, such as $uTypes('N') and xp:union-type($types as
xs:string*). Some of the special cases for expressions, instructions and
functions had to be altered to use these type-determination functions rather than
those
using the type-hierarchy tree.
Context focus and type
Whilst posture and sweep are the main
properties to be analysed, two other subsidiary properties need to be assessed:
static type and control focus. Whilst this could
be achieved contemporary with the posture/sweep analysis, it is somewhat clearer,
and
certainly easier to debug, to carry this out as a recursive descent/ascent pre-pass.
Certain instructions and expressions change the context focus for evaluation of their
children. For a simple example, xsl:for-each obviously can (and almost
invariably does) change the sequence of context nodes for evaluation of its descendant
instructions. An xsl:for-each is said to be focus
changing, its @select expression (which of course is represented as
an expression tree identified @s:role="select") is said to be
focus-controlling and its sequence constructor is
focus-controlled. These are identified on the tree through attributes
s:focus="change|controlling|controlled" respectively.
During this pass it is also possibly to analyse static type, propagating a
context item type downwards (as a tunneled variable), changing it
through focus-changing instructions, where generally the context item type for the
controlled children is that of the assessed static type of the controlling (child)
operand.
For example the sequence constructor of xsl:for-each select="amount" will have
a context item type of element(amount)* as that is the assessed static type of
the XPath expression tree. The context type is recorded for subsequent display as
a
@s:contextType attribute.
When leaves are reached, either it defines its own type (e.g. FunctionCall
name="position" has type xs:integer, which can be inferred from the
function signature; StringLiteral value="foo" has type xs:string)
or its static type is the context type (e.g. ContextItemExpr, aka '.'). On the
way back either there are definitive rules provided (e.g. QuantifiedExpr has
type xs:boolean and PostfixExpr A[B] has a type which is the type
of A), or it has a sequence composite type, or appropriate union type
calculated as described above.
For expressions the specification gives a table of type determination
formulae (Determining the Static
Type of a Construct). Whilst the static types defined (e.g. AndExpr
has type xs:boolean) are determined from entries in the declarative table of
Figure 17 , currently most of these cases are defined by
pattern-matching templates.
Note
The tool proved to be of some worth in this area when I discovered a test-case that
was failing to be streamable, involving the expression xsl:value-of
select="head(/BOOKLIST/BOOKS/ITEM[1]/PRICE/ancestor::*/@*)". The issue was that
whilst the static type of head() was item()? the instruction
failed streamability (a potentially overlapping sub-tree would have to be traversed
to
determine the textual value, due to the ancestor::* step.) However if the
type of head() is inferred to be the same type as its principal argument, in
this case attribute()?, then that is technically a
childless-node, whose text value can be
retrieved without further movement across the tree. A set of about a dozen functions
(e.g.
subsequence()) needed such specialist treatment. The XSLT Working Group had
to change the type model to encompass unions of fundamental types (see above) and
classify
functions that used their principal arguments in a transmission usage
(such as head() ) to use such unions.
Resource references
Whilst most of the assessment is carried out in a recursive tree descent/ascent manner,
XSLT (and XPath) constructs can reference non-child resources in three specific ways:
variable / param references, function / named
template calls and template applications. To complete
streaming analysis these references must be examined and require off-tree
mechanisms. We'll discuss each in term:
Variables
Variables can be declared both in XSLT (xsl:variable and
xsl:param) and in XPath (let $v :=.., for $v in
...) and in both cases the scoping of reference to their
value follows the
following-sibling::*/descendant-or-self::* compound axis[7]. Processing such references is most simply achieved by
iterating across construct bodies, accumulating maps of processed
variables which are tunneled down through to following-siblings and their descendants.
For
example in assessing static type, sections of the code relating to variables are
approximately:
In the first template, the iteration across the children using
xsl:iterate accumulates a parameter $variables as a map which
relates variable name to the in-scope processed variable tree for that name. The tree
value will i) be fully decorated with its properties (in this case including
@staticType and ii) have had all variables it refers
to in its definition interpolated fully as regards streaming properties. Each child
in
turn is processed with a full current binding of variables passed as a tunneled parameter
$variables, which can in turn be updated in scope[8].
The second template assesses the static type of a variable as the composite type of
its children, which actually should be null or a singleton - either
the @select operand, or a single sequence constructor. The final template
shows how the static type is evaluated for a VarRef construct (the only
construct within the XPath grammar which actually interpolates variable name references),
by lookup in the supplied map.
Functions
Built-in and stylesheet functions are global entities, which may be referenced
(almost) anywhere within the stylesheet tree. Fortunately the streamability rules
only
require knowledge of the type of the result and the required type of the arguments
to
assess the streamability of a call. For built-in functions, the XPath and XQuery Functions and
Operators 3.0 specification uses a definitional XML file
function-catalog.xml that contains all the data defining each function,
such as signatures, and from which the specification is constructed. The specification
contains a table in Classifying
Calls to Built-In Functions that defines further (usage) properties. By taking a
copy of function-catalog.xml and adding some minor annotations, we can
construct maps that will both identify type and usage for function calls and their
arguments:
In Figure 20 we show the entry for head() in
which we have added a transmission usage property to its sole argument. The
variable $functions has been constructed from that file as a map keyed by the
function name, each entry containing a map of some properties of that function,
viz. the returnType and a further map containing
entries for each argument of the last definition (which is usually the most complete),
keyed by position. This map, and others like it are used extensively within the analysis[9]: expanding implicit '.' arguments for built-in functions (e.g.
name() being equivalent to name(.)) is supported by computing
the set of function names for which that is the case as shown
$default-dot-functions.
Stylesheet functions can be analysed as a global set at top level: type signatures
can
then be recorded as a similar map to that used for built-in functions. Fortunately,
and
certainly avoiding issues of analysing recursive functions, the return type is generalised
as item()*. Similar mechanisms can be used for named templates.
Applied templates
xsl:apply-templates invokes pattern-matching processing on each of the
members of their selected sequence. As such complete assessment of their return would
require some indirect assessment. Fortunately as far as streamability is concerned
this is
much simpler – the instruction is assumed to generate results of item()*
type, and the posture/sweep streamability properties can be
determined mostly locally within the xsl:apply-templates instruction itself:
other templates that may be triggered only have to be assessed as being in a totally
streamable mode-set.
Assessing sweep and posture
A similar, though much more complex, recursive descent/ascent process is used to
determine the posture and sweep. A context posture is propagated
downwards, and alters through focus-changing instructions, with generally the context
posture for controlled sub-trees being the assessed posture of the
controlling operand. The sweep of each of the construct operands is assessed and the
composite sweep and posture is then calculated for the ensemble and becomes the sweep
and
posture for the construct. As much of the analysis as possible is calculated from
definition
tables described earlier.
The properties are represented on the tree as attributes (e.g.
s:posture="striding"), so they can be extracted from result through XPath.
But to reduce errors through mistyping, a defined set of global variables, each having
an
attribute() type and suitable name/value can be defined, such as
$p.grounded whose value is @s:posture="grounded". Moreover, the
most common 'stream failure' results from roaming posture and
free-ranging sweep, so this is abbreviated: $RFR =
($p.roaming,$s.free-ranging). Using these variables when setting properties reduces
typing errors extensively, as the complier will of course complain about undefined
variables.
The generic form of processing is a template of the following canonical
structure:
Usually all children are evaluated, sometimes when focus changes, altering the
context posture for their evaluation. Then the result is constructed
as a copy of the original node, a series of initial attributes (all existing attributes,
a
hyperlink to the appropriate section of the specification, the context posture etc.)
are
written on by the template expr-init, followed by the calculated posture and
sweep, again as attributes, and finally the evaluated children are added.
Currently there are 12 primary templates of for assessing xsl:* XSLT
instructions and 18 for processing xp:* XPath expression constructs. Many of
these make calls on the General Streamability Rules.
A subsidiary property needed for analysis, usage, described below,
is written on to the tree by push-processing in mode addUsage, before the main
evaluating is performed, again through mostly push-processing in mode
addPosture.
Usage
Constructs act as operands for their parents (or sometimes
ancestors) and as such the parent can use the information from the operand in several
ways, described as the usage property, which again partially quoting
Kay, can take the following values:
Absorption: the parent expression makes use of information
from the entire sub-tree rooted at nodes returned by the operand expression.
Inspection: the parent expression makes use of properties of
the nodes returned by the operand expression that can be established while positioned
at a node's start tag.
Transmission: the parent expression returns nodes delivered
by the operand expression.
Navigation: the parent expression performs arbitrary
reordering of the returned nodes, or navigates away from them in arbitrary
ways.
As these properties are used quite extensively, the constructs in the tree are
decorated with an @s:usage attribute in a single pass before posture and
sweep is assessed, by consulting declarations and suitable maps.
General Streamability Rules
Rules for assessing posture and sweep for many of the constructs devolve to some
assessment of a set of more general rules with different configured treatments for
the
individual operands of the construct. For example:
19.8.4.37 Streamability of xsl:value-of
The posture and sweep of xsl:value-of follow the general streamability
rules. The operand roles and their usages are as follows:
The select expression (usage
absorption)
The separator attribute value template (usage
absorption)
The contained sequence constructor (usage
absorption).
The General Streamability Rules, part of which is shown in Figure 5
are a nested tree of calculations and decisions, up to six levels deep, with 25 separate
steps and cases (many of which are applied iteratively to each operand) and an embedded
decision table. To apply these for a given construct we need to i) identify which
of the
child (or sometimes descendant) constructs have influence and what are their mode
of
usage.
While expanding XPath expressions the role of a construct was attached to the parse
tree as an @s:role attribute, so simple XPath search can extract necessary
operands. In this case we have already given every node a unique id so simple
maps of id and associated property can be used.
The GSR are implemented as a single named XSLT template, which has the following
features:
The context item is assumed to be the construct element to be assessed.
The following parameters can be supplied, or may be derived:
contextPosture
The context posture from the parent
operands
a set of descendant items that should be considered as operands. These can
be searched for through a push mode GS-find-operands.
context-postures
A map of operand ids and the associated or required posture for
the operand having that id. (This becomes useful for complex
instructions such as xsl:apply-templates or
xsl:iterate where the operands can include sections of attribute
value templates within child elements, and postures can vary across the set of
operands.)
children
Possible children of the construct, which will otherwise be evaluated
recursively.
The first action in this template is to assess the sweeps of all the operands,
corresponding to step 1 of GSR,
consisting of some dozen tests and a table lookup for each operand. After such processing
each operand element will be decorated with attributes @s:sweep (a possibly
adjusted sweep), @s:sweepOriginal (when the sweep was changed by the GSR
rules), @s:usageOriginal (when the usage has been altered likewise) and
@s:potentialConsumer if the operand is so judged from rule
1.c. Parts of the code to do this are:
What Figure 22 is showing is that the code generally tries to
reflect the structure of the rules as laid out in the specification, even using a
map to
represent table 1.b.iii.B, and with the same abbreviations[10]. Note that the intermediate variable $results computed for each
operand is typed as a sequence of item() but is actually a sequence of
map()*, which can be used to transmit heterogeneous information, in this
case both the properties of direct interest (e.g. sweep) but also the
rule-invoked indicators. Using the XPath simple map operator, we can extract multiple
values through an expression such as $results!.('rule').
Note
It would be possible to use the same technique here as is used for posture and sweep
attributes described above, i.e. defining variables which are map entries, e.g.
$map:p.grounded := map{'posture':='grounded'}, which would reduce the
effect of typing errors.
Assessing the posture is handled similarly, checking the conditions of section 2 over
the sweep-assessed operands. Finally the posture and sweep of the construct is determined,
written onto a copy of the construct element (together with GSR provenance from section
2)
and the children are placed in the new parent.
Evaluating expressions and instructions
Most XPath expression terms can be evaluated with the General Streamability Rules,
suitable usage having been written onto operands from the maps described in section “Declarative tables”. Others require more specialist treatment, such as the
ForExpr for which the test can be two-stage:
where if the return (the second child) is grounded, the General Streamability Rules
apply, otherwise the construct is roaming and
free-ranging. The most interesting and perhaps important, are the
AxisStep terms, which really define movement around the input tree. The
tests (19.8.7.7 Streamability of Axis Steps) are a little more complex and involve six
cases and a tabular form, relating context posture and the axis of travel. Whilst
this
could perhaps be interpreted, in the end an extended xsl:choose was the
simplest form.
XSLT instructions tend to be more complex in their streamability than XPath
expressions, so while some (e.g. xsl:copy) are analysed completely with the
General Streamability Rules, many require specialist templates, especially to handle
issues such as instruction configuration elements (e.g. xsl:sort) and
determining what are the active operands for streamability. To make these somewhat
more
coherent, a number of helper functions are used:
xp:AVT()
Returns all the nodes of the input that represent attribute value
templates in the instruction.
xp:active()
Returns all operands of input nodes that are active in
creating a sequence (@select or
s:sequence-constructor)
Similarly, some of the common instruction-varying actions can be cast as templates
in
a specific mode, such as GS-find-operands, which finds the appropriate
operands for an instruction for application of General Streamability Rules. For example
where for xsl:apply-templates operands could appear i) as active parts of
the instruction itself (in this case @select), ii) within active sections of
xsl:param and xsl:sort options (either as @select
or sequence constructors) or iii) within attribute value templates
within the xsl:sort declarations (typically in attributes such as
@order).
Evaluating built-in functions
Calls to built-in functions (mostly from XPath, but a few, such as key(),
that are specialist for XSLT) have to be examined in terms of both the streamability
properties of their arguments and the use the function makes of the results of those
arguments. For most functions this can be expressed as an evaluation of the General
Streamability Rules, with suitable usage. In the specification this
is described as a list with an entry of each function using a proforma representation,
e.g. fn:fold-left(N,A,I) which indicates that the arguments
have usage navigation, absorption and
inspection respectively. Currently these usages are written onto
the arguments of the function definitions in function-catalog.xml whence they
are converted into maps as described in section “Functions”, but in theory these
properties could be read from the specification itself. For some dozen functions (such
as
last()) there are specialist rules - these are handled by simple templates:
where $RFR denotes roaming and
free-ranging as described earlier.
Determining guaranteed streamabilty
Finally all the components of a top-level xsl:stream or
xsl:template have been evaluated against the rules and any sequence
constructor can be examined for a grounded posture and a template
@match checked for a motionless sweep. The element is
marked @s:streamable with the boolean satisfaction of these conditions,
whence the analysis of that sub-tree is complete. Later display can give visual
indication. For templates of course they form a modal group which might be invoked
by
other templates - the mode is marked streamable (in a map) only if i) the mode is
declared
streamable and ii) all templates within that mode are themselves proven guaranteed
streamable[11].
Interactive Display
Once the analysis has been completed, the result needs to be displayed. As all the
information is attached to the expanded XLST/expression tree as namespaced attributes,
we
could either i) display the original program in some serialised form and arrange some
linkage
from that serialisation to appropriate points in the 'shadow tree' or ii) display
the whole
expanded tree including properties as a serialisation and selectively display desired
sections. The second appeared to be the simplest route, albeit at the cost of a very
large serialised form even for modest programs.
Serialising to HTML
The tree is serialised to be displayed within a pre section of the web
page, by a specialist XSLT-coded serialiser with the following features:
As the tree is well-formed XML, and space is at a premium, indentation is strict and
closing tags are omitted.
Line numbering follows the document order of elements in the original XSLT and is
displayed at the start of each corresponding line in the serialisation.
A fold/unfold group (as a span containing two span-contained images,
only one of which should be visible) follows for any element in the result tree that
has
children.
The element name and primary attributes are written surrounded by classifying
spans - some note of line length is considered and line-breaks can be
interpolated. Some names are shortened and the information is attached to the span
class, e.g. the xp: prefix dropped from XPath constructs.
The span of elements decorated @s:streamable (i.e.
xsl:template, xsl:stream) are classified as
xslstreamyes or xslstreamno as appropriate.
All the streaming properties follow, each with an enveloping span and
differentiating classes: e.g. class="streamProperty posture". Name/values
for these properties are simplified, e.g. s:posture="grounded" displays as
p:grounded.
Specification hyperlinks are cast as a[@href] elements around the
appropriate display text.
All the children of an element exist within a span
class="XMLBody|XSLBody" on the line following the element head. Thus when
folding an element this span is set to style="display:none".
This provides a consistent model for fold/unfold, albeit at the cost of an additional
nested span for every element.
The styling is defined by a CSS stylesheet that exploits these classes, starting with
the display styling all XPath expansions and streaming properties set to
none[12].
Interactivity
The selection of stylesheets to analyse and examine is a simple use of forms and server
response, piping the serialised response HTML into a given target frame. More interesting
is
the interaction within the analysed stylesheet. There are four types of interaction,
three
of which are implemented by triggered templates in a simple Saxon-CE executed XSLT
2.0
transform.
Fold/unfold
The fold/unfold buttons are intercepted by <xsl:template
match="span[@class='folder']/span[@class=('collapse','expand')]"
mode="ixsl:onclick"> which arranges to swap the visibility of the
collapse and expand buttons, and then
proceeds to change the display style of the following body span
(@class=('XMLBody','XSLBody')) accordingly.
Expanding XPath expressions
All displays of attributes that contain XPath expressions (and
attribute value templates as well) are classed as
XPExpr, together with an attribute on the span that identifies the
role. <xsl:template match="span[@class='XPExpr']"
mode="ixsl:onclick"> when triggered, searches in the following instruction
body serialisation for the span that contains the tree for that
expression, marked with a @role attribute, and then toggles the state of
the display property.
Displaying streaming properties
Changes in the state of the check-boxes (which are computed as part of the
returned HTML for an analysed stylesheet) are recognised by <xsl:template
match="input[@class='showType']" mode="ixsl:onclick">, which then alter the
display state of all the effected stream property spans through
ixsl:page()//pre//span[@type=$type], using the
ixsl:set-attribute instruction. (This also of course changes display
state of properties which are invisible for higher reasons, such as being in a folded
structure, but it maintains coherence.)
Specification hyper-linking
As the result HTML has already been decorated with a[@href] links to
the appropriate section anchors in the (modified) specification, these links operate
outside the purview of the Saxon-CE based stylesheet. The specification is displayed
typically in a separate window or tab[13].
The point to note here is that all the decisions of display
this line?, wrap onto a new line? are performed by the browser, requiring a
minimalist approach from the analysis tool itself.
Conclusion
This paper has described a tool that performs a very detailed and exhaustive analysis
of
the streaming properties of an XSLT program, and displays the results in a form where
a human
designer might be able to examine these properties to either i) understand why a program
cannot stream or ii) get a better feel for the interaction between XSLT instructions
and XPath
expressions and streaming behaviour.
But it could also be considered as an example of analysing a program (or other data
structure) for certain properties within an XML-based framework, using a tree-based
'parsing'
of the program as the main data structure, adding properties as attributes and processing
the
tree in a generally top-down recursive manner[14]. Thus some of the lessons from this tool might be pertinent to other situations,
which are normally the province of opaque compilers, such as reachability analysis.
Doing what the compiler (usually) doesn't
A comformant XSLT compiler supporting streaming of course has to apply these rules
to
check streamability[15] but does not have to explain why a construct cannot be streamed. This tool
effectively animates the analysis leaving a trail both of its conclusions, in results
and
intermediate data, and pointers to the relevant rules that were applied, making it
less of a
hit-and-miss affair for the designer to acheive his streamability goals. Of course
a
compiler could do similar (e.g. Saxon has an -explain option that displays the
optimised execution plan) but this tool does this independently of any implementation,
as
the rules are strictly part of the specification.
Controlling the volume of displayed data
One of the problems was the sheer amount of data to be viewed/displayed – some 8-10
additional properties, stored as attributes for every construction element, both instruction
and XPath expression. One option was to display all the properties for a single given
element at a time, perhaps on a status bar, or a popup, but much of the understanding
of the
streaming rules in action comes from examining the properties of all the operands
of a
construct as an ensemble, together with the relevant rules. I chose to enable entire
classes
to be viewed selectively – it certainly permits one to view reverse-cascades
of usually roaming posture propagating from some errant action.
Alternatives, that could be programmed relatively easily using more detailed Saxon-CE
interaction, could show properties for a small portion of the tree at a time (e.g.
selected
element, direct children and a limited number of ancestors), or even explore graphical
symbology and other shorthand forms.
Who & how best to process XSLT?
The problem involved analysis of a program most of whose components are written in
XML.
Thus an XSLT enthusiast, such as the author, would reach for that tool as the primary
instrument. The fact that the program to be analysed was itself XSLT caused very few
problems, and made several areas easier.
The first issue was how to analyse the XPath expressions. Initially I chose to add
an
expansion to Saxon to exploit its Expression.explain() method to generate a
parse tree, but after some success it became clear that a lot of rewrites that Saxon
was
doing internally needed to be undone or otherwise modified to get to the
constructs that streamability required. Then a switch to a parser, written in XSLT
and
generated by REx, made the situation much clearer and the whole analysis
solution could be written entirely in non-extended XLST 3.0[16].
Probably the most useful lesson is that a simple variant of the source XSLT program,
held as an XML tree, can act as its own parse tree, which can be traversed, read and
decorated entirely by XSLT programs. Of course properties have to be capable of being
grounded to effective strings to attach to tree element nodes, which was possible
even at a
stretch when the type model moved to a union type. With with a suitable system of
indexed
map structures held within the analyser, and keyed through unique ids, even this restriction
might be overcome.
A syntactically coherent specification
Early on in the development it became imperative that there should be some means of
finding the correct place in the XSLT specification to examine constructs that were
being
evaluated, if only for debugging the tool itself. Whilst much of the specification
is very
richly hyper-linked internally, there was still an enormous amount of scrolling around,
losing a place, having to revert to and search through the table of contents (which
on the
author's browser occupies some 14 pages) all the while trying to retain a
mental (stack) model in one's head. Could we build a hyperlink from say an
xsl:value-of element to the relevant section of the specification?
Most fortunately, but certainly by design, many of the sections (in this case
19.8.4.37 of the December 2013 Working Draft) had hyperlink anchors which
were extremely coherent, quite fine-grained and followed the scheme
#streamability-xsl-{local-name(.)}, in this
case #streamability-xsl-value-of. The exceptions (which were often to a parent
category) could be handled by attributive declarations in a table, such as shown in
Figure 17. In the case of the General Streamability Rules, the
combination of nested lists and a table required preprocessing to bury a series of
anchor
points (and div groupings) to support display.
Equally, the existence of and access to, even more definitive documents behind the
specification, such as function-catalog.xml, meant we could latch onto and use
definitive information, avoiding transcription errors and making it possible for the
tool to
track eventual changes in some cases automatically.
Problems
Apart from the size and apparent complexity of the streamability rules themselves,
a
small number of other problems appeared including
the size of the resultant fully-serialised output
debugging the tool itself
Whilst the analysis of streamability for a modest stylesheet is relatively quick
(seconds or less), the serialisation, and particularly the browser display of that
serialisation can be lengthy. A glance at Figure 11 shows that each streaming
property takes about 100 characters to define for display in the current serialised
form,
which means each construct (instruction, expression term) takes about 1kB to display.
(The
serialisation of the example in Figure 1 is just under 14kB long.) Clearly
there could be economies in the terms used (class names are over-generous for example)
and
some server-performed compression of the results and associated CSS might improve
matters by
perhaps a factor of three. With some more active participation from the Saxon-CE stylesheet,
such as encoding/decoding property values perhaps an order of magnitude could be gained[17].
The simplicity of keeping all the analysed data on the XSLT tree and then serialising
the whole was very effective in getting the tool developed, but an alternative would
be to
split off the data just before serialisation and provide that as a separate
id-mapped data structure that the Saxon-CE stylesheet would use.
Unsurprisingly early debugging of the tool wasn't extremely easy, as the sheer volume
of
data and especially the O(100) different types of instructions and expressions made
focussing on a single problem difficult. Building the tools as a number of phases
helped
(e.g. expanding all the XPath trees, marking the focus control, calculating the static
type)
so that successive phases could be checked to mostly work[18]. However once a certain stage was reached a lot of the further development could
be carried out merely by altering or expanding tables.
Acknowledgements
Apart from funding the venture, Mike Kay set me a very interesting and what turned
out
to be a surprisingly complex challenge, but one of XSLT working on XSLT that I very
much
enjoy. Florent Georges deserves thanks for the excellence of, and excellent support
for, the
Servlex webapp delivery mechanism which made mounting the tool as a web service
comparatively plain (all XML) sailing, with no PHP or Java required!
O'Neil Delpratt was his usual cheerful self in helping get the tool installed on Saxonica
servers.
Quo vadis?
During the writing of this paper it became clear that more of the tool's behaviour
could
be derived from direct interpretation of some sections of the specification, thanks
mainly
to the coherence in its logical structure. This would increase the tolerance of the
tool to
(modest) changes in the specification. Changes to the serialisation and storage outlined
above could be helpful also.
Whilst the original intention of the tool was to enable a given stylesheet to be
analysed, perhaps its future value is really in being able to increase understanding
of what
the essence of the streamability rules is, by providing a highly detailed
walkthrough of their application on fragments of XSLT of interest. It might
also be interesting to see whether a rudimentary expert system, in the form
of patterns matching some design metaphors (e.g. adding a copy() action to
avoid returning nodes from the input) might be able to suggest alterations that might
enable
streamability where originially this cannot be guaranteed.
[Servlex] Georges, Florent: Servlex: (: Web
Applications and REST Services Framework for XQuery, XProc and XSLT. :)[online]
http://serlvex.net
[1] Applicable templates must be invoked in a mode that has been declared to be
streamable (xsl:mode name=".." streamable="yes") so the set of templates to
be examined is restricted.
[2] The author was present when the XSLT Working Group analysed 'by hand' (and
conference call) the streamability of a 5 instruction stylesheet, with XPath expressions
perhaps 4-5 terms deep. They almost managed to complete the process in about 50
minutes.
[3] We could have used xsl:sequence which in XSLT3.0 can contain a sequence
constructor (!) but placing it in a separate namespace makes the implementation
tidier.
[4] Multi-line expansion and indentation is shown for clarity, but the single line is
actually flat.
[5] At present relative indirect stylesheets (e.g. xsl:include
href="more.xsl") are not supported, as of course the server cannot
request from the client file system, though 'web-accessible' links could be
followed. A system where all the stylesheets were web-accessible could be
developed easily.
[6] It didn't quite appear that the table was itself regular enough to derive this data
from it automatically, but perhaps I should have persisted.
[7] Within our expanded XSLT trees all references will be through elements such as
VarRef - even text-value-templates will have been
expanded into element trees.
[8] Whilst this of course can be processed using a recursive template, using
xsl:iterate introduces much more coherence in what is essentially a
contained tail-recursive iteration. Using a high-order function such as
fold() isn't terribly practical when XSLT instructions predominate.
Equally the immutable map() of XSLT3.0 makes tracking variable scoping
vastly easier than alternative methods.
[9] It is tempting to see whether the table Classifying Calls to Built-In Functions in the specification is regular
enough that the usage can be extracted automatically. On the other hand one can argue
that fundamental properties of the function, such as usage, belong in the definitive
catalog.
[10] It would be comparatively trivial to parse that map from an even simpler
representation, possibly even from the specification itself.
[11] Mutually interacting streamable modes are not supported in this tool.
[12] For large stylesheets 'top-level' constructs (e.g. xsl:template) could
of course be defined to display in a folded state. The streamability of such entities
would still be visible as red/green backgrounds of course.
[13] Under some arrangements with all being displayed in a single tab/window, it is
possible to enrich the explanation of application of the General Streamability
Rules, by highlighting all the rules that were relevant to a particular case. This
requires i) div grouping of sections of the specification GSR (which
can be done automatically) and ii) altering the display properties of these
div sections. Unfortunately this appears not to be possible (as one
would wish) between different tabs or windows within a browser....
[14] A critical requirement might be that referential mechanisms and dependencies (e.g.
variables) follow descendant or following-sibling scoping.
[15] Actually they are permitted to extend the cases in which they can stream, but they
must support cases which are guaranteed streamable according to the
specification rules.
[16] To be fair, Mike Kay had suggested REx as a possibility in the
initial project outline, but the author had had some experience, before working with
Saxonica, on using Expression.explain(), so that was in the first design.
[17] Technically we might be able to implement the analysis tool in XSLT2.0 and hence
consider a total Saxon-CE in-browser solution, but the use of XSLT3.0 facilities
(especially mode declarations and maps, as curiously HOFs were confined to a very
small
number of cases) makes the development very much more straighforward. Of course anyone
using streaming must be using XSLT3.0 anyway.
[18] The main phases expand / focus control / static type / usage / posture & sweep
were pretty separable until the new U-type model introduced usage-dependent
type to complement the existing type-dependent usage . At this
point some sections of the type and usage phases had to be merged.