Introduction
XForms [XForms 1.1, XForms 2.0] is a declarative XML-based programming language for defining applications on the web and elsewhere. It is a W3C standard, and in worldwide use, for instance by the Dutch Weather Service, KNMI, national government websites, the BBC, a US Department of Motor Vehicles, the British National Health Service, and many others. Largely thanks to its declarative nature, experience has shown that you can produce applications in much less time than with traditional procedural methods, typically a tenth of the time [Introduction to XForms].
Invariants
An essential mechanism within XForms is for invariants: relationships are specified between values, such that if a value changes, for whatever reason, its dependent values are changed to match. This may then generate a chain of changes along dependent values.
This is very similar to the basic calculation mechanism in spreadsheets, with one major advantage: it is much more general.
An example that illustrates the advantages of this generality of the invariant mechanism is the XForms Map application [Map]. This is a Google-maps slippy-map style of application, that displays a map of anywhere in the world, and allows you to pan and zoom using the mouse. It is a surprisingly small number of lines of XForms code, about 90 in its simplest incarnation, with the startling property, at least startling for a procedural programmer, that it contains not a single while statement.
It was Dijkstra in his seminal book A Discipline of Programming [Dijkstra] who first pointed out that the principle algorithmic purpose of a while statement in programming is to maintain invariants. With this view in mind, it is then less surprising that if you have a mechanism in a programming language that automatically maintains invariants, the need for while statements goes away.
An example of the use of invariants
To get a taste for why and how this is, let us briefly examine the mechanisms used for implementing the map example.
The highest-level abstraction of the application is that there is
a two-dimensional position of a location in the world, (x, y)
,
and a value that represents the magnification or zoom value required;
the application then retains a display of a map centred at that location.
The implementation uses a supporting service that delivers (square) map tiles that represent a map region covering a set of locations at a particular zoom level:
http://<site>/<zoom>/<x>/<y>.png
The coordinate system for the tiles is as a two-dimension array. At the outermost level, zoom 0, there is a single tile
at the next level of zoom, 2×2 tiles,
Table I
at the next level 4×4, then 8×8, and so on, up to zoom level 19. So it is a fairly simple arithmetical calculation to work out which tile you need for a location in world coordinates: you work out for the given level of zoom how many locations are represented by each tile, and divide.
scale=226 - zoom
tilex=floor(worldx ÷ scale)
tiley=floor(worldy ÷ scale)
In XForms:
<bind ref="scale" calculate="power(2, 26 - ../zoom)"/> <bind ref="tileX" calculate="floor(../worldX div ../scale)"/> <bind ref="tileY" calculate="floor(../worldY div ../scale)"/>
Note that these are not assignments as in procedural languages,
but invariants -- statements of required equality: if
zoom
changes then scale
will
be updated. If scale
or
worldX
changes, then tileX
will be updated, and so on.
Once you have the coordinates of the tile you need, is is equally trivial to calculate the URL necessary for the tile to be delivered:
<bind ref="tile" calculate="concat(../site, ../zoom, '/', ../tileX, '/', ../tileY, '.png')"/>
and displaying it is as simple as
<output ref="tile" mediatype="image/*"/>
If you need more tiles to get a larger map, it is obviously easy to calculate the indexes of adjacent tiles.
So displaying the map is clearly easy, and uses simple invariants.
Making the map pannable uses the ability of keeping the mouse coordinates and state of the mouse buttons as values: then it is only a question of specifying that when the mouse button is down, the location of the centre of the map is the sum of the current location and the mouse coordinates. As this value changes, so the map display is (automatically) updated to match.
Advantages
The advantages of the invariant approach are many and include that you can far more easily specify the computational intent of a piece of program, and so the computational intent is therefore much more obvious to the reader of the code; the computer does more of the (administrative) work, saving the programmer time; the resulting code is much shorter (typically a quarter of the length); and production time is greatly reduced: reports of applications needing only 1/10th of the time have been widespread, but some even more than that.
Implementation
The XForms invariant mechanism uses is a fairly straightforward ordered-graph based dependency algorithm [Model Processing, Recalculation], ordered since XForms disallows circular dependencies; static analysis can be used for large parts, though dynamic dependencies are also permitted. Updates are triggered by changes to dependent values, and updates are then ordered along the dependency chain and applied. For example, the graph of dependencies of the map application fragment above looks like this:
Initially the system is at stasis: all values are initalised, and all invariants are up-to-date. Whenever a change occurs, whether caused by the user, or by the system as a response to an event, an update occurs, in order to return the system to stasis.
The update mechanism has 4 stages:
-
Rebuild: If there have been structural changes in the data (elements or attributes inserted or deleted), the graph of dependencies may have changed, and some invariants may need to be added, deleted, or adapted accordingly.
-
Recalculate: Whether or not anything happened during rebuild, new values (and related properties) are calculated based on any changed values.
-
Revalidate: Any new values are checked for validity (this step is not of further interest for this paper),
-
Refresh: The display is amended with the new values.
One of the reasons that the XForms dependency graph is so useful is that if a value changes, only dependent values need be recalculated. Initially (at least in principle), all values are calculated, but after that, changes only cause a subset of the values to be updated. Later we will be referring to the update efficiency, how much work has to be done to restore an invariant.
Events
Although the update mechanism is automatic, there are opportunities to hook into it, in order to deal with special cases not dealt with automatically, for instance setting up initial dynamic values on initialisation:
<action ev:event="xforms-ready"> <setvalue ref="today" value="local-dateTime()"/> </action>
This uses the XForms event mechanism
[Events], that allows different
classes of processing events to be caught and responded to,
such as xforms-ready
in the example above.
Limitation
The XForms invariant mechanism has one main limitation: it can only make changes to simple content, values that can be calculated with a simple expression. To make structural changes, you have to use the event mechanism.
The events of interest here are twofold:
-
when the content of an element or attribute changes,
-
when items are inserted or deleted causing structural changes to an instance.
Handlers can listen for events that report these changes and respond in some way.
As a simple example of such a structural change, suppose we have an array of values that has to be of a certain size, defined by an attribute on the root element.
<instance>
<data size="10" xmlns="">
<value/>
</data>
</instance>
To initialise the array, the xforms-ready
initialisation event is caught, to ensure that there are the
right number of elements:
<action ev:event="xforms-ready">
<insert ref="value" while="count(value) < @size"/>
</action>
(The <insert/>
action by default
duplicates the last element in the nodeset referenced). As a
result at start up there are the right number of elements.
However, if size
changes during processing,
the number of elements has to be changed, either increased or
decreased. The way to do that is to catch value
changed events for that value:
<action ev:event="xforms-value-changed"> <insert ref="value" while="count(value) < @size"/> <delete ref="value[last()]" while="count(value) > @size"/> </action>
This inserts elements if there are less than the required number and deletes elements if there are more.
(It should be remarked in passing that while
was perhaps a poor choice of name for the guard attribute, since
it could mislead the reader about its intent. It is a guard to
the action and so a name like suchthat
might
have better reflected its intent.)
An alternative to doing the deletes is to use
relevance. This leaves the elements
physically present in the instance, but excludes them from
availability in the user interface, saving repeated insertions
and deletions if size
changes often, and only
requiring actual changes when the new value of
size
is larger than any previous.
<bind ref="value" relevant="position() < @size"/>
A similar approach can be used for two-dimensional arrays, by first growing a row, and then duplicating that sufficiently:
<instance> <game size="10" xmlns=""> <row><c clicked=""/></row> </game> </instance> <action ev:event="xforms-ready"> <insert ref="row/c" while="count(//c) < /game/@size"/> <insert ref="row" while="count(//row) < /game/@size"/> </action>
Structural Invariants
These examples show that you can use events in order to maintain your own invariants. The question arises, to what extent, and how, could invariants that affect structure be introduced into the invariant mechanism, without having to resort to such hooks?
The first thing to note is that the existing XForms invariant
bindings don't state the invariant, but only how to restore it,
and that the calculate
attribute contains both
the signals that indicate that the invariant needs restoring, as
well as the method to restore it.
<bind ref="tileX" calculate="floor(../worldX div ../scale)"/>
On the other hand, in the two structure examples above, the signal (the value changing) is separate from the restore mechanism (the inserts and deletes). This is possibly a clue to how the mechanism could be integrated. Here is a strawman suggestion:
<structure requires="count(value) = @size"> <insert ref="value" while="count(value) < @size"/> <delete ref="value[last()]" while="count(value) > @size"/> </structure>
This allows the boolean requires
attribute to
become part of the dependency algorithm, and the body of the
structure
element to become part of the update
mechanism, namely the rebuild part.
Requirements
While this solves a simple case of structural invariants, there are more complex ones.
XForms already allows you, for instance, to repeat over a filtered set of values. For instance displaying the data that matches a search string:
<repeat ref="data[contains(., instance('search')/q)]">
But apart from repeating controls over a subset of data, you would also like to be able to say this structure contains the data that matches the search string. Again a strawman:
<structure ref="subset" calculate="data[contains(., instance('search')/q)]"/>
or alternatively
<bind ref="subset" structure="data[contains(., instance('search')/q)]"/>
so that whenever the calculation values change, the subset, including its size and values, gets updated.
This looks at first like it would be horribly inefficient, but it is important not to throw the baby out with the bathwater: what we are asking with requirements analysis is "what would be useful to be able to do?". We shouldn't prematurely optimise by saying it would be inefficient, thus not a good solution; we should ask what we want to do, and then later work out how to implement it efficiently.
To give two successful examples of this approach in the past: early programming languages did not support recursion; during the design of Algol 60, the designers came to the conclusion that they needed recursion in order to effectively express algorithms. So they added it, even though at that time they didn't know how to implement it at all, let alone efficiently. In fact a long war waged after that about whether recursion was really needed, and whole books were written on the subject. Today it would be unthinkable for programming languages not to support recursion.
The second example also comes from the Algol family. Early programming languages only allowed you to assign and copy simple values, basically values that would fit in a register. During the design of Algol 68, they decided to ignore the type of item involved when doing assignments. They argued that assignment was an abstraction; if that meant you wanted to copy a whole array, that was up to you. Again, it seemed inefficient to traditional programmers, but in fact led to the invention of new ways to store and copy data [Hibbard], methods that continue to be used today (for instance in Python).
Higher-level functions
While this paper isn't the place to enumerate all possible data structures and the invariants that might be applied to them, it is useful to explore some general invariants over lists, since dynamic data structures using lists are so inherent to programming, and since they form the basis of so many other structures.
Useful structural invariance functions apart from filter, are map, reduce (or fold), and sorting. These specify very high-level relationships between structures, thus allowing the specification of algorithmic intent very easy.
XForms already has an inherent map ability, though not expressed as such, since an invariant like
<bind ref="value" calculate="..."/>
does the calculate for every element matched by the nodeset in
the ref
.
It also has some specific versions of reduce, such as
sum()
, but not a generalised one. However,
the most-recent version of XPath has added these higher-level
functions, which allows them to be used in new versions of
XForms.
Implementation
The update mechanism sees the collection of invariants as a tree of dependent calculations only some of which fire at each iteration. If you compare that with structural invariants like filter, map, and reduce, and regard them not as atomic calculations, but as an assemblage of sub-calculations, then in fact they fit very well into the recalculation mechanism.
For instance, a map is just a sequence of sub-expressions each calculating one function application for each element of the sequence of input values (along with a structural guard on the size).
Note that this has an update efficiency of O(1): if a value changes, only its dependent value needs to be updated.
Similarly, a reduce can be seen as a tree of reductions. XPath specifies two reduction functions, fold-left and fold-right. Unfortunately, at least for XForms, these are computational very specific: you start the reduction either from the left or the right. But in fact many typical reductions, such as sum, product, and concatenate, are directionally neutral, since their underlying dyadic operators (namely +, ×, and string-join) are associative: a+(b+c) = (a+b)+c, and we can take advantage of this.
For instance, if we regard a fold-left as a graph of dependent calculations, it would need to be implemented like this:
which has an update complexity of O(n). On the other hand, a directionally impartial fold could be implemented as
with an update complexity of only O(log(n)). (Note that both versions have a space complexity of O(n).)
This is a big difference for large lists: while doubling the size of the list doubles the number of computationd for O(n), it only adds one single extra computation for O(log(n)). Put another way, it involves 100-fold less computations for a list of length 1000 (10 calculations versus 1000) and 50 thousand-fold less for a list of length one million (20 calculations versus 1 million).
This is because if a value changes in the middle of the tree or sequence, the other parts of the assemblage don't have to be recalculated.
The upshot of this is that you can regard structural invariants as just a higher-level form of invariant, ones whose purpose is to manage the collection of lower-level simple invariants; the rebuild phase of the XForms update mechanism can be treated as a higher-level version of recalculate.
A description of such an implementation can be found in [Views].
Conclusion
The advantage of invariants for the programmer is that they state at a high level the intent of the computations, and leave a lot of the administrative detail to the computer. This is how it should be, and is part of a historical movement in computing, handing off more of the work to the computer.
Structural invariants can in fact be seen as a higher level version of the simple invariants found in XForms, where the work of the invariant is rebuilding networks of lower-level invariants. In such a way, structural invariants can be merged into the general invariant recalculation mechanism.
The purpose of this paper has been to initiate discussion on generalising the XForms invariant mechanism to these higher-level invariants, as input for a future version of XForms.
The purpose of this paper has been to initiate discussion on generalising the XForms invariant mechanism to these higher-level invariants, as input for a future version of XForms.
References
[XForms 1.1] John M. Boyer (ed.), XForms 1.1, W3C, 2009, https://www.w3.org/TR/xforms11/
[XForms 2.0] Erik Bruchez, et al. (eds.), XForms 2.0, W3C, 2021, https://www.w3.org/community/xformsusers/wiki/XForms_2.0
[Introduction to XForms] Steven Pemberton, An Introduction to XForms, XML.com, 2018, https://www.xml.com/articles/2018/11/27/introduction-xforms/
[Model Processing] Model Processing, in [XForms 2.0], https://www.w3.org/community/xformsusers/wiki/XForms_2.0#Model_Processing
[Recalculation] Recalculation Sequence Algorithm, in [XForms 2.0], https://www.w3.org/community/xformsusers/wiki/XForms_2.0#Recalculation_Sequence_Algorithm
[Dijkstra] EW Dijkstra, A Discipline of Programming, Prentice Hall, 1976, ISBN 0-13-215871-X.
[Map] Steven Pemberton, Live XML Data, in Proc. XML London 2014, pp 96-102, ISBN 978-0-9926471-1-7, https://xmllondon.com/2014/xmllondon-2014-proceedings.pdf
[Events] Shane McCarron et al., XML Events, W3C, 2003, http://www.w3.org/TR/xml-events
[Hibbard] P.G. Hibbard, P. Knueven, and B.W. Leverett, A Stackless Run-time Implementation Scheme, Proc. 4th International Conference on the Description and Implementation of Algorithmic Languages, pp.176-192 (1976).
[Views] J. Ganzevoort, Maintaining presentation invariants in the Views system, Report CS-R9262, CWI Amsterdam, 1992, https://ir.cwi.nl/pub/5342/05342D.pdf