Al-Awadai, Zahra, Anne Brüggemann-Klein, Christina Grubmüller and Philipp Ulrich. “Graphical user interfaces in the X stack.” Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 - August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). https://doi.org/10.4242/BalisageVol23.Bruggemann-Klein01.
Balisage: The Markup Conference 2019 July 30 - August 2, 2019
Balisage Paper: Graphical user interfaces in the X stack
“XML Everywhere” isn't just a slogan: it actually works, up and down the XML
application stack. Recent developments, such as the inclusion of custom elements
in
HTML5, allow the declarative approach of XML to come into the browser/server
interaction. XForms, supported by SVG and CSS, can serve as the basis for a
graphical user interface. A custom WebSocket element can support client-to-client
and server-push communication of XML data. Applications of State Chart XML (SCXML)
mean that the “XML Everywhere” approach can be extended all the way to models
of
operations in an application. Interactive games offer living proof of the
stack.
As we have claimed before [B16,ABCES17], current XML technologies provide a full stack of modeling
languages, implementation languages, and tools for web applications that is stable,
platform independent, and based on open standards. A particular strong point of
what we
call the X stack is that data are encoded with XML end-to-end and that XML technologies
can be used wherever XML data need to be processed.
We are interested in the X stack for web applications for three reasons. First, its
practices and techniques support development processes that are driven by models,
particularly by domain models [E04]. This is relevant in the
context of a research agenda of generating instances of serious games and learning
analytics. Second, the complete X stack serves as a vehicle to teach XML technology
to
students through the backdoor of engaging web applications such as games. The austerity requirement we impose on students in a lab course
on XML technology, namely to implement a game on the web with XML technology alone,
no
JavaScript or frameworks allowed, forces students to become proficient with SVG,
XQuery,
XSLT and other XML technologies. It also reinforces their knowledge of software
engineering principles and teaches basic web application architecture that is not
clouded by some specific and potentially short-lived framework. Third, the practices
we
suggest with the X stack pave the way for XML experts to develop XML-based applications
on the web.
Graphical user interfaces (GUIs) are crucial components of applications with which
users interact directly. They present information about the state of the domain
entities
in the application and provide methods of interaction to manipulate them. In this
paper,
we investigate a spectrum of GUI technologies in web applications and how they
fit into
the X stack. We explore a number of technologies that independently address some
aspect
of a GUI in the X stack.
Throughout the paper, we provide code to illustrate the techniques that we introduce.
We have also applied our principles and strategies in a case study, the game Guess the
Number (GN), which is documented in a separate document that is available on request.
This case study is intentionally kept simple, so that we can focus on principles
without
being distracted by more complex XML processing. We have student projects from
the XML
Technology Lab at TUM for Tic-Tac-Toe, Scissor-Paper-Rock, Blackjack, Memory, Mancala
and the early GameX [SKB14] that follow the same principles as
they evolved and that are technically more complex. The case studies demonstrate
how
end-user developers who are conversant with XML technologies can create their own
web
applications.
This paper is organized as follows:
After this introduction, section “Requirements for GUI technologies” briefly describes the
responsibilities of a GUI component before homing in on our two main requirements
for
GUI technologies in the X stack that fit into a model-driven approach. The two
requirements are that the GUI and the application core must communicate through
XML-encoded declarative data and that the GUI component itself must be defined
in a
declarative manner.
We then investigate a number of GUI technologies that contribute to the two main
requirements.
First, in section “XHTML with XForms and SVG”, we discuss XForms in the context of HTML and
SVG. XForms is the prototypical GUI technology for our first requirement. We also
investigate to what extent it supports the second requirement and how it compares
to
HTML forms.
One topic that has always been and still is present when discussing GUI technologies
is a component-based approach that allows for composition and reuse of GUI
components. The Web Components specification [WC19] brings the component idea to HTML by enabling web
developers to define their own reusable custom HTML elements as components that
are
marked up like regular HTML elements and that encapsulate custom behaviour and
style. In
section “Web Components”, we present custom elements and their potential
with respect to our main requirements. This prepares a later discussion on a custom
element that we have developed called WebSocket Element that enables GUIs to handle
WebSocket communication in the X stack for multi-client systems.
Modern web applications, particularly games, are often multi-client systems.
Multi-client applications require communication patterns that let clients talk
to each
other or that allow a server to push messages to clients without prior requests,
as
supported by the HTTP extension protocol WebSocket. Multi-client applications require
both servers and clients, read GUIs in our context, to support such a protocol.
In section “Server support for multi-client applications”, we briefly present work that supports the
WebSocket protocol and that recently was taken up and extended by the team at BaseX.
The
main contribution of this paper, in section “WebSocket Element”, is to
present the client side of the WebSocket equation. This is a declarative custom
element
according to the Web Components specification [WC19] named
WebSocket Element that can be used in an HTML-based GUI just like a built-in HTML
element. The WebSocket Element encapsulates code to initiate a WebSocket connection
and
to handle incoming declarative XML data that control the GUI. In the definition
of a
GUI, the WebSocket Element in its HTML surface form just declares the parameters
for a
WebSocket connection and how to handle XML data that arrive through the connection.
Handling the incoming data can mean just to render HTML or SVG data or it may involve
applying an XSL transformation and rendering the result. The WebSocket Element
demonstrates how our two requirements for GUI technologies can be supported in
the X
stack for a multi-client web application.
GUI components are commonly considered to be event-driven systems whose functionality
is triggered by events, mostly from user interactions. The classical tool to model
the
behaviour of an event-driven system is statecharts. Statecharts have arrived on
the XML
scene through the encoding language SCXML and a number of SCXML processors. In
section “Statecharts and SCXML” we present the integration of the Apache Commons SCXML
Interpreter, which is implemented in Java, into XQuery modules that implement
event-driven applications that are run in BaseX. We expect to adapt that work so
that we
can integrate JavaScript SCXML interpreters into GUIs. That would contribute to
our
second requirement of having a declarative definition of GUI components in the
X
stack.
We conclude the paper with a number of discussion points and some final
remarks.
Requirements for GUI technologies
The general task of a GUI is to represent application data or domain entities, to
provide ways for the user to interact with this information and to save, on user
request, changes that were made in the course of the interaction. In the web context,
we
consider GUI components that run in a web browser and whose communication with
the
application core is mediated by a web server via HTTP or via HTTP extensions such
as
WebSocket.
Our reference architecture for web applications in the Model-View-Controller (MVC)
architectural style [ABCES17] has a one-to-one relationship
between the Model and Controller components. The Model provides an API that is
only used
by the Controller. There are no outside influences on the Model. State changes
in the
Model are only triggered through API use by the Controller. The Controller has
its own
API that is used by any number of View components. The View components, which run
in web
browsers, and the Controller, which runs as an XQuery module in BaseX, communicate
via
HTTP through a mediating web server: On receiving an HTTP request from a View,
the web
server triggers a RestXQ-annotated method in the Controller module and sends the
return
value of that method call back to the view as an HTTP response.
For the purposes of this paper, a GUI is a View component in the MVC architectural
style, and the Controller and Model together form the application core.
Our first and central requirement is that the GUI and
the application core exchange declarative data that are encoded in XML.
In the life cycle of interactions between the GUI and the application core, initially
the GUI receives from the application core over HTTP declarative XML-encoded information
about the data that it needs to present and about the interactions it needs to
offer.
While the user interacts with the GUI, the GUI may signal user events to the application
core and receive declarative data again. In a multi-client system, it may also
receive
push data from the core application without prior request. Eventually, the user
closes
the GUI with a final interaction that the GUI may also signal to the application
core.
The central requirement leaves many options open: The communication between GUI and
core application can be synchronous or asynchronous. It may follow the request-response
cycle of HTTP or allow for push data from the application core over HTTP extensions
such
as WebSocket. The GUI may integrate the information it receives into its current
display
like in a single-page web application or it may build a completely new display.
The GUI
may be a simple form-based interface that collects user input or it may have elaborate
displays and interaction methods and perform its own computations on user request
in
Rich Internet Application (RIA) fashion.
In this paper, we focus on GUIs that follow the WIMP (Window, Icon, Menu, Pointer)
paradigm with discrete actions. For a discussion on WIMP and Post-WIMP styles see
a
paper by van Dam [D97].
The GUI as a system is responsible for constructing a visual representation of the
data it receives, for rendering that representation onto a canvas within the boundaries
of a viewport, and for handling general as well as application-specific user interactions,
such as resizing the viewport or scrolling as well as form field entries or button clicks.
Finally, it needs to handle the communication to the application core. As to
computations within the GUI, they range from input validation, formatting and
interaction support to arbitrary computations that are part of the application.
As our second requirement, we only consider GUIs that
can be defined by configuration or that can be programmed in a declarative manner.
Towards that requirement, component technologies for GUIs are particularly
promising.
To illustrate, let us look at a simple case study that we have used before, the game
Guess the Number (GN). The game GN has two types of actors: Player and Game. Upon
start
of a game, Game thinks of a secret number (secret) between 1 and some upper
bound (range). Player guesses repeatedly what the secret number is and
receives feedback from Game whether the guess is high, low or correct. There is
a limit
to the number of guesses allowed (maxGuesses) that depends on
range. Player wins if they guess the secret number correctly within the
maximal number of guesses allowed; Game wins otherwise. There are no ties.
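The rules above can be sketched in a few lines of JavaScript. This is only an illustration of the game logic, not the XQuery code of our application core. The function names are hypothetical, and the formula for maxGuesses (the number of guesses a binary search needs) is an assumption, since the rules only state that it depends on range.

```javascript
// Assumption: maxGuesses is the number of guesses a binary search needs
// to find any secret in 1..range. The rules only say it depends on range.
function maxGuessesFor(range) {
  let guesses = 0;
  let covered = 0; // how many candidate numbers `guesses` guesses can cover
  while (covered < range) {
    covered = covered * 2 + 1;
    guesses += 1;
  }
  return guesses;
}

// Game "thinks of" a secret between 1 and range; it can be passed in for testing.
function newGame(range, secret = 1 + Math.floor(Math.random() * range)) {
  if (!Number.isInteger(range) || range < 1) {
    throw new RangeError("range must be a positive integer");
  }
  if (!Number.isInteger(secret) || secret < 1 || secret > range) {
    throw new RangeError("secret must lie between 1 and range");
  }
  return { range, secret, maxGuesses: maxGuessesFor(range), guessesSoFar: 0, winner: null };
}

// Evaluate one guess and return the new game state (the old state is not modified).
function guess(game, value) {
  if (game.winner !== null) {
    throw new Error("game is already over");
  }
  const guessesSoFar = game.guessesSoFar + 1;
  const evaluation =
    value === game.secret ? "correct" : value > game.secret ? "high" : "low";
  let winner = null;
  if (evaluation === "correct") {
    winner = "Player"; // correct guess within the allowed number of guesses
  } else if (guessesSoFar >= game.maxGuesses) {
    winner = "Game"; // guesses exhausted; there are no ties
  }
  return { ...game, guessesSoFar, evaluation, winner };
}
```

Each call to guess returns a fresh state object, which mirrors the idea that the application core hands a complete, declarative state description to the GUI after every interaction.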
We model the information that the GN GUI receives from the GN application core in
the
following UML class diagram. Specific information instances are translated into
a canonical XML encoding. The debugging section of the GUI screenshot in Figure 4
demonstrates the XML encoding.
The type attribute with one of the values welcomeScreen, firstGuessScreen,
furtherGuessScreen, resultScreen and goodByeScreen informs the GN GUI component
implicitly which screen to display, which interactions to offer and which requests
to
send back to the GN application core on user request. This domain model for the
GN GUI
is defined in the following table.
Figure 3: Domain model for GN GUI
Table I
Screen type        | Information              | Interaction        | Request
-------------------+--------------------------+--------------------+----------------
guessTheNumber     |                          |                    |
welcomeScreen      |                          | fill in range      | newGame[range]
                   |                          | submit             |
firstGuessScreen   | id                       | fill in next guess | guess[id,guess]
                   | guessesSoFar (static, 0) | submit             |
                   | maxGuesses               |                    |
                   | range                    |                    |
furtherGuessScreen | id                       | fill in next guess | guess[id,guess]
                   | guessesSoFar             | submit             |
                   | maxGuesses               |                    |
                   | range                    |                    |
                   | evaluation last guess    |                    |
resultScreen       | id                       | play again         | playAgain
                   | guessesSoFar             | quit               | quit
                   | maxGuesses               |                    |
                   | range                    |                    |
                   | evaluation game          |                    |
                   | secret                   |                    |
goodByeScreen      |                          |                    |
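To make the role of the type attribute concrete, the domain model in Table I can be read as a lookup table from screen type to displayed information, offered interactions and possible requests. The following JavaScript sketch transcribes the table. The object layout and the function name are illustrative assumptions and not part of our implementation.

```javascript
// Transcription of Table I: what each screen type displays, offers and requests.
const SCREENS = {
  welcomeScreen: {
    information: [],
    interactions: ["fill in range", "submit"],
    requests: ["newGame[range]"],
  },
  firstGuessScreen: {
    information: ["id", "guessesSoFar", "maxGuesses", "range"],
    interactions: ["fill in next guess", "submit"],
    requests: ["guess[id,guess]"],
  },
  furtherGuessScreen: {
    information: ["id", "guessesSoFar", "maxGuesses", "range", "evaluation last guess"],
    interactions: ["fill in next guess", "submit"],
    requests: ["guess[id,guess]"],
  },
  resultScreen: {
    information: ["id", "guessesSoFar", "maxGuesses", "range", "evaluation game", "secret"],
    interactions: ["play again", "quit"],
    requests: ["playAgain", "quit"],
  },
  goodByeScreen: { information: [], interactions: [], requests: [] },
};

// Look up what the GUI has to do for a given value of the type attribute.
function describeScreen(type) {
  if (!Object.prototype.hasOwnProperty.call(SCREENS, type)) {
    throw new Error(`unknown screen type: ${type}`);
  }
  return SCREENS[type];
}
```

A GUI that works this way needs no further instructions from the application core: the screen type alone determines which widgets to show and which requests to send.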
In the simplest case, the GN GUI just offers input fields for entering data and
buttons to indicate user choice. We can also imagine that a richer GN GUI keeps
track of
the history of user guesses and advises the user on a guessing strategy.
In the next few sections, we discuss a number of GUI technologies that support our
main requirements in a number of ways.
XHTML with XForms and SVG
Let us start out by reporting on a group of well-known technologies and discuss how
they stack up against our requirements.
XForms is the classical GUI technology that supports XML-encoded data representation
and exchange as demanded by our first requirement. An XForms GUI is controlled
by XML
data in instances within an XML-encoded model component. GUI widgets bind to XML
elements within the instances as defined by XPath expressions. They can be used
to read
and write element data.
An XForms model expresses type constraints for instance data and defines derived
values via XPath. It also defines activities that are triggered by user or system
events. Most importantly, it configures submissions that are triggered by submit
widgets
in the GUI. The configuration declares which parts of the instances to submit to
which
service using which HTTP method; it also defines what to do with the data that
are
returned after the submission and how to handle errors. A typical XForms submission
transfers part of XML-encoded instance data in the body of an HTTP request and
replaces
instance data in AJAX fashion with the XML-encoded response data. Hence, XForms
satisfies our first requirement.
XForms also supports a declarative definition of the graphical and dynamic side of
a
GUI. XForms widgets are declarative XML-encoded components that are bound to elements
in
instances via XPath, as we have mentioned. The XForms processor handles the data
exchange between widgets and instance data, resolving any dependencies. It also
performs
input validation as defined through XPath expressions and XML Schema data types.
XForms widgets have a uniform and declarative system for hints and labels. They each have a
clearly defined presentation-independent functionality, such as accepting a typed
input
value, selecting a menu item or triggering an event. XForms widgets approximately
cover
the range of HTML form elements.
The widgets are hosted by an XHTML document and can therefore be placed and styled
with HTML and CSS. Since HTML5, the HTML host document may include SVG code that
can
also position and style XForms widgets as foreign objects.
We have implemented a GUI for GN as a single XHTML page with embedded XForms
components.
The XForms model holds in its main instance the current screen type and the
information for the current screen as specified in Figure 2.
In two separate instances, it holds the information that needs to be edited in
the
screen and transferred to the server as detailed in Figure 3. There is one separate instance to fill in the
range and another one to fill in the next guess. The latter copies the id of the
current
game from the main instance since that needs to be retransmitted back to the Controller
component, which is stateless and handles any number of games concurrently. The
copying
accommodates the fact that an XForms submission can only submit data from a single
instance.
The XForms model also defines all submit actions that GN requires. A submit action
triggers a GET or a POST HTTP request for static or dynamic requests, respectively.
A
POST request submits the appropriate instance in the body of the request. Each
response
replaces the main instance with the HTTP response data.
In effect, the GN application core sends XML elements to the GN GUI that describe
the
data that specify the type of screen and the information that the GUI is supposed
to
display next. Again, see Figure 2 and Figure 3 for clarification. The specific information that
is to be displayed for each screen type is specified in the domain model for the
GN GUI component in Figure 3.
The body of the XHTML page holds a section for each screen type with XForms widgets
that interact with the XForms model. Information about the current state is displayed
in
a table using XForms output widgets; user input is accepted through XForms input
widgets
and buttons that trigger XForms submissions. Only the screen type area that matches
the
main instance's current screen type is visible. The XForms model has a helper instance
with a CSS attribute "display: none" that is dynamically read into each section
that is
inactive.
A less tabular and more graphical GUI for GN uses XForms widgets linked to the same
XForms model and includes them into an SVG graphic. The widgets are included into
the
SVG code as HTML-encoded foreign objects that can be styled through CSS and positioned
and transformed through SVG. In this variant of the GUI, there are no direct
representations of the conceptual screens. Instead, the widgets themselves know
when to
present themselves depending on the information in the XForms model.
Below, we include a screenshot of the two versions of the GN GUI side by side.
Let us summarize our experience with XForms.
First, XForms fully satisfies our first requirement that the GUI and the application
core exchange declarative data in XML format.
Second, XForms satisfies the second requirement, that the GUI itself can be defined
declaratively, only up to a point:
Due to the set of widgets that XForms offers, XForms GUIs are restricted
to form-based interfaces.
The XForms widgets that a GUI uses are defined in a declarative way, which
includes binding to the XML-encoded instance data. Their functionality is
supported by the XForms processor.
The positioning and styling of XForms widgets can be done in a flexible
and declarative way using HTML, CSS and SVG.
There is an annoying limitation for data access within the XForms GUI:
Instance data, when presented in the GUI, are always wrapped into XForms
widgets. They are not directly part of the HTML. That means that, for
example, the arched text in the graphical GN GUI is literal text in the SVG
code that is displayed on some condition in the XForms instance data. It
cannot, as far as we know, be taken directly from XForms instance data, so
that it can be typeset along the arch by SVG.
Finally, tool support for XForms is adequate but not ideal. HTML browsers
do not support XForms natively. We are using XSLTForms, which depends on
XSLT 1 support in browsers. It is reliable and supports most if not all
features of XForms. It does not appear to be in active development, and
browser support even for XSLT 1 is not guaranteed. A newer option is the
XForms processor that is written in SaxonJS and that only depends on the
ever-present JavaScript support in browsers.
We briefly contrast the use of XForms with the use of HTML forms in the context of HTML and
SVG. On the surface, XForms and HTML forms appear similar since they have similar sets
of widgets. Moreover, HTML forms as part of HTML have great browser support. HTML pages
can
accept XML data in an HTTP response and can display them, styled by CSS, for example
in
a frame. The crux, however, is that HTML forms need to use JavaScript to bind to
these
data for editing or for submission. Hence, the pure combination of HTML, HTML forms
and
SVG completely fails our first requirement for GUI technologies and falls short
of the
second one in central aspects.
Web Components
Compositional and reusable components are a promising idea in GUI development that
has found its way into HTML. It is common practice to use
arbitrary XML elements in an HTML context. Current browsers classify such elements
as
HTMLUnknownElement, insert them into the DOM and format them as inline elements
like
span elements. They even apply CSS styles to them. The Web Components
specification [WC19] builds on that practice by
classifying custom elements as "proper" HTMLElement objects and by extending the
behaviour of such elements, thus turning them into real components.
The Web Components specification enables developers to define custom elements with
custom attributes that are used just like built-in elements in an HTML document
and that
are treated just like built-in elements by HTML browsers. They are seamlessly integrated
into the DOM and available to JavaScript code through the DOM HTML API. They can
be
styled with CSS, observed by event listeners, go in and out of focus according
to
keyboard events etc. The real innovation is that custom elements can have their
own
custom behaviour that is defined by JavaScript. They are also capable of encapsulating
their own data and style through a shadow DOM. A custom list element for a todo
list,
for example, can offer ways to tick items or to collapse and expand sublists.
On the simplest level, a custom element can just expand the HTML vocabulary, as
demonstrated in Figure 5. The thus-defined custom element todo-list
behaves just like a span element but has semantic meaning built into its name.
Behaviour and style are added by extending the class of the custom element with
lifecycle functions.
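As a minimal sketch of such a definition, the following JavaScript registers a todo-list element and adds a small behaviour through the connectedCallback lifecycle function. The class name TodoList and the "done" CSS class are illustrative assumptions. The guards around HTMLElement and customElements are only there so that the snippet can also be loaded outside a browser.

```javascript
// Outside a browser (for example under Node.js) there is no HTMLElement;
// fall back to an empty base class so that the file still loads.
const BaseElement = typeof HTMLElement === "undefined" ? class {} : HTMLElement;

class TodoList extends BaseElement {
  // Lifecycle function: called by the browser when the element enters the DOM.
  connectedCallback() {
    this.addEventListener("click", (event) => {
      // Tick or untick the list item that was clicked, if any.
      const item = event.target.closest("li");
      if (item) {
        item.classList.toggle("done");
      }
    });
  }
}

// Register the element name; custom element names must contain a hyphen.
if (typeof customElements !== "undefined" && !customElements.get("todo-list")) {
  customElements.define("todo-list", TodoList);
}
```

Without the connectedCallback method, the class would only extend the vocabulary. With it, every todo-list element in a page gains the tick behaviour without any further scripting in the page itself.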
Obviously, in the context of GUIs new widgets can be defined as custom elements. It
is
even conceivable to define a system of widgets that are bound to XML data, reinventing
XForms.
Ulrich in his Bachelor Thesis [U18] has defined
a custom element ws-stream for HTML GUIs that establishes a WebSocket connection
to a
server and then handles XML data that are pushed over that connection, thus supporting
our first requirement for GUI technologies. We summarize this work in section “WebSocket Element”.
As a component approach, custom elements within the Web Components context certainly
support reuse of components. Custom elements can also be composed from lower-level
components. They are lacking a system of parameter passing and communication, though,
that is the hallmark of composition support in React.
Server support for multi-client applications
Modern web applications, particularly games, are often multi-client systems.
Multi-client applications require communication patterns that let clients talk
to each
other or that allow a server to push messages to clients without prior requests.
The
HTTP extension protocol WebSocket supports these patterns. It has been identified
as the
best current technology for these purposes with respect to support and
functionality [C17].
Multi-client applications require both servers and clients to support the new
protocols. In this section, we address WebSocket support for BaseX, the server
system
that we use in our projects.
In previous work [ABCES17] we have outlined how to
integrate server-push into the X stack, based on a modified form of BaseX that
was first
presented by Conrads [C17]. The concepts and
implementations were later refactored and better integrated into the BaseX server
by
Ulrich [U18] who also proposed a client-side
solution as a counterpart to the server. As thesis work at University of Konstanz,
Finckh [F18] collaborated with the BaseX
team to integrate a WebSocket implementation into the BaseX production system.
Today, BaseX natively supports WebSocket with RestXQ-like annotations to react to
different WebSocket events (onConnect, onMessage) on the server side. The BaseX
documentation [B19] describes the usage and
application of the new annotations and the new WebSocket XQuery module used to
send
messages or set WebSocket parameters.
The WebSocket protocol is low-level with little built-in support for commonly required
features. Hence, it can be combined with STOMP, a messaging protocol that runs on top of
WebSocket and that supports channels and explicitly defines message formats. STOMP
support is part of a BaseX development version that has not been officially released yet.
In section “WebSocket Element”, we discuss a new HTML component for GUIs
called WebSocket Element that handles the client side. The WebSocket Element can
interface with any server component that supports STOMP over WebSocket. Our demo
applications use the BaseX development version that supports STOMP over
WebSocket.
WebSocket Element
Requirements
A GUI that participates in a multi-client web application needs two capabilities:
first, a method to initiate a connection with a server through WebSocket; second, a
way to receive and process data through this connection from the server.
The HTML5 WebSocket API provides these capabilities through JavaScript code.
Ulrich [U18] encapsulates these tasks in
a new, purely declarative HTML component that he calls WebSocket Element. A
WebSocket Element in an HTML page looks just like a built-in HTML element that
is
configured through attributes. WebSocket Element is defined, however, as a custom
element in the Web Components framework that was introduced in section “Web Components” and, hence, has interesting behaviour.
The abstract idea of a custom HTML element that wraps the client side goes back to
the Master Thesis of Conrads [C17]. Custom
elements as defined in the Web Components specification turn out to be the perfect
fit to implement this concept. The implementation defines the functionality of
the
WebSocket Element with JavaScript and uses the WebSocket protocol to allow
synchronous bidirectional communication. In fact, our implementation uses the
STOMP
protocol with its predefined message formats and channel concepts. After initiating
the bidirectional connection with a server on load of the HTML page, the WebSocket
Element can then receive declarative data in the form of XML to which it can apply
its own XSL transformation or it can receive and render SVG or HTML data that
were
generated by the server. Hence, the WebSocket Element contributes to our two main
requirements for GUI technologies in a multi-client scenario.
Basic WebSocket Element
In its most basic form the WebSocket Element looks like the following:
Behind the scenes, the WebSocket Element is a custom element as explained
in section “Web Components”. The attributes configure the functionality
of the custom element and are used by its JavaScript implementation. It has an
id
like other HTML elements which allows us to identify and style it with CSS or
to
dynamically add and remove WebSocket Elements to and from the DOM through
JavaScript. Furthermore, the WebSocket Element needs to know the location to which
the WebSocket connection should be established. The attribute url lets us specify
the
server address. To separate different applications and to create channels within
a
web application the subscription attribute can be set to a path to which the
WebSocket Element automatically subscribes after the connection is established.
It
then listens to WebSocket messages on the subscribed paths or channels and inserts
the data it receives into its own content. Since the page doesn't have to be
reloaded and the content is streamed continuously, the application has the look
and
feel of single-page applications.
The usage of the WebSocket Element is as simple as importing the necessary
JavaScript files and defining the element somewhere on the HTML page. As soon
as the
page is loaded by the browser the WebSocket Element connects to the WebSocket
server, handles the subscription process with the server and initiates the element
based on the given configuration.
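The following JavaScript sketch illustrates what such an element might do behind the scenes: read the url and subscription attributes, open a connection, subscribe via STOMP and insert incoming message bodies into its own content. It is not the actual implementation of the WebSocket Element. The function names, the subscription id sub-0 and the use of the URL's host name as STOMP host header are assumptions, and STOMP header escaping is ignored for brevity.

```javascript
// Serialise a STOMP frame: command line, header lines, blank line, body, NUL.
function stompFrame(command, headers = {}, body = "") {
  const headerLines = Object.entries(headers).map(([key, value]) => `${key}:${value}`);
  return [command, ...headerLines, "", body].join("\n") + "\0";
}

// Parse a STOMP frame back into command, headers and body (no escaping handled).
function parseStompFrame(text) {
  const end = text.indexOf("\0");
  const raw = end === -1 ? text : text.slice(0, end);
  const split = raw.indexOf("\n\n");
  const head = split === -1 ? raw : raw.slice(0, split);
  const body = split === -1 ? "" : raw.slice(split + 2);
  const [command, ...lines] = head.split("\n");
  const headers = {};
  for (const line of lines) {
    const colon = line.indexOf(":");
    if (colon > 0) {
      headers[line.slice(0, colon)] = line.slice(colon + 1);
    }
  }
  return { command, headers, body };
}

// Connect an element that carries url and subscription attributes to its server.
// WebSocketImpl is injectable so that the sketch can be tried without a browser.
function attachStream(element, WebSocketImpl = globalThis.WebSocket) {
  const url = element.getAttribute("url");
  const destination = element.getAttribute("subscription");
  const socket = new WebSocketImpl(url);
  socket.onopen = () => {
    // Assumption: the URL's host name serves as STOMP virtual host.
    const host = new URL(url).hostname;
    socket.send(stompFrame("CONNECT", { "accept-version": "1.2", host }));
  };
  socket.onmessage = (event) => {
    const frame = parseStompFrame(String(event.data));
    if (frame.command === "CONNECTED") {
      // Subscribe automatically once the STOMP session is established.
      socket.send(stompFrame("SUBSCRIBE", { id: "sub-0", destination }));
    } else if (frame.command === "MESSAGE") {
      // Insert the pushed data as content; only safe for trusted servers.
      element.innerHTML = frame.body;
    }
  };
  return socket;
}
```

In a real custom element, attachStream would be called from connectedCallback with the element itself as argument, so that the page author only writes the element and its attributes.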
Imperative and declarative approach
Let us contrast the simple declarative use of the WebSocket Element to the
imperative approach that we have used previously [C17] with one of our modified versions of the BaseX
server to implement server-push with channels. The JavaScript code in Figure 7 demonstrates that we had to instantiate a
custom endpoint object and parameterize it with callback functions that map to
WebSocket events onMessage, onClose and onOpen. The endpoint object, when started,
using the callback function configured for onOpen, would call the WebSocket API
of
the browser behind the scenes to open a connection to the modified BaseX server.
The
callback function would need to create a JSON object used as a subscribe frame
to
tell the server which channel to subscribe to.
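For readers without access to the figure, the following JavaScript sketch shows the general shape of such imperative code. It is a reconstruction under assumptions and not the code of Figure 7: the function name openChannel, the layout of the JSON subscribe frame and the example URL are hypothetical. Only the WebSocket constructor and its onopen, onmessage and onclose handlers are standard browser API.

```javascript
// Open a connection, subscribe to a channel and wire up the three callbacks.
// WebSocketImpl is injectable so that the sketch can be tried without a browser.
function openChannel(url, channel, handlers, WebSocketImpl = globalThis.WebSocket) {
  const socket = new WebSocketImpl(url);
  socket.onopen = () => {
    // Hypothetical JSON subscribe frame; the real frame layout is not shown here.
    socket.send(JSON.stringify({ type: "subscribe", channel }));
    if (handlers.onOpen) handlers.onOpen();
  };
  socket.onmessage = (event) => {
    if (handlers.onMessage) handlers.onMessage(event.data);
  };
  socket.onclose = () => {
    if (handlers.onClose) handlers.onClose();
  };
  return socket;
}

// Browser usage with an assumed URL, channel and target element id:
// openChannel("ws://example.org/ws", "gn", {
//   onMessage: (data) => { document.getElementById("board").innerHTML = data; },
//   onClose: () => { console.log("connection closed"); },
// });
```

Even in this reduced form, the page author has to deal with callbacks, frame construction and DOM manipulation, all of which the declarative element hides behind attributes.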
The most obvious difference to our declarative WebSocket Element approach is the
higher complexity in terms of length and code. While this code may not be hard
to
understand for more experienced developers it does present an entry barrier to
web
development for domain experts and people without any significant coding experience
who could use their domain expertise to build web applications with XML [B16]. With the WebSocket Element, no knowledge of JavaScript callback functions,
variables or loops is needed, and the necessary security checks to ensure correctly
formed attributes are done in the background. The
imperative solution becomes even more complicated when multiple WebSocket
connections to different destinations must be opened, as this requires duplicate
and
more complex code. The WebSocket Element can be declared and configured multiple
times on the same page like any other HTML element. Like other HTML elements it
conveys meaning in the tag itself and encapsulates its functionality which could
be
extended easily in the future by adding more attributes. The modularity of the
declarative approach makes it flexible yet easy to use as many attributes are
optional.
Advanced functionality
Our basic WebSocket Element contains only the mandatory attributes to establish a
WebSocket connection. There are many additional parameters which can be used to
extend the functionality as needed. These include settings about automatic
reconnection, ping frequency, client-side XSLT support and initial data to load.
In
the example above the data received from the server will not be altered, only
inserted into the page for rendering as content for the WebSocket Element. This
is
suitable for HTML with CSS, SVG or other pre-generated formats supported by the
browsers. The following code shows a WebSocket Element which receives raw XML
data
from the server which is then transformed by the browser given an XSL stylesheet
and
parameters. Additionally the geturl attribute specifies a relative URL from which
the first state of the element should be fetched. One important goal was to make
the
element easy to use but extendable for advanced tasks.
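The client-side transformation step can be sketched with the DOMParser and XSLTProcessor interfaces that browsers with XSLT 1 support provide. This is an assumption about how such a step could be written and not the code of the WebSocket Element. The function name renderWithXslt and its parameters are hypothetical, and the env parameter only exists so that the browser interfaces can be replaced by stand-ins outside a browser.

```javascript
// Transform XML text with an XSL stylesheet and show the result in `target`.
// `params` maps stylesheet parameter names to values.
function renderWithXslt(xmlText, xslText, params, target, env = globalThis) {
  const parser = new env.DOMParser();
  const xmlDoc = parser.parseFromString(xmlText, "application/xml");
  const xslDoc = parser.parseFromString(xslText, "application/xml");

  const processor = new env.XSLTProcessor();
  processor.importStylesheet(xslDoc);
  for (const [name, value] of Object.entries(params)) {
    // The first argument is the parameter's namespace URI; null means none.
    processor.setParameter(null, name, value);
  }

  // Build a fragment owned by the target's document and replace the old content.
  const fragment = processor.transformToFragment(xmlDoc, target.ownerDocument);
  target.replaceChildren(fragment);
}
```

Called from the message handler of the element, a function like this would turn each pushed state description into fresh HTML or SVG content without a page reload.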
Architecture
The figure below demonstrates the interactions between the different components in
a multi-client web application using the WebSocket Element and a WebSocket enabled
server in an MVC architectural style. In a multi-client application many different
views and WebSocket Element instances can exist. For the sake of simplicity, the
figure shows the communication from only one client's perspective. From the GUI
perspective, WebSocket Elements are part of the View and incoming messages will
update the display according to how the View is realized. In the figure, the
WebSocket Element and the View are shown as separate components to illustrate
the
interaction.
This architecture extends our reference architecture as shown in Figure 1. A View communicates with the web server
through HTTP but is not using the HTTP response. Instead, the web server triggers
a
method in the Controller through RestXQ. The Controller defines through the
WebSocket module a response that it wants to be sent to all subscribing WebSocket
Element instances. The web server delivers the response, and the WebSocket Element
optionally transforms it, inserts it into its content and thereby updates the view.
The
bidirectional channel between the WebSocket Element and the web server is used
for
ping and pong messages to ensure the liveness of the connection and to handle
the
subscription to channels.
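The subscription and push flow described above can be sketched without any network code. The following JavaScript is a minimal in-memory model, not the BaseX WebSocket module and not the WebSocket Element itself; the names `ChannelBroker`, `subscribe` and `publish` are hypothetical, and the channel names are borrowed from the Tic-Tac-Toe example.

```javascript
// Minimal in-memory model of channel-based server push (illustrative only).
class ChannelBroker {
  constructor() {
    // channel name -> Set of subscriber callbacks
    this.channels = new Map();
  }

  // A client (standing in for a WebSocket Element) subscribes to a channel.
  // Returns a function that cancels the subscription.
  subscribe(channel, onMessage) {
    if (!this.channels.has(channel)) {
      this.channels.set(channel, new Set());
    }
    const subscribers = this.channels.get(channel);
    subscribers.add(onMessage);
    return () => subscribers.delete(onMessage);
  }

  // The controller pushes a message to every subscriber of a channel.
  // Returns the number of subscribers that were notified.
  publish(channel, message) {
    const subscribers = this.channels.get(channel);
    if (!subscribers) return 0;
    for (const deliver of subscribers) deliver(message);
    return subscribers.size;
  }
}

// Usage: two views subscribe to their own channels and record what arrives.
const broker = new ChannelBroker();
const viewX = [];
const viewO = [];
broker.subscribe("/ttt/X", (msg) => viewX.push(msg));
const leaveO = broker.subscribe("/ttt/O", (msg) => viewO.push(msg));

broker.publish("/ttt/X", "<svg/>");
broker.publish("/ttt/O", "<svg/>");
leaveO();
broker.publish("/ttt/O", "<svg/>"); // nobody is subscribed any more
```

In this model a view never receives data as the direct response to its own request; it only sees what is published on a channel it has subscribed to, which mirrors the separation between the HTTP request and the pushed response in the architecture.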
Example application: Tic-Tac-Toe
We have implemented a two-player demo application for the game Tic-Tac-Toe. The
demo uses BaseX on the server side. In this case, the application core transforms
a declarative state description into SVG and sends it to the individual clients. A
version that passes the state description to the WebSocket Element in each of the
clients and performs the XSL transformation into SVG on the client side is equally
possible.
This section highlights some important methods and concepts of the WebSocket
Element to show its practical use in a multi-client web
application. The Tic-Tac-Toe game is implemented on the BaseX server (including
STOMP support). It uses XML technologies:
XQuery as the server-side programming language.
XSLT to generate HTML and SVG to render in the browser.
XML as data format to describe a Tic-Tac-Toe game.
Furthermore it implements the following functionality:
Playable by 2 players in a distributed way.
Only one instance of the game, so only one game server to play
on.
We will look at two methods which are important for a multi-client application and
show how the WebSocket Element is created and the communication channels
established.
Before players can play a game together they first need to join the game.
As seen in Figure 10, the join method uses RESTXQ and
awaits a POST request on the specified path. Its main purpose is to build an HTML
page for the calling client and to create a WebSocket Element with the specified
playerID. The method uses BaseX's request module to get the hostname and port
components of the incoming HTTP request. It then constructs the WebSocket path to
which the WebSocket Element will later connect, the URL from which the first state
should be retrieved (getURL) and the subscription attribute for the WebSocket Element.
Because the request module is used, no parameters such as "localhost" or the port
have to be hard-coded, and the game can be played in different network
configurations, not only on localhost.
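The idea of deriving all connection parameters from the incoming request can be illustrated as follows. This is a JavaScript sketch rather than the XQuery join method of the paper; the function name `buildConnectionConfig` and the path segments `/ws/ttt` and `/ttt/draw` are assumptions made for illustration, and only the channel pattern `/ttt/<player>` is taken from the text.

```javascript
// Derives the values that a join step needs from the host and port of the
// incoming request, so that nothing like "localhost" is hard-coded.
// All paths except the "/ttt/<player>" channel pattern are hypothetical.
function buildConnectionConfig(hostname, port, playerId) {
  if (!hostname) throw new Error("hostname is required");
  const authority = port ? `${hostname}:${port}` : hostname;
  return {
    // where the WebSocket Element would connect
    wsUrl: `ws://${authority}/ws/ttt`,
    // where the first state would be fetched from
    getUrl: `http://${authority}/ttt/draw`,
    // channel the element would subscribe to
    subscription: `/ttt/${encodeURIComponent(playerId)}`,
  };
}

// Usage: the same code serves any host the request arrived on.
const configX = buildConnectionConfig("game.example.org", 8984, "X");
const configO = buildConnectionConfig("192.168.0.10", 8984, "O");
```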
The join method then proceeds to create the HTML page, which contains all necessary
dependencies (jQuery and STOMP) and the JavaScript file for the WebSocket Element
itself. Inside the body of the HTML page, our newly defined WebSocket Element is
declared and configured using parameters. After that, the method returns the HTML to
the client; the browser starts parsing the page and connects via WebSocket to the
URL we specified in our join method. Furthermore, the subscription attribute is
evaluated to join, for example, the channel "/ttt/X". The client will be reachable
through the channel "/ttt/X" while another client can join channel "/ttt/O".
Finally, the WebSocket Element will issue a GET request to our getURL, which in the
case above is our draw method. This will trigger the server to push the current
state to all clients. We have now successfully established a WebSocket connection
with the BaseX server by calling our join method.
Now that clients are connected through WebSocket Elements to our BaseX server,
we can send messages to them to inform them of the state of the
game. This is accomplished using a generic draw
method, shown in Figure 11. The main purpose of the method is to
send the current game state through WebSocket to all connected clients.
The drawGame method does not change the state of our game and is annotated with
RestXQ's GET, awaiting requests to the specified path. The game is described by an
XML model, which is transformed into HTML and SVG by the stylesheet "ttt_game.xsl".
The method gets the stylesheet and the wsIDs of all currently connected clients.
The getIDs() method uses BaseX's ids() method from the WebSocket module. In its
return clause, the method iterates through all connected WebSocket clients, gets
their respective ids and generates for each of them a visual presentation of the
game using the aforementioned stylesheet. In a last step, drawGame uses the send
method, which sends the transformedGame to the clients using the
sendChannel($data, $path) method introduced with BaseX's STOMP server.
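The render-then-push pattern of the draw step can be sketched in JavaScript. The sketch replaces the XSLT stylesheet with plain string templating and replaces the server's push function with a callback, so it is an analogy rather than the drawGame method itself; `renderBoardSvg`, `drawGame` and `send` are hypothetical names, and for brevity the board is rendered once and the same SVG is sent to every client.

```javascript
// Renders a 3x3 board as an SVG string. `board` is an array of nine cells,
// each assumed to be "X", "O" or "" (so no XML escaping is needed).
function renderBoardSvg(board) {
  if (board.length !== 9) throw new Error("board must have 9 cells");
  const cells = board.map((mark, i) => {
    const x = (i % 3) * 100;
    const y = Math.floor(i / 3) * 100;
    const label = mark
      ? `<text x="${x + 50}" y="${y + 65}" font-size="60" text-anchor="middle">${mark}</text>`
      : "";
    return `<rect x="${x}" y="${y}" width="100" height="100" fill="white" stroke="black"/>${label}`;
  });
  return `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 300 300">${cells.join("")}</svg>`;
}

// Pushes the rendered state to the channel of every connected player.
// `send(data, channel)` is a hypothetical stand-in for the server's push call.
// The game state itself is not modified.
function drawGame(board, playerIds, send) {
  const svg = renderBoardSvg(board);
  for (const id of playerIds) {
    send(svg, `/ttt/${id}`);
  }
  return playerIds.length;
}

// Usage: collect what would be pushed to the two players.
const pushed = [];
const board = ["X", "", "", "", "O", "", "", "", ""];
drawGame(board, ["X", "O"], (data, channel) => pushed.push({ channel, data }));
```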
Inside the browser on the client side, the WebSocket Element receives the HTML and
SVG sent by the drawGame method and updates its view accordingly, by merely
inserting it into its own content. The clients can now issue an action in the
game, which ultimately triggers the drawGame method to propagate the change of state
to all connected clients. Figure 12 shows the user interfaces of two
clients playing against each other.
The two functions illustrate some important aspects of working with the WebSocket
Element and BaseX. They show that building a multi-client application within the
X stack does not result in long, complicated code and that only a few additional
methods are needed to handle the WebSocket connection. Moreover, many functions
used by a locally playable Tic-Tac-Toe can be reused without further modification
in a multi-client application.
Final remarks
WebSocket is a widely supported technology which all major browsers implement. The
custom element (V1) specification is fully supported by Chrome and Firefox, whereas
Safari and Opera only implement the special case on which the WebSocket Element
is
built [CIU19]. Microsoft Edge does not
implement custom elements yet, but support for this often requested feature is
marked as "in development" [M19].
The WebSocket Element works in conjunction with HTML, CSS and SVG on the client
side. We are currently investigating how the WebSocket Element can be integrated
into other GUI technologies such as XForms and SaxonJS.
The WebSocket Element simplifies the development of multi-client web applications
in the X stack, as shown by the proof of concept Tic-Tac-Toe, which is implemented
by using BaseX's STOMP WebSocket implementation on the server side and the WebSocket
Element on the client side.
Statecharts and SCXML
GUI components are commonly considered to be event-driven systems whose functionality
is triggered by events, mostly from user interactions. Typically, due to the nature
of those systems, there are constraints on the legal sequences of events, and the
specific activity that is triggered may depend only on a specific pattern in the
history of previous events. The classical tool for modeling such abstract “behaviour”
of an event-driven system is the statechart. Statecharts were first introduced by
Harel, as documented in a book [HP98] he co-authored with
Politi. Later, in the object-oriented variant of state diagrams, they became part
of UML2; see [SSHK15] for a textbook introduction
and [H99] for an extensive discussion of the
use of statecharts in software engineering and how they help to cut the complexity
of models and of model-driven implementations.
Most recently, with SCXML (State Chart XML) [B15], an
XML encoding language for statecharts has been standardized, bringing statecharts
into the realm of XML technologies. SCXML fully supports the semantics of statecharts
defined by Harel and, furthermore, specifies additional elements, for example for
communication with external systems or for the execution of specific activities. A
number of research papers discuss the use of SCXML in particular, among them the
Bachelor Thesis of Roxendal [R10], an invited expert to the W3C
committee that defined SCXML.
The introduction of SCXML has led to a need for, and the rise of, systems that are
able to interpret or run an SCXML-encoded statechart that models the behaviour of a
system, calling system activities when changing state as defined by the statechart
and thereby making SCXML executable in a system. Such SCXML processors directly
execute models of application behaviour, interfacing with application activities.
Grubmüller [G18] discusses a number of such SCXML processors,
which differ mainly in their programming languages and in how fully they support
the standardized semantics of SCXML.
In a web application that is modeled in the MVC architectural style, the Model
component is another event-driven system that may have dynamically instantiated
sub-components that are themselves event-driven. In a game, for example, we might
have a
single lounge in which players assemble for games and a sub-component for each
game that
is currently active. Events are API calls for the Model in the form of function
calls or
HTTP requests that are typically issued by the Controller component. In the X stack,
the
Model component is implemented as an XQuery module that runs, for example, in the
BaseX
database system [B16]. That module needs to be able
to instantiate SCXML processors for SCXML-encoded statecharts at runtime, to forward
events to these SCXML processors and eventually to delete the SCXML processors.
Obviously, the module also supports an API to handle activities that are triggered
by
the statecharts.
In a previous study [ABCES17], we investigated the
possibility of using an SCXML processor that is implemented in XQuery [S15,E17] to support the
implementation of a Model component in a web application. Although it is attractive
to use an SCXML processor that is implemented in XQuery in the X stack, the
limitations in functionality of current XQuery implementations have led us to a
different approach, namely to integrate the Apache Commons SCXML Interpreter [A17] into the X stack [G18]. The Apache Commons SCXML Interpreter is a stable,
functionally complete and well-documented SCXML interpreter that is implemented in
Java as a project of the Apache Software Foundation.
How can the Apache Commons SCXML Interpreter, which is implemented in Java, be
connected to a Model component that is implemented as an XQuery module and runs
in the
BaseX database system? Grubmüller [G18]
provides a solution that is based on Java bindings as offered by BaseX
to make Java classes available to XQuery
modules. The solution is illustrated in the figure below.
To this end, a new XQuery module handles the communication between the XQuery module
of the Model component and the SCXML interpreters, which are Java objects. The
particulars of Java bindings in BaseX require a second communication module, written
in Java, that manages the different SCXML interpreters that are active at any time,
such that every instance of the SCXML interpreter maintains its own state, for
example, the state of one specific game instance. In addition, the current state of
every interpreter and the events that are possible in that state are saved in XML
format, so that this information can be retrieved while keeping the number of Java
binding calls to a minimum.
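The bookkeeping described above, one interpreter per game instance plus a cached XML snapshot of its current state and enabled events, can be sketched as follows. The sketch is in JavaScript and uses a tiny flat state machine in place of Apache Commons SCXML; the class names, the snapshot format and the state and event names in the usage example are all hypothetical.

```javascript
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// A tiny flat state machine standing in for one SCXML interpreter instance.
// definition: { initial, transitions: { state: { event: targetState } } }
class Interpreter {
  constructor(definition) {
    this.definition = definition;
    this.state = definition.initial;
  }
  enabledEvents() {
    return Object.keys(this.definition.transitions[this.state] || {});
  }
  fire(event) {
    const outgoing = this.definition.transitions[this.state] || {};
    if (!hasOwn(outgoing, event)) return false; // not legal in this state
    this.state = outgoing[event];
    return true;
  }
}

// Keeps one interpreter per id and caches an XML snapshot of each, so that
// readers of the current state need not call into the interpreter.
class InterpreterRegistry {
  constructor() {
    this.interpreters = new Map();
    this.snapshots = new Map();
  }
  create(id, definition) {
    if (this.interpreters.has(id)) throw new Error(`duplicate id: ${id}`);
    this.interpreters.set(id, new Interpreter(definition));
    this.refresh(id);
  }
  // Rebuilds the cached snapshot; ids, states and event names are assumed
  // to contain no characters that need XML escaping.
  refresh(id) {
    const interpreter = this.interpreters.get(id);
    const events = interpreter
      .enabledEvents()
      .map((name) => `<event name="${name}"/>`)
      .join("");
    this.snapshots.set(
      id,
      `<interpreter id="${id}" state="${interpreter.state}">${events}</interpreter>`
    );
  }
  fire(id, event) {
    const interpreter = this.interpreters.get(id);
    if (!interpreter) throw new Error(`unknown id: ${id}`);
    const accepted = interpreter.fire(event);
    if (accepted) this.refresh(id);
    return accepted;
  }
  snapshot(id) {
    return this.snapshots.get(id);
  }
  remove(id) {
    this.snapshots.delete(id);
    return this.interpreters.delete(id);
  }
}

// Usage: two games share one definition but keep independent state.
const tttDefinition = {
  initial: "X_TURN",
  transitions: {
    X_TURN: { move: "O_TURN" },
    O_TURN: { move: "X_TURN", finish: "GAME_OVER" },
    GAME_OVER: {},
  },
};
const registry = new InterpreterRegistry();
registry.create("game1", tttDefinition);
registry.create("game2", tttDefinition);
registry.fire("game1", "move");
```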
Two case studies with different levels of complexity in terms of system behaviour,
Tic-Tac-Toe and Blackjack, demonstrate the validity of the approach. In both cases,
the
declarative modeling of the behaviour of the Model component as statecharts, the
representation of the behaviour models as SCXML documents and their interpretation
with
the Apache Commons SCXML interpreter facilitate a model-driven approach which also
highlights the added value of statecharts. First of all, using statecharts for
describing and modeling the behaviour of a system fully captures how the system
should
behave under certain conditions and events in a standardized manner. This gives
everyone dealing with the system a clear picture that leaves no room for
misunderstanding, as demonstrated by the model of the behaviour of the game
Blackjack [G18], which is shown in the
figure below.
As the game Blackjack has non-trivial behaviour (there are several actions players
can take under certain conditions), statecharts are the appropriate modeling tool,
as they provide a rich set of features such as higher-order states that allow the
creation of a logical hierarchy of states. One example is the pair of states
“GAME_RUNNING” and “GAME_OVER”, which are on the same
level of abstraction, while “GAME_RUNNING” consists of several lower-order
states that describe the game while it is being played. Another benefit of this
approach is that all main functions needed to implement the system are effectively
predefined within the statechart in the form of events that connect the states.
This improves the understanding of the system and allows for a more
structured way to implement it.
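A logical hierarchy of states can be illustrated with a small JavaScript sketch. Only the state names GAME_RUNNING and GAME_OVER come from the text; the sub-states, the events and the helper functions are hypothetical, and the sketch covers only the lookup of a transition through the hierarchy, not the full statechart semantics (no entry or exit actions, no initial sub-states, no parallel states).

```javascript
// Each state may name a parent. A transition declared on a parent applies
// to all of its descendants unless a descendant handles the event itself.
const blackjack = {
  initial: "BETTING",
  states: {
    GAME_RUNNING: { parent: null, on: { abort: "GAME_OVER" } },
    BETTING: { parent: "GAME_RUNNING", on: { bet: "PLAYER_TURN" } },
    PLAYER_TURN: {
      parent: "GAME_RUNNING",
      on: { hit: "PLAYER_TURN", stand: "DEALER_TURN" },
    },
    DEALER_TURN: { parent: "GAME_RUNNING", on: { settle: "GAME_OVER" } },
    GAME_OVER: { parent: null, on: {} },
  },
};

// Returns the chain [state, parent, grandparent, ...].
function ancestry(chart, state) {
  const chain = [];
  for (let s = state; s !== null; s = chart.states[s].parent) {
    if (!Object.prototype.hasOwnProperty.call(chart.states, s)) {
      throw new Error(`unknown state: ${s}`);
    }
    chain.push(s);
  }
  return chain;
}

// True if `state` is `ancestor` itself or nested somewhere inside it.
function isIn(chart, state, ancestor) {
  return ancestry(chart, state).includes(ancestor);
}

// Resolves an event from the innermost state outwards. Returns the target
// state, or the unchanged state if nothing in the chain handles the event.
function step(chart, state, event) {
  for (const s of ancestry(chart, state)) {
    const on = chart.states[s].on;
    if (Object.prototype.hasOwnProperty.call(on, event)) return on[event];
  }
  return state;
}

// Usage: "abort" is declared once on GAME_RUNNING but works in any sub-state.
let current = blackjack.initial;
current = step(blackjack, current, "bet");
current = step(blackjack, current, "abort");
```

Declaring the shared transition once on the higher-order state is what keeps the model small: without the hierarchy, every sub-state would need its own copy of it.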
SCXML supports all the semantics introduced by statecharts, and thus the encoding
in SCXML is straightforward. Furthermore, the implementation of Apache Commons SCXML
allows Java functions to be called within state transitions from an SCXML document [A17]. This is important, as every interpreter instance has to be
able to call the corresponding application functions, which in our case are located
in an XQuery module. This is realized by sending HTTP requests from a custom Java
function to the function in the XQuery module by using RestXQ annotations [G18]. This approach also allows any number
of HTTP requests to be sent within one state transition, meaning that several XQuery
functions can be called independently. As a result, the SCXML interpreter fully
controls how the system behaves and which functions are called when a certain event
occurs and states are changed.
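The principle that a transition can trigger any number of application calls can be sketched in JavaScript. The sketch does not use Apache Commons SCXML or real HTTP; `createMachine` is a hypothetical helper, `call` stands in for an HTTP client, and the request paths are invented for illustration.

```javascript
// Each transition carries a target state and a list of request descriptions.
// Firing an event hands every request to `call`, in order, and then changes
// state. The machine decides what is called; the application code behind
// `call` contains no behavioural logic of its own.
function createMachine(definition, call) {
  let state = definition.initial;
  return {
    get state() {
      return state;
    },
    fire(event) {
      const outgoing = definition.transitions[state] || {};
      if (!Object.prototype.hasOwnProperty.call(outgoing, event)) {
        return false; // event is not legal in the current state
      }
      const { target, requests } = outgoing[event];
      for (const request of requests) {
        call(request.method, request.path);
      }
      state = target;
      return true;
    },
  };
}

// Usage: one transition triggers two independent application functions.
const callLog = [];
const machine = createMachine(
  {
    initial: "X_TURN",
    transitions: {
      X_TURN: {
        move: {
          target: "O_TURN",
          requests: [
            { method: "POST", path: "/ttt/placeMark" }, // hypothetical path
            { method: "GET", path: "/ttt/draw" }, // hypothetical path
          ],
        },
      },
      O_TURN: {},
    },
  },
  (method, path) => callLog.push(`${method} ${path}`)
);
machine.fire("move");
```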
This model-driven approach achieves a strict separation between behaviour and
implementation of the system logic, as the behavioural component is completely
expressed within the statechart and its SCXML encoding. Through this separation,
the implementation of the system logic becomes much clearer and more compact, as
the behaviour is controlled separately. This effect was observed even with a
comparatively simple system like Tic-Tac-Toe and becomes even more pronounced and
useful for more complex systems.
We are currently investigating how to transfer the work that was done for Model components
that run in an XML database system on the server to GUI components that run in
a web
browser. The goal, again, is a declarative approach that generates code from models.
We
intend to model the behaviour of GUI components with statecharts, encode the statecharts
with SCXML and have them executed by SCXML interpreters. Since there are a number
of
promising SCXML processors written in JavaScript [G18], the language of web browsers, we expect the
integration of SCXML processors into the client side of the X stack to be
straightforward.
Conclusion and future work
In this paper, we have extended our coherent and coordinated set of practices for
developing XML-powered web applications from models by taking a closer look at
technologies for graphical user interfaces. The practices draw on previous work
and have
been and are being vetted with case studies. As always, proven principles and practices
from software engineering have been a source of inspiration.
We have presented a number of GUI technologies that are useful for our purposes. They
have different strengths and weaknesses, so we still need to come up with a framework
to
mix and match these technologies.
We have looked at GUI technologies in browsers in the context of the trusted
MVC architectural style, which allows the user interface to be decoupled from the
other components of an application. We have defined requirements for GUI
technologies and we have investigated and evaluated a number of specific
technologies.
A number of GUI technologies are still under consideration, most notably
SaxonJS and React. We are also interested in ways to deal with time in GUIs and
for enriching GUIs with post-WIMP interaction methods and computational
capabilities.
Previously, we have shown how RestXQ annotations of XQuery functions enable us
to rely on pure HTTP communication between clients and servers (no frameworks!).
We have extended this declarative approach to communication over the STOMP and
WebSocket protocols for server push, introducing a newly developed HTML
component called WebSocket Element [U18]
that initiates communication to a server and handles the incoming XML data. This
is a declarative way to integrate AJAX-like calls into a web page with the added
feature of allowing for incoming data from servers that are not preceded
one-on-one by requests from the client. The BaseX server is now enabled for
server-push communication through RestXQ-like annotations and a new WebSocket
module. This work was inspired by previous theses at TUM [C17,U18].
The complexity of event-based systems such as model or user interface
components can be reduced by introducing the concept of behaviour that is
modeled by statecharts. We have demonstrated how SCXML-encoded statecharts can
be instantiated dynamically and executed with XML technology in a model
component by interfacing to the fully functional SCXML processor of the Apache
Commons SCXML project [G18]. Further
work will look into JavaScript SCXML processors that can be integrated into a
GUI component in a browser.
All solutions are based on W3C or industry standards and use freely available software
components.
The main motivation for this work is to be able to generate serious games as web
applications from models. Systematic analysis of user interactions with these games
is used to determine variants of games in an adaptive fashion, to improve usability
and to
further learning.
Another motivation for this line of work has been to support XML experts as end-user
programmers.
In teaching a lab course on XML technology, we have continued to find
that no-frills web applications, reduced to essentials, which do not require any
frameworks, are a useful and appreciated pedagogical approach for teaching computer
science students. Students who only use X stack technologies for web application
projects develop practical skills in SVG, XQuery and XSLT. Our austerity requirement proves to be an effective pedagogical tool for
raising the level of conceptual knowledge, appreciation and practical competence
in the
area of XML technologies. This is the not-so-hidden agenda in the lab. We consider
this
outcome more valuable than instructing students in another short-lived web application
framework.
[B15] Jim Barnett (Editor-in-Chief). State
Chart XML (SCXML): State Machine Notation for Control Abstraction. W3C
Recommendation 1 September 2015. [online]. [cited 11 April 2016].
http://www.w3.org/TR/2015/REC-scxml-20150901/.
[BD09] Bernd Brügge; Allen Dutoit.
Object-Oriented Software Engineering Using UML, Patterns, and Java.
Prentice Hall, 2009.
[BRHS12] Anne Brüggemann-Klein; Jose Tomas
Robles Hahn; Marouane Sayih. Leveraging XML Technology for Web
Applications. In Proceedings of Balisage: The Markup
Conference 2012. Balisage Series on Markup Technologies, vol. 8
(2012). [online]. [cited 22 July 2017]. doi:https://doi.org/10.4242/BalisageVol8.Bruggemann-Klein01. Updated version available on request from
brueggemann-klein@tum.de.
[BSW00] Jan Bosch; Clemens
Szyperski; Wolfgang Weck. Component-Oriented Programming. In European Conference on Software and Data Technologies.
Springer, 2000.
[C17] Michael Conrads.
Multi-Client Web Applications with XML Technologies. Master Thesis
Technical University of Munich, 2017.
[E04] Eric Evans. Domain-Driven Design:
Tackling Complexity in the Heart of Software. Addison-Wesley,
2004.
[E13] Jens Erat. Fine Granular
Locking in XML Databases. Bachelor Thesis University of Konstanz,
2013.
[E17] Andreas Eichner. SCXML in
Web-Based Applications. Master Thesis Technical University of Munich,
2017.
[F00] Roy Thomas Fielding.
Architectural Styles and the Design of Network-based Software
Architectures. PhD Thesis University of California, Irvine
2000.
[F02] Martin Fowler. Patterns of
Enterprise Application Architecture. Addison-Wesley, 2002.
[F18] Johannes Finckh.
Erweiterung der Client-Kommunikation in BaseX um die Funktionalität von
WebSockets. Master Thesis University of Konstanz, 2018.
[SSHK15] Martina Seidl; Marion Scholz;
Christian Huemer; Gerti Kappel. UML@Classroom. An Introduction to Object-Oriented
Modeling. Springer-Verlag, 2015.
[U18] Philipp Ulrich.
Model-Driven Development of Multi-Client Web Applications with XML
Technology. Bachelor Thesis Technical University of Munich,
2018.
Jonathan Robie et al. (Editors).
XQuery Update Facility 1.0. W3C Recommendation 17 March 2011.
[online]. [cited 22 July 2017].
https://www.w3.org/TR/xquery-update-10/.