How to cite this paper
Durand, Jacques, Stephen Green, Serm Kulvatunyou and Tom Rutt. “Test Assertions on steroids for XML artifacts.” Presented at Balisage: The Markup Conference 2009, Montréal, Canada, August 11 - 14, 2009. In Proceedings of Balisage: The Markup Conference 2009. Balisage Series on Markup Technologies, vol. 3 (2009). https://doi.org/10.4242/BalisageVol3.Durand01.
Balisage: The Markup Conference 2009
August 11 - 14, 2009
Balisage Paper: Test Assertions on steroids for XML artifacts
Jacques Durand
Senior Architect, R&D Director
Fujitsu America, Inc.
Jacques Durand is a software architect at Fujitsu America, Inc. with a long-time involvement
in XML standards organizations. He is a member of the OASIS Technical Advisory Board and a contributor
to XML user consortiums such as RosettaNet and OAGI. He has extensive experience in XML-related
testing, and chairs the OASIS Test Assertions Guidelines committee as well as the Testing
and Monitoring of Internet Exchanges (TaMIE) committee. He has been leading testing
activities for years in the WS-Interoperability consortium and in the ebXML technical
committee. He earned a Ph.D. in rule-based systems and logic programming from Nancy
University, France.
Stephen Green
Associate Director
Document Engineering Services
Stephen Green is an Associate Director of Document Engineering Services, an international
consortium of experts supporting universal business interoperability through the use
of open standards. His expertise is in finance, business documents and software development
for business and financial applications. He has specialized in legacy systems and
modern electronic business trends and their impact on small and medium-sized enterprises.
Stephen has been active in the Organization for the Advancement of Structured Information
Standards (OASIS) for seven years, serving on as many technical committees.
He is currently editing the Test Assertions Guidelines of the OASIS technical committee
of that name. He previously led the first efforts to provide a small business subset
conformance profile for the OASIS Universal Business Language, version 1.0.
Serm Kulvatunyou
Standard and Product Architect
Oracle
Serm Kulvatunyou is currently a Standard and Product Architect in Oracle's Application
Integration Architecture (AIA) division. Formerly, he was a guest researcher at
the Manufacturing Systems Integration Division, National Institute of Standards and
Technology (NIST), from the Oak Ridge National Laboratory. At NIST, he designed
and implemented semantics testing and frameworks for document model design and
instance validation in the context of an e-business testbed using XML and related
technologies. He has been an active participant in several standards bodies such
as UN/CEFACT and OASIS. His current interests are in architecture and best-practice
methodologies for enterprise data models supporting reusable and interoperable Service-Oriented
Architecture. He received his Ph.D. in Industrial Engineering from the Pennsylvania
State University, University Park, in 2001.
Tom Rutt
Standards Manager
Fujitsu America, Inc.
Tom Rutt is Standards Manager at Fujitsu America, Inc., with a long-time involvement
in XML standards organizations, and participates in several Web Services standards committees.
He has extensive experience in XML-related testing, and has been involved in the WS-Interoperability
consortium for years, more recently designing and developing testing tools for profile
conformance. He is also a member of the OMG Architecture Board.
Copyright © Fujitsu America, Inc., Document Engineering Services, Oracle: Used by
permission.
Abstract
Testing of XML material – either XML-native business documents, or XML-formatted inputs
from various sources – involves more than syntactic or semantic validation of a document.
It often requires checking consistency with other documents, and verifying assumptions
about the quality of these. Consequently, as for any complex system, the design
and execution of test units have to be composed and ordered. This in turn requires
a testing method and tool with more flexibility - in test expression and test usage
- than is provided by validation tools such as OWL reasoners or Schematron. A test method
is presented that relies on a general test assertion model from OASIS. This test model
(and its XML markup) is extended with XPath in order to make test assertions directly
executable after XSLT translation, by a forward-chaining engine itself written in
XSLT. Test assertions may refer to other test assertions either for chaining or for
composing test results. The resulting test model and processing are contrasted with
other approaches (XBRL test suite, OWL reasoners, Schematron). Results and lessons learned
from a real test suite are presented, as well as a proposed implementation model based
on generating the XSLT engine specific to a test suite, rather than using a generic
engine. Observations are made about features in the latest versions of underlying
technologies (XPath2.0, XSLT2.0) that were critical to this implementation.
Table of Contents
- Introduction
- XML Test Assertions for XML
  - Test Assertion Model
  - The Test Target and its Context
  - Reporting Test Outcomes
  - Inheritance and Composition
  - Chained Test Assertions
  - Execution Semantics
  - Implementation Considerations
- Other Works in Semantic Validation of Documents
  - Schematron
  - OWL
  - XBRL
- Conclusion
Introduction
Testing of XML material – either XML-native business documents, or various XML-formatted
inputs – often involves more than single document validation. The notion of validation
may depend on a context involving other documents. Validation against a schema is
just one example of this. Behind XML document testing, it is often a processor generating
or editing the document that is actually being tested. This is the case with XBRL test suites
(for XBRL processors), with WS-I test suites (testing Web service instances) and also
the ODF test suite (targeting document processors). As a consequence, the notion of
validity depends on a context made of diverse other documents that represent various
inputs to these processors, as well as traces that capture their behavior and relate
all documents associated with the same test case. Roughly, two categories of documents
can make up such context: (a) metadata documents, and (b) scenario documents.
-
Metadata documents may involve various business rules, reference documents and templates,
contractual documents, and configuration artifacts. XML schemas are just one example of
these, and they may be involved in quite diverse validation patterns [1] beyond conventional
schema-validation of instances.
-
Scenario documents reflect operations over a system under test - such as an XML
transcript of an electronic exchange with the system under test, a log converted into
XML, or the script of a sequence of operations to be performed by a test driver.
In such cases, testing is as much about verifying that each one of these XML artifacts
is individually correct, as it is about verifying that some combinations of these
are consistent (e.g. a Web service message must conform to its definition in WSDL,
or the output of an ODF processor is consistent with the operation performed and the
previous state of the document). In some cases it is not even a main document that
is under test relative to some context, but rather a sequence of documents, e.g. a
message choreography for a business transaction combining business payloads, message
protocols and service interfaces that is tested for conformance [2]. The dependency
between a document and its transactional context (exchange protocol) has also been
analyzed in [3].
Such testing requirements are in fact closer to conventional system or software testing
requirements than to document testing in a narrow sense - while also requiring the same
XML testing capabilities as known today for single documents. Because each type of
artifact may have its own validation rules and test suites, tests must be grouped
into modules, the execution of which is conditioned by the results of other test modules.
Chaining of test cases becomes an important feature, across modules or within modules.
The diversity of these test requirements poses a challenge to a test environment:
-
Rule and constraint languages (Schematron, OWL reasoning, RuleML) are often limited in
one of two ways: (a) their expressive power is often traded for ease of processing;
(b) their decision model (e.g. predicate logic) often enforces a Boolean outcome, missing
the nuances expected in a test report.
-
Test suites - and test engines – often exceed the scope of dedicated tools such as
Schematron (e.g. XBRL test suite). As a result they are architected and developed
in an ad-hoc manner, regardless of how well they leverage XML technologies.
This paper describes a more integrated XML testing paradigm that supports flexible
composition of test cases (chaining, parameterization) and test suites (modules, reuse).
The resulting implementation makes the best of XPath2.0 and XSLT2.0 to provide, on
the one hand, a test script model able to express predicates crossing over such diverse
inputs and to handle a richer spectrum of outcomes, and, on the other hand, a test engine
able to compose and chain test assertions in a way that was usually considered to require
specialized rule engines written in conventional programming languages or dedicated
AI languages such as Prolog or LISP.
XML Test Assertions for XML
In this section the authors argue that the best approach for testing XML material
– given the integrated aspect of such testing - is one that builds on conventional
test methodologies, augmented with a proper integration of XML-processing techniques
(here XPath2.0, XSLT2.0).
Test Assertion Model
Test assertions (TAs) are a familiar concept for QA engineers and test developers.
A test assertion is a testable or measurable statement for evaluating the adherence
of part of an implementation to a normative statement in a specification. Test assertions
provide a link between the narrative of a specification (i.e. rules, schema, requirements,
system definition) and the test suites that assess conformance of implementations.
Test assertions have been mostly used in the domain of software engineering, and less
often in more specialized domains such as XML artifacts, where ad-hoc solutions -
and also very specialized tools - have flourished instead. Test assertions are usually
declarative (logical) statements that are written as a blueprint for test cases, the
latter being the actual executable tests.
A major benefit of writing test assertions is that they represent a "conformance
contract" understandable by all parties involved - domain experts, test writers, end-users.
An additional interest in the XML space - where all material under test is in XML
format - is that test assertions can be directly scripted using such dialects as XPath
or XQuery, thus becoming themselves executable test cases. A general-purpose model
for test assertions has recently been developed by the OASIS Test Assertions Guidelines
(TAG) committee [8]. In this model, a test assertion (TA) is a well-structured
object defined as follows:
-
TA Id: the identifier of the Test Assertion.
-
Source: the normative conformance requirement that this test assertion is addressing.
-
Target: a Test Assertion always targets instances of a specific artifact type, for example,
a line item fragment in a purchase order document, a SOAP Envelope, a WSDL port binding,
etc. The Target element identifies this artifact type.
-
Prerequisite: a pre-condition that must be satisfied over the Target instance in order for this
instance to qualify for evaluation under this TA. The Prerequisite may refer to other
test assertions. If the Prerequisite evaluates to “false”, then the outcome of the
TA for this Target will be “notQualified” in the test report.
-
Predicate: a logical expression over the Target. The Predicate is only evaluated if the Target
instance is qualified, i.e. if the Prerequisite – if any – has already evaluated to
“true”. If the Predicate result is “true”, then the Target instance fulfills the related
conformance requirement; otherwise it violates it.
-
Prescription Level: a keyword reflecting how imperative it is to fulfill the (Source) requirement: mandatory
/ preferred / permitted.
The authors have profiled and extended this model so that test assertions become
directly executable over XML artifacts, thus becoming "test cases" grouped in test
suites.
The profiling consists of the following:
-
Use XPath expressions to define Target, Prerequisite and Predicate.
-
Define how the instances of a particular target type are identified. This is defined
by another XPath expression that returns a unique ID, possibly resulting from aggregation
of several fields relevant to this target type. This ID will show in the test report,
but is also used when chaining test assertions over the same target instance during test
execution.
-
Add references (XPath) to a combination of artifacts that represent contextual documents,
over which Prerequisite and Predicate may operate.
-
Add a new Reporting element that determines the outcome of the test assertion over
a target instance.
-
Add secondary output mostly for human readers: error messages, diagnostic data.
The Test Target and its Context
A test assertion will always focus on a “primary” target instance, but may need to
access contextual material in order to test this target. This "side" material is identified
in "variables" added to the TA. A simple example of this is schema-validation of a
document. In the target scripting below, the test assertion will refer to the contextual
document (a schema) while targeting a purchase order line item:
<testAssertion id="1234" lg=”xpath20” >
<var name=”poschema” type="string">http://www.mysupplychain_xyz.com/2009/04/12/po.xsd</var>
<target>//xyz:purchaseOrder/xyz:lineItem</target>
<predicate>$target instance of schema-element($poschema, xyz:lineItem) </predicate>
...
</testAssertion>
The predicate validating a lineItem element will refer to this contextual document
using the conventional variable notation ($). The predicate expressions will be pre-processed
into executable XPath.
The above test assertion applies to every line item of any purchase order.
Variable expressions (<var>) may refer to any contextual material - either inside
the same document or external. An XPath variable notation ($) may then be used either
to parameterize the location of a document, or to refer to the current value of the
target expression:
<testAssertion id="2345" lg=”xpath20”>
<var name=”herbooks” >document($allbooks)/book[@author = $target/name]</var>
<var name=”herpublishers” >document($allpublishers)
//directory/publisher[fn:index-of( fn:distinct-values(
‘for $bk in $herbooks return $bk/@publisher’), @name) gt 0 ]</var>
<target>//whoswho/arts[@section='literature']/biographies/author</target>
...
</testAssertion>
In the above, "$allbooks" and "$allpublishers" are references to a documents that
have been defined outside the test assertion. The variable "$herbooks" denotes the
subset of books from this author. The variable "$herpublishers" is the subset of publishers
this author has been dealing with. The target expression is matched against a third
document, the main input.
A predicate for the above target may express a condition over the author (the target),
her related list of books ($herbooks) and of publishers ($herpublishers).
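For illustration only, such a predicate might require that the author has at least one book on
record and that every publisher she has dealt with is still listed as active in the directory
(the @active attribute is hypothetical and not part of the example documents above):
<predicate>count($herbooks) gt 0
and (every $pub in $herpublishers satisfies $pub/@active eq 'true')</predicate>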
Reporting Test Outcomes
The additional Reporting element added to the test assertion structure may override
the default outcome, which is one of:
-
notQualified (if Prerequisite = “false”)
-
pass (if Prerequisite = “true” and Predicate = “true”)
-
fail (if Prerequisite = “true” and Predicate = “false”)
Other possible outcomes are:
-
missingInput (a contextual document or XML fragment is missing in order to pursue the evaluation)
-
warning (the Predicate may not be indicative enough of either violation or fulfillment, but
has detected a situation calling for further attention.)
-
undetermined (e.g. the Predicate is only designed to detect some kinds of violation, when “false”,
and has no particular conformance meaning when “true”)
These outcomes are not only intended for a final test report. They can also be tested
and influence the test suite execution if the test assertion that produces them is
referred to in predicates and prerequisites of subsequent test assertions.
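As an illustration, a test assertion might downgrade a failed Predicate to a warning. The markup
of the Reporting element is not shown in this paper, so the element and attribute names below
(report, when, outcome) are hypothetical, as is the @deliveryDate attribute:
<testAssertion id="3456" lg="xpath20">
<target>//xyz:purchaseOrder/xyz:lineItem</target>
<predicate>xs:date($target/@deliveryDate) ge current-date()</predicate>
<report when="predicateFailed" outcome="warning">Delivery date is in the past;
check whether this line item is a back-dated order.</report>
</testAssertion>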
Inheritance and Composition
An important benefit of clearly identifying target categories is the ability to
leverage inheritance and composition relationships between targets. Targets often
belong to a class system, in the object-oriented sense (a target may be part of another
target, may be a subclass of another target, etc.). This leads to an enhanced
test execution model that is able to leverage such relationships. In particular:
-
Inheritance: The test engine is able to determine that a test assertion will apply not only to
all instances of its Target class, but also to all instances of its Target sub-classes.
In other words, a target inherits the test assertions of its super-classes.
-
Composition: The test engine is able to handle cases where the prerequisite expression of TA
t1 includes references to TA t2, and yet t1 and t2 do not have the same Target class
- not even in a sub-class relationship - but have a component relationship. For example,
t1's target - say a "binding definition" - is part of t2's target - say a WSDL file.
Before verifying that the binding satisfies some rules (TA t1) one may want to verify
that the embedding WSDL file is schema-valid (TA t2). In such cases, the component
relationship can be defined once as an access expression from t1 to t2 on the ancestor
axis (XPath), reusable by any TA.
The authors are in favor of supporting two modes of representation for such a model:
(1) "inline" relationship information can be embedded in each test assertion as needed.
(2) a different mark-up separate from test assertions will hold target model information.
While (2) is a more rational and scalable approach (avoid redundant information from
on test assertion to the other, etc.), (1) is a convenient approach well suited for
the test assertion development phase.
Examples of inline model information are shown below. The target element uses a qualification
notation to indicate the super-class (message) of the SOAPmessage target class.
<target type="message:SOAPmessage" > ... </target>
The composition link between the main target of a test assertion - here a WSDL binding
- and the related embedding target of a prerequisite reference - here the WSDL file
itself - is indicated using an XPath expression relative to the selected target node
($target) as "argument" of the test assertion reference (tag:BP2703):
<testAssertion id="BP2403">
<target type="binding" idscheme="..." >//wsdl:definitions/wsdl:binding</target>
<prerequisite> tag:BP2703($target/..) = 'pass' </prerequisite>
...
</testAssertion>
Another major attribute of a Target class is the ID scheme, itself an XPath expression
that will return a unique ID string for each target instance, to appear in the test
report.
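For illustration only (the actual values are elided in the examples above), an ID scheme for a
WSDL binding target might concatenate the name of the enclosing definitions element with the
binding name:
<!-- hypothetical idscheme value, not taken from an actual test suite -->
<target type="binding"
        idscheme="concat(ancestor::wsdl:definitions/@name, '#', @name)"
>//wsdl:definitions/wsdl:binding</target>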
Chained Test Assertions
A powerful aspect of the otherwise simple TAG model is that a test assertion (TA)
may refer to other test assertions. It may do so in two ways:
-
by using TA references in the Prerequisite element. Such references are just parts
of the logical expression, e.g. (in a simplified notation):
Prerequisite of TA3: (TA1 = “pass”) and (TA2 = “pass”)
means that it is expected that the target passed TA1 and TA2 before even being
tested for TA3. TA-referencing in a Prerequisite is commonly used when the test expression
(Predicate) in a TA can be greatly simplified by assuming that the target already
passed other test assertions, or simply when the test itself is irrelevant in case
some other (prerequisite) tests have failed.
-
by using TA references in the Predicate element. This allows for writing meta-level
test assertions that evaluate a composition of the results of other TAs. This is often
needed when defining various "conformance profiles" related to the same type of document
(e.g. a category of insurance claims, a purchase order of class 'urgent'). For example:
“to comply with conformance profile P, a document must “pass” the set of test assertions
{TA1, TA2, TA3} and must at least NOT “fail” the set of test assertions {TA4, TA5}.”
In such a case, a single TA will summarize the composition test to be made over the
results of all TAs involved in assessing the conformance profile P. The predicate
will be:
Predicate of TA6: (TA1 = “pass”) and (TA2 = “pass”) and (TA3 = “pass”)
and not(TA4 = “fail”) and not(TA5 = “fail”)
This summary TA (TA6) may in turn be referred to from the Prerequisite of another
TA. This is an essential feature when dealing with contextual documents: in most cases,
one must first ensure that the contextual document is itself “conforming” before using
it in a test case over the “main” document. These expressions are pre-processed by
the test engine into equivalent XPath boolean expressions.
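For illustration, TA6 could be encoded with the same tag: function-call convention used in the
WS-I examples later in this paper (the target expression and namespace prefix are hypothetical):
<testAssertion id="TA6" lg="xpath20">
<target>//claims:insuranceClaim</target>
<predicate>(tag:TA1($target) eq 'pass') and (tag:TA2($target) eq 'pass')
and (tag:TA3($target) eq 'pass') and (tag:TA4($target) ne 'fail')
and (tag:TA5($target) ne 'fail')</predicate>
...
</testAssertion>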
There are two main reasons for chaining test assertions as described in (a) above:
(a1) When a set of tests should logically be done in a particular order, meaning that
every single test should only be executed if the target instance passed the previous
tests. For example, the following sequence of tests is expected to be done in this
order, regarding a Web service definition:
-
Normative statement: "The wsdl:definitions MUST be a well-formed XML 1.0 document.
The wsdl:definitions namespace MUST have value: http://schemas.xmlsoap.org/wsdl/."
This is verified by the test assertion BP2703 below.
<testAssertion id="BP2703" lg=”xpath20” >
<target>//wsil:descriptionFile[fn:prefix-from-QName
(fn:node-name(*:definitions)) eq 'wsdl' or wsdl:definitions]</target>
<predicate>$target instance of schema-element($wsdlschema) </predicate>
...
</testAssertion>
-
Normative statement: "The wsdl:binding element MUST have a wsoap12:binding child element."
This is verified by the test assertion BP2402 below.
<testAssertion id="BP2402" lg=”xpath20” >
<target>//wsil:descriptionFile/*:definitions/wsdl:binding</target>
<prerequisite>tag:BP2703($target/../..) eq 'pass'</prerequisite>
<predicate>child::wsoap12:binding</predicate>
...
</testAssertion>
-
Normative statement: "The contained soap binding element MUST have a 'transport' attribute."
This is verified by the test assertion BP2403 below.
<testAssertion id="BP2403" lg=”xpath20” >
<target>//wsil:descriptionFile/*:definitions/wsdl:binding</target>
<prerequisite>tag:BP2402($target) eq 'pass'</prerequisite>
<predicate>not(wsoap12:binding[not(@transport)])</predicate>
...
</testAssertion>
-
Normative statement: "The 'transport' attribute - if any - of the soap binding element
MUST have value:
http://schemas.xmlsoap.org/soap/http." This is verified by the test assertion
BP2404 below.
<testAssertion id="BP2404" lg=”xpath20” >
<target>//wsil:descriptionFile/*:definitions/wsdl:binding[wsoap12:binding]
</target>
<prerequisite>tag:BP2403($target) eq 'pass'</prerequisite>
<predicate>not(wsoap12:binding[@transport ne
'http://schemas.xmlsoap.org/soap/http'])</predicate>
...
</testAssertion>
These four test assertions are chained via their prerequisite elements. This chaining
means that if one fails, the subsequent tests will not be performed: whatever their
outcome is, it might not make any sense or would at best produce an unnecessary distraction
in the test report.
(a2) In order to "reuse" (both at scripting time and at run-time) a complex expression
outcome that has already been handled by another TA. For example in the Web services
basic profile, a wsdl:binding must either be an rpc-literal binding or a document-literal
binding. The test for ensuring this is not a simple one:
<testAssertion id="BP2017" lg=”xpath20” >
<target>//wsil:descriptionFile/wsdl:definitions/wsdl:binding
[wsoap12:binding]</target>
<prerequisite>tag:BP2404($target) eq 'pass'</prerequisite>
<predicate>not(.//wsoap12:body/@use != 'literal')
and (count(.//wsoap12:body) = count(.//wsoap12:body/@use)) and
((not(.//wsoap12:*/@style != 'rpc') and
not(.//wsoap12:operation[not(@style) and not(../../wsoap12:binding/@style)]))
or (not(.//wsoap12:*/@style != 'document')))</predicate>
...
</testAssertion>
Several test assertions relate only to the document-literal type. Once it is known
that a binding is of either of the above types, the test to distinguish an rpc-literal
binding from a document-literal one is fairly simple. In the following test assertion,
which applies only to the document-literal type, this simple test - used here in the target
expression - guarantees that only document-literal bindings will be selected:
<testAssertion id="BP2111" lg=”xpath20” >
<target>//wsil:descriptionFile/*:definitions/wsdl:binding
[not(.//wsoap12:*[@style = 'rpc'])]</target>
<prerequisite>tag:BP2017($target) eq 'pass'</prerequisite>
<predicate>not(.//wsoap12:body[@parts and contains(@parts," ")])</predicate>
...
</testAssertion>
The other form of chaining (b) is done in the Predicate expression. This allows for
defining "meta-level" test assertions that wrap entire groups of test assertions by
summarizing their expected outcomes in a single logical expression. Such a meta-level
test assertion can then be referred to by other test assertions in their prerequisite
condition, when this entire group of tests must be passed.
For example, consider BP1214 listed in the next section. This test assertion targets
a SOAP message, but needs to access a contextual document: the interface binding definition
that governs the content of this message. Before executing BP1214, it is clear that
the binding definition must be verified. A meta-level test assertion can "summarize"
all the tests that ensure this correctness for rpc-literal bindings:
<testAssertion id="BP-rpc-bindings" lg=”xpath20” >
<target>//wsil:descriptionFile/*:definitions/wsdl:binding
[.//wsoap12:*[@style = 'rpc']]</target>
<prerequisite>tag:BP2017($target) eq 'pass'</prerequisite>
<predicate>(tag:BP2404($target) eq 'pass') and
(tag:BP2406($target) eq 'pass') and (tag:BP2020($target) eq 'pass') and
(tag:BP2120b($target) eq 'pass') and (tag:BP2117($target) eq 'pass') and
(tag:BP2118($target) eq 'pass')
</predicate>
...
</testAssertion>
The above test assertion may then be used as prerequisite for BP1214, over the binding
definition related to its message target, i.e. the "binding" element parent of the
"operation" element selected by the XPath expression in the variable $myOpBinding:
<testAssertion id="BP1214">
<var name="myOpBinding"> ...</var>
<prerequisite>tag:BP-rpc-bindings($myOpBinding/..) eq 'pass'</prerequisite>
<target type="message:SOAPmessage" >/wsil:testLog/wsil:messageLog/wsil:message[...]
</target>
...
</testAssertion>
Execution Semantics
The general mode of execution for the above TAs is that of a conventional forward-chaining
rule engine. This is indeed necessary due to the chaining of test assertions, and
departs radically from how other XPath-based rule systems are processed (e.g. Schematron
or CAM do not handle chaining). The set of test assertions that have no references
to other TAs is executed first over all candidate targets. This first set of test
results for all possible target instances is recorded. In the next iteration of the
engine, only TAs that have all their references resolved over the first set of results
are executed in turn. Their results are added to the initial result set. Subsequent
iterations add to the previous result set until an iteration can no longer augment
the result set, which is then considered stable. At this point, all possible validations
have been made over the material under test, and the results are ready for (HTML) test
report generation.
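A minimal sketch of such a fixpoint loop, written in XSLT2.0, is shown below. This is not the
authors' engine: the two tag:* functions are stub placeholders for reference extraction and
per-TA evaluation (in the implementation described in the next section, Phase 1 generates the
per-TA evaluation code, so no generic evaluator is needed), and the tag namespace URI is invented.
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:tag="urn:example:tag">
  <!-- stub: collect the IDs of TAs referenced through the tag:XXXX(...) convention -->
  <xsl:function name="tag:references" as="xs:string*">
    <xsl:param name="ta" as="element(testAssertion)"/>
    <xsl:sequence select="
      for $call in tokenize(string-join(($ta/prerequisite, $ta/predicate), ' '),
                            'tag:')[position() gt 1]
      return substring-before($call, '(')"/>
  </xsl:function>
  <!-- stub: stands for the per-TA evaluation code that Phase 1 generates -->
  <xsl:function name="tag:evaluate" as="element(result)">
    <xsl:param name="ta" as="element(testAssertion)"/>
    <xsl:param name="results" as="element(result)*"/>
    <result ta="{$ta/@id}" outcome="undetermined"/>
  </xsl:function>
  <!-- one engine iteration: evaluate every TA whose referenced TAs already have
       a recorded result, then recurse until the result set is stable -->
  <xsl:template name="tag:iterate">
    <xsl:param name="pending" as="element(testAssertion)*"/>
    <xsl:param name="results" as="element(result)*"/>
    <xsl:variable name="ready" select="
      $pending[every $ref in tag:references(.) satisfies $ref = $results/@ta]"/>
    <xsl:choose>
      <!-- fixpoint reached: no further TA can be evaluated -->
      <xsl:when test="empty($ready)">
        <xsl:sequence select="$results"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="tag:iterate">
          <xsl:with-param name="pending" select="$pending except $ready"/>
          <xsl:with-param name="results" select="
            $results, for $ta in $ready return tag:evaluate($ta, $results)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>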
In the XPath-extended XML markup of the TAG model, test assertions can be conditionally
chained as rules to create dynamic test suites, the result of which can also be manipulated
by higher-level test assertions. This approach addresses the need to integrate
validation of various XML artifacts with validation of combinations of such artifacts
(consistency across documents), which indeed requires composing and orchestrating
test cases and test suites in a modular way.
Implementation Considerations
The automation of the TAG methodology leverages both XSLT2.0 and XPath2.0. However,
end-users (TA designers) only need to know XPath to write test assertions. XSLT
is only used for the execution engine.
A two-phase processing of the above test assertions has been implemented by the authors,
as is often done when XSLT is the target execution language.
Phase 1:
-
input = set of test assertions (type: xml + XPath2.0)
-
processor = TAG engine generator (type: XSLT2.0)
-
output = test assertions engine for this set (type: XSLT2.0)
Phase 2:
-
input = documents under test and context (xml)
-
processor = test assertions engine in output of Phase 1 (type: XSLT2.0)
-
output = final test report (type: xml)
The first phase amounts to generating an XSLT-hardcoded test suite for a particular
set of test assertions. The second phase amounts to executing this test suite. In
addition to increased performance, the two-phase approach allows for advanced parameterization
features in test assertions at different levels, with the use of variables:
-
“Generation" (or "Phase 1") variables: these are given a value during Phase 1. Such
value assignments are hardcoded in the output of Phase 1. Examples are those identifying
contextual documents in a previous example (e.g. $allbooks, $allpublishers).
-
"Run-time" (or “Phase 2”) variables: these may have a different value at each execution
of the test assertion. Such variables are used to break down complex expressions for
Target, Prerequisite and Predicate, or to point at contextual documents that may vary
from one target to the other. In the example below, the Target is a SOAP message,
and the variable "myOpBinding" identifies the definition of the Web service operation
(here the WSDL file is in the same log as the message trace) associated with this
target instance (referred to using the pseudo variable "$target"):
<testAssertion id="BP1214">
<var name="myOpBinding">//wsil:descriptionFile/wsdl:definitions/wsdl:binding
[.//wsoap12:*[@style = 'rpc']]/wsdl:operation
[@name = fn:local-name-from-QName(node-name($target/soap12:Body/*[1]))]</var>
<target type="message:SOAPmessage" >/wsil:testLog/wsil:messageLog/wsil:message[...]
</target>
...
</testAssertion>
This test assertion verifies that the message conforms to some aspect of its
binding definition.
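To make the two-phase scheme concrete, a fragment of what Phase 1 might generate is sketched
below. This is hypothetical (the paper does not show the generated code): each test assertion
could be compiled into an XSLT2.0 function returning its outcome, so that a reference such as
tag:BP2402($target) in another TA resolves to an ordinary function call. The tag and wsoap12
namespace URIs are assumptions, and the generated tag:BP2703 function is omitted.
<xsl:function name="tag:BP2402" as="xs:string"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:tag="urn:example:tag"
    xmlns:wsoap12="http://schemas.xmlsoap.org/wsdl/soap12/">
  <xsl:param name="target" as="element()"/>
  <xsl:sequence select="
    if (tag:BP2703($target/../..) ne 'pass') then 'notQualified'  (: prerequisite :)
    else if ($target/wsoap12:binding)        then 'pass'          (: predicate    :)
    else 'fail'"/>
</xsl:function>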
Two of the authors have developed test suites for Web Services profiles developed
by the WS-Interoperability consortium (http://www.ws-i.org) using the XPath2.0-extended
TAG model. These test suites include test cases that involve a combination of documents
(WSDL, Schemas) and sequences of messages. About 250 test assertions were developed
for three WS profiles. The entire test suite execution process (Phase 1 + Phase 2
+ HTML rendering of the test report) is handled by stylesheets.
Prior to this, one author had led a similar test tool development process
for WS-I based on conventional programming languages (Java, C#) [4]. The advantages
of the recent approach using XPath and XSLT are:
-
Reliance on specialized XML dialects that have been developed over various platforms
and tested for consistency across these platforms removes the platform dependency
of test tools (e.g. .NET, Java).
-
Visibility of the TA logic (test assertion definitions are currently embedded in the
WS-I Profile document and readable by end-users and developers who need to comply
with these profiles). In the previous approach, the logic of tests was buried in the
binaries of the test tools.
-
A modest but real gain in test suite design and overall development effort and the
related QA cycles.
Originally, the authors attempted to use XPath1.0 as the expression language for TAs.
This was not sufficient to handle the complex correlations of XML fragments (either intra-document
or cross-document) required by WS-I test suites. XPath2.0 provided new features that
significantly enhanced the expressive power of target / prerequisite / predicate expressions,
such as quantified expressions, iterations and an extensive library of functions.
Advanced correlation patterns inside Predicate could be expressed in a declarative
way as a set of nested quantified expressions, making it possible to assign and reuse
variables at each level.
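For illustration only (the element and attribute names, as well as the $catalog variable, are
hypothetical), such a predicate might check every line item of a purchase order against a
catalog document, reusing the outer variable at the inner level:
<predicate>every $li in $target/xyz:lineItem
satisfies some $prod in document($catalog)//xyz:product[@code eq $li/xyz:productCode]
satisfies xs:decimal($li/xyz:quantity) le xs:decimal($prod/@maxOrderQuantity)</predicate>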
XSLT2.0 proved a suitable scripting language for implementing a forward-chaining test assertion
engine, thanks to features such as xsl:next-match, whereas the authors had not been successful
when attempting this with XSLT1.0.
Along the lines of managing complexity by leveraging the composability of test assertions
and of their execution (chaining of test assertions, parameterization, meta-level
assertions), a future enhancement will allow a test assertion to define “byproducts”,
i.e. XML fragments produced by its execution (in addition to the main outcome
“pass”, “fail”, etc.) that can be reused in the Prerequisite or Predicate of referring
test assertions.
Other Works in Semantic Validation of Documents
Schematron
Schematron is a simple pattern and rule language well-focused on document
testing. It leverages XPath functions and expressions, and can be implemented using
XSLT. Rules in Schematron can be seen as serving a similar purpose to test assertions.
However the authors, after initially attempting to develop WS-I test suites with Schematron
1.5, had to give up, mostly due to its restricted rule execution semantics: there is
no support for "logical" rule chaining, and in a pattern only one rule - the first
that matches the context - will execute. This form of "if-then-else" chaining applies
to the context matching and not to the result of the rule itself, unlike what is expected
in conventional rule-based systems.
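For comparison, the core check of BP2402 could be written as an ISO Schematron rule roughly as
sketched below (illustrative only; the wsoap12 namespace URI is an assumption). What cannot be
expressed there is the prerequisite that BP2703 must already have passed for the enclosing WSDL
file:
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="wsdl" uri="http://schemas.xmlsoap.org/wsdl/"/>
  <sch:ns prefix="wsoap12" uri="http://schemas.xmlsoap.org/wsdl/soap12/"/>
  <sch:pattern>
    <sch:rule context="wsdl:binding">
      <!-- same test as BP2402's predicate, but no chaining on BP2703's outcome -->
      <sch:assert test="wsoap12:binding">
        A wsdl:binding must have a wsoap12:binding child element.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>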
Schematron has been designed around the idea that the entity under test is
the document, while in our test assertion engine it is above all an XML fragment,
a subset of some document. Each fragment (target) is systematically identified according
to a well-defined scheme for its target type. This identity - defined by an XPath
expression - is not only used for detailed diagnostic information in the test report,
but is central to the rule chaining mechanism of the test engine, i.e. for deciding
the order of the tests on this target. Schematron allows for detailed and dynamic
diagnostic information that has the ability to fully identify a subset of a document,
but this does not play any role in the rule processing mechanism. A valuable convenience
feature had to be added to test assertions in the form of variables, for handling
complex Predicate expressions or parameterizing test assertions. Such variables
are also supported in Schematron 1.6. On the modeling side, Schematron introduces
a hierarchy of constructs (assertion, rule, pattern, phase). In contrast, the presented
approach is based on a flat model relying on a single construct - test assertion -
composable at different levels, but subject to the same execution semantics.
Although Schematron can be written against any XML document, it is primarily
intended for XML instances. Namespace handling becomes difficult when writing rules
against an XML schema, and when there is a namespace prefix in a value of an attribute
or element. Schemas often need to be tested against naming and design rules [5]. Rules
are label specific, i.e., there is no inheritance. If there is a type hierarchy, separate
rules have to be written for every type even if the intention is the same. In Schematron
1.5, rules cannot be reused or combined programmatically, although ISO Schematron has
more provisions for reusability, such as abstract patterns and the include statement.
In conclusion, although Schematron is well-positioned for document validation
through its lifecycle and is sufficient for many test cases, it has been designed
more in the spirit of validating content in a type-checking mode (an extended schema).
Its rules are intended for detecting patterns in documents, not for being executed
according to a test suite processing model requiring tight control over which tests are
executed, and when, based on previous test results. We believe the concept of a test suite
is appropriate when considering a combination of diverse XML artifacts, including XML-formatted
data of non-XML origin.
OWL
OWL by itself is not a rule or test language. It rather allows for declarative semantic
models. The handling and mapping of ontologies is an important aspect of validation
[6]. An OWL reasoner can determine if there is a conflict between the characteristics
of a particular object instance - in our context, a test target - and a semantic
definition of class membership. OWL's “open world assumption” makes it difficult to
use it to validate documents. The open world assumption states that even if an object
instance has not declared one of its properties, the reasoner cannot assume that it
does not exist. Typically, a programming routine needs to be written to “close the
world” by explicitly asserting that this object instance has no such property.
Because OWL is designed for acceptable computational time, its expressivity has been
limited. Even in OWL Full (the most expressive level of the OWL language), the semantic
expressivity is limited to constraints around cardinalities and a few relationships
between classes and properties such as transitivity, inverse, and uniqueness. Expressing
arithmetic relationships between properties is virtually impossible. There are also
few reasoners that can perform OWL Full reasoning and, more importantly, datatype reasoning
(particularly over user-defined datatypes, necessary for validating against a range of values
or a code list). OWL reasoners typically process data in triple representation, which
is memory-greedy. Validating a few megabytes of documents on a typical desktop with
one to two gigabytes of memory becomes almost impossible (an industrial-strength reasoning
engine and memory or storage management - such as the Oracle RDF database - would be needed).
XBRL
The XBRL conformance test suite is worth considering more as a typical use case
than as a reusable tool. It is an example of validating a (set of) complex document(s)
with advanced semantics that must comply with various rules in addition to schema
compliance. It also encompasses the testing of processors that are supposed to produce
such documents, illustrating how document-testing and processor-testing are intertwined.
The “minimal” conformance suite focuses on document validation, while “full” conformance
targets XBRL processors. Full conformance involves other documents than the main document
being processed, and relies on output documents (Post-Taxonomy Validation infosets)
that reflect the processing semantics. Minimal conformance generally contains at least
one test for each appearance of ‘MUST’ in the specification that is not already enforced
by XML Schema validation.
The structure of the test suite is based on the OASIS XSL Conformance Suite. The
structure of each individual test is simple:
Each test case is described by a "meta-level" XML file that refers to associated
test material. Other XML files describe the expected outputs for each test case.
The overall test engine is an ad-hoc stylesheet that runs the tests. The pass / fail
decision for each case is based on the comparison of canonical forms of actual output
and expected output.
A first assessment leads to the conclusion that each test case could be scripted as a
test assertion. When specialized operations are needed (like file canonicalization
using infoset.xsl, or file comparison), these could be wrapped as XSLT functions used
in the test assertion expressions. Assuming a two-phase testing process (running the XBRL
processor, then validating results), the test assertion engine described here could
handle the validation phase, which relies on XML documents.
Conclusion
A general-purpose test methodology based on a formal notion of test assertion (originally
not intended exclusively for XML input) has proved adequate for the testing of XML
artifacts where contextual material of various kinds needs to be taken into account. When
extended and implemented with XML dialects such as XPath2.0 and XSLT2.0, this method
has proved more powerful for such XML inputs than dedicated test tools. The resulting
test model does not introduce a hierarchy of constructs, but uses a flexible notion
of test assertion as the main construct for expressing atomic test results as well
as for chaining and composing test units.
Another benefit of the proposed approach is to keep XSLT “under the hood” and not
make it part of the definition language of test cases. There is also no need to develop
an XSLT test program specific to a test suite. This contrasts with ad-hoc test suites
such as XBRL’s. With a robust test assertion model, only XPath needs to be mastered by
test developers.
Future plans include standardization of the TAG mark-up and its XPath extension, along
with open-source-style availability of the XSLT-based engine technology that
supports it.
References
[1] Holman, K., Green, S., Bosak, J., McGrath, T., Schlegel, S.; Use of XPath to apply constraints to an XML Schema to produce a subset conformance profile; UBL 1.0 Small Business Subset; 2006. http://docs.oasis-open.org/ubl/cs-UBL-1.0-SBS-1.0/
[2] Durand, J., Kulvatunyou, S., Woo, J., and Martin, M.; Testing and Monitoring E-Business using the Event-driven Test Scripting Language; Proceedings of I-ESA (Interoperability of Enterprise Systems and Applications), April 2007.
[3] Glushko, R., and McGrath, T.; Analyzing and Designing Documents for Business Informatics and Web Services; MIT Press, March 2008.
[4] Durand, J.; "Will Your SOA Systems Work in the Real World?"; STAR-East, Software Testing Analysis and Review Conference, May 2007.
[5] Lubell, J., Kulvatunyou, B., Morris, K.C., Harvey, B.; A Tool Kit for Implementing XML Schema Naming and Design Rules; Extreme Markup Languages Conference, August 2006, Montreal, Canada.
[6] Anicic, N., Marjanovic, Z., Ivezic, N., Jones, A. W.; Semantic Enterprise Application Integration Standards; International Journal of Manufacturing Technology and Management (IJMTM), April 2006.
[7] Green, S., Holman, K.; The Universal Business Language and the Needs of Small Business; iTSC Synthesis Journal, 2004.
[8] Test Assertions Guidelines; OASIS TAG Technical Committee, February 2009. http://www.oasis-open.org/committees/document.php?document_id=31076