How to cite this paper

Galtman, Amanda. “Stretching XPath: Three Testing Tales: Beyond Primary Use Cases of Certain XML Functions and Standards.” Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August 2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). https://doi.org/10.4242/BalisageVol29.Galtman01.

Balisage: The Markup Conference 2024
July 29 - August 2, 2024

Balisage Paper: Stretching XPath: Three Testing Tales

Beyond Primary Use Cases of Certain XML Functions and Standards

Amanda Galtman

Amanda Galtman is an independent XML software developer and a maintainer of XSpec. She writes about XSpec at https://medium.com/@xspectacles. Previously, she was an XML software developer at MathWorks.

Copyright © 2024 by the author

Abstract

This paper describes three testing situations that can be addressed by markup-technology functionality whose primary uses are fairly different from those testing situations. One testing situation involves ensuring that certain kinds of test verifications don’t pass for the wrong reason. Another situation involves testing an XSLT template both cleanly and compactly. The third situation involves reducing maintenance of schema-valid and schema-invalid XML fragments when testing a schema. Software testing frameworks are sometimes sparse in their own functionality, but the XSpec and BaseX software discussed here can use rich functionality from the XML technologies on which they rest.

Table of Contents

Introduction
Tale 1: The Cupboard Was Bare
Example of (A), Tests Passing Despite Mistake
Example of (B), Mistake Hindering Failure Investigation
In Search of Prevention
XPath Syntax Options
XPath Syntax in the XSpec Examples
Tale 2: Good Functions Make Good Neighbors
Compact, Scalable Test with Neighboring Contexts
Preventing Interference Among Contexts
Tale 3: Schema and Variations
Creating the Variations
Using the Variations
Conclusion

Introduction

This paper tells three tales on a common theme: things motivated by one purpose are sometimes useful for a different purpose. When you have a software problem, you might suddenly realize that an obscure function you read about but have never used could be part of the solution. The other theme among the three tales is that they involve testing XML-related software. Two testing situations involve testing XSLT or XQuery using XSpec, while the third involves testing a schema using XSpec or BaseX. XSpec is an open-source software product for testing XSLT, XQuery, and Schematron code [XS]. BaseX is an open-source XQuery processor that includes functions and annotations for unit testing [B][UM].

If you use XSpec or BaseX for testing, you might be interested in the specific problems and solutions in these tales. Even if you don’t use those testing frameworks, you might find broader lessons that you can apply elsewhere.

Tale 1: The Cupboard Was Bare

The first tale is about XSpec tests for XSLT or XQuery code where a sequence is unexpectedly empty. Perhaps your test code tries to do something with the first paragraph of the third section of a document, but the document has only two sections. Empty sequences are a normal thing in XPath and the XML applications that use XPath, so you don’t necessarily get any feedback that the sought-after paragraph’s parent is nonexistent.

A sequence that is empty by mistake can lead to two categories of testing problems:

  (A) Tests pass despite the mistake, but they might not be doing what you think they’re doing. Tests that silently neglect to serve their core purpose give you a false sense of security about the health and coverage of the code you’re testing.

  (B) Tests fail due to the mistake, and the failures might be confusing for you to troubleshoot until you discover the mistake. This phenomenon is not as bad as the first one, but the confusion wastes your time.

Unexpectedly empty sequences can lead to unexpected results in non-testing situations, too, so the lesson we will learn about making assumptions explicit can be applied in other code where you use XPath.

Example of (A), Tests Passing Despite Mistake

Consider an XSpec scenario that calls an XSLT or XQuery function that produces a topic document having sections and paragraphs. XSpec stores the document in a variable named x:result. Suppose you expect at least three sections, and you want to verify that the third section does not contain any paragraphs. You can use XSpec syntax like one of these two <x:expect> elements:

Figure 1: Verifications that Assume Third Section Exists

                           
<x:expect label="The third section has no paragraphs"
  test="empty($x:result/topic/section[3]/para)"/>

<x:expect label="The third section has no paragraphs"
  test="$x:result/topic/section[3]/para" select="()"/>

The first syntax expresses a true/false condition using the empty function, and the verification passes if the condition is true. The second syntax uses the test attribute to filter the document down to a hypothetical paragraph in the third section of the topic and verifies that the filtering didn’t find anything—that is, the path expression led to the empty sequence that equals the select attribute.

When constructing these <x:expect> elements, presumably you thought $x:result/topic/section[3] would return at least one section. The label certainly describes the third section as if it exists. If it doesn’t exist, the verifications pass for the wrong reason, as follows:

  • The path expression $x:result/topic/section[3] returns an empty sequence.

  • The longer expression $x:result/topic/section[3]/para also returns an empty sequence.

  • The <x:expect> elements pass when you run the test.

  • You receive no feedback that existence of the third section was a false assumption.

  • You haven’t accomplished the objective of this test, which was to verify a characteristic of a section you thought existed.

Maybe the test really should have used a path like $x:result/topic/section[2]/para or $x:result/topic/descendant::section[3]/para, or a different value of the input argument to the function being tested. Maybe the XSLT or XQuery code has a bug that causes the third section to be missing. In any case, the lack of feedback makes it harder to discover the mistake.
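To see the vacuous pass outside XSpec, here is a minimal XQuery sketch. The two-section topic is a made-up stand-in for $x:result, not data from a real test.

```xquery
(: Hypothetical stand-in for $x:result: a topic with only two sections :)
let $result := document {
  <topic>
    <section><para>First</para></section>
    <section><para>Second</para></section>
  </topic>
}
return (
  (: false, because there is no third section :)
  exists($result/topic/section[3]),
  (: true, but only because the path ran out of nodes before reaching para :)
  empty($result/topic/section[3]/para)
)
```

The second expression is the same condition as the first <x:expect> in Figure 1, and it evaluates to true even though the section it describes does not exist.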

Example of (B), Mistake Hindering Failure Investigation

Working with the actual result of a test scenario is not the only place where path expressions can have mistakes. Consider an XSpec scenario for testing an XSLT or XQuery function named f:proto-list that produces list markup with one list item for each node in an input parameter. The scenario loads an XML document from an external file named test-document.xml and uses an XPath expression to select nodes as the value of the parameter.

Figure 2: Parameter Assumed to Contain Elements

                           
<x:scenario label="Generate list of tables">
  <x:call function="f:proto-list">
    <x:param name="items" href="test-document.xml"
      select="topic/section/table"/>
  </x:call>
  <x:expect label="List with one item per table">
    <itemizedlist>
      <listitem>
        <para>Table 1. Size chart</para>
      </listitem>
      <listitem>
        <para>Table 2. Color choices</para>
      </listitem>
    </itemizedlist>
  </x:expect>
</x:scenario>

Suppose the document in test-document.xml has no <table> elements at the specified path, for example because the tables are all in subsections rather than first-level sections. In this case, the function parameter named items is an empty sequence, and the test probably fails. While troubleshooting, you might spend a lot of time investigating the XSLT or XQuery code in the function before realizing that the problem is either in the test’s <x:param> element or in the test document.

In Search of Prevention

The XSpec vocabulary is fairly small and does not have dedicated features to alert you when a path expression you provide evaluates to an empty sequence. Using the XSpec vocabulary, you have these options for detecting such an empty sequence:

  • Insert extra <x:expect> elements to verify that a path evaluates to something non-empty, such as the following code to augment Figure 1. However, the extra verification leads to repetition or clutter in the test scenario.

                                  
    <x:expect label="Confirm that there is a third section"
      test="exists($x:result/topic/section[3])"/>

  • Use the as attribute on an XSpec element to declare a data type that can’t match an empty sequence (e.g., as="element()+"). For example, the <x:param> element below can replace the one in Figure 2.

                                  
    <x:param name="items" href="test-document.xml"
      select="topic/section/table" as="element(table)+"/>

    Declaring data types is generally a good idea. However, to attach an as attribute to an intermediate result, you’d have to declare it as a separate variable. Maybe that’s a good idea, too, or maybe it seems like overkill.

What if you want an unobtrusive way to ensure that an intermediate result isn’t empty, without adding extra elements to the test or extra rows in the test report?

XPath Syntax Options

It turns out that XPath already has concise syntaxes that provide a way to say, “Alert me when this path evaluates to an empty sequence.” You can slip them into any XPath expression that you use in XSpec. You can even use them with an intermediate result like a portion of a path expression, without having to define a separate XSpec variable.

  • The one-or-more function is a pass-through function for a nonempty input sequence but issues an error message for an empty input sequence.

  • The exactly-one function works the same way but is stricter in the way its name implies. You can use it where you expect a sequence of one item instead of zero or multiple items.

  • The treat as operator asserts that a sequence has a certain data type. Although the syntax is often less readable than the use of the functions mentioned above, an advantage of this operator is that it lets you express more than just cardinality.

Outside XSpec, these syntaxes are useful when an XPath processor performs static type checking and you want to promise the processor up-front that a sequence will satisfy a certain data type condition at run time. Making a strict processor relax its vigilance is a use case that comes up in the specification [KXP] and in authoritative books like those by Michael Kay [K2] and Priscilla Walmsley [W].

In the XSpec situation that this tale is about, avoiding a static type error from a strict processor is not what’s going on. Instead, it’s the other way around, where you want a lax processor to be vigilant and tell you when something doesn’t have the cardinality or data type you expect. Some references for these syntaxes, such as [K2], mention this other use case, and you might encounter non-testing situations where these syntaxes help you construct XPath expressions that alert you to severe missing-data problems. I learned about the testing value of these syntaxes from the XSpec lead developer, GitHub user AirQuick.
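The following XQuery sketch shows how the three syntaxes behave on a small, made-up topic element. The try/catch wrappers are only there so that one query can report every outcome; in an XSpec test you would normally let the error surface. The error codes are the ones the XPath specifications assign to these constructs.

```xquery
declare namespace err = "http://www.w3.org/2005/xqt-errors";

(: Hypothetical topic with two sections; the second has no paragraphs :)
let $topic :=
  <topic>
    <section><para>First</para></section>
    <section/>
  </topic>
return (
  (: exactly-one passes the second section through, so this is true :)
  empty(exactly-one($topic/section[2])/para),

  (: No third section: exactly-one raises err:FORG0005 :)
  try { empty(exactly-one($topic/section[3])/para) }
  catch err:FORG0005 { local-name-from-QName($err:code) },

  (: No tables: one-or-more raises err:FORG0004 :)
  try { count(one-or-more($topic/section/table)) }
  catch err:FORG0004 { local-name-from-QName($err:code) },

  (: treat as needs parentheses before the path continues;
     a type mismatch raises err:XPDY0050 :)
  try { empty(($topic/section[3] treat as element(section))/para) }
  catch err:XPDY0050 { local-name-from-QName($err:code) }
)
```

On a conforming XQuery 3.1 processor, this should return true followed by the strings FORG0005, FORG0004, and XPDY0050. Note the extra parentheses that treat as requires in the middle of a path, which is part of why the function forms tend to read more easily.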

XPath Syntax in the XSpec Examples

Returning to the earlier code examples, you can insert a call to the exactly-one function to make sure you find out if the third section fails to exist:

Figure 3: Verifications that Confirm Third Section Exists

                           
<x:expect label="The third section has no paragraphs"
  test="empty(exactly-one($x:result/topic/section[3])/para)"/>

<x:expect label="The third section has no paragraphs"
  test="exactly-one($x:result/topic/section[3])/para" select="()"/>

Also, you can insert a call to the one-or-more function to make sure you find out if the function parameter turns out to be empty:

Figure 4: Parameter Confirmed to Contain Elements

                           
<x:scenario label="Generate list of tables">
  <x:call function="f:proto-list">
    <x:param name="items" href="test-document.xml"
      select="one-or-more(topic/section/table)"/>
  </x:call>
  <x:expect label="List with one item per table">
    <itemizedlist>
      <listitem>
        <para>Table 1. Size chart</para>
      </listitem>
      <listitem>
        <para>Table 2. Color choices</para>
      </listitem>
    </itemizedlist>
  </x:expect>
</x:scenario>

With these changes, if the exactly-one or one-or-more function produces an error message when the test runs, the error gives you clear and valuable information that addresses the two categories of testing problems in this tale.

Tale 2: Good Functions Make Good Neighbors

The second tale is about XSpec tests for XSLT code where you want to test a lot of tiny contexts that each produce a tiny result, and you don’t want the contexts and results to be dwarfed by the amount of test code overhead. An example I came across in a real project [US] was a set of three template rules in a certain mode. Each template matched an attribute node and produced an attribute node. Working together, these templates determined the value of the resulting (i.e., output) attribute, based on the value of the matching (i.e., input) attribute. The template shown in Figure 5 had the most general match attribute of the three templates. This template acted as a fallback, mapping 11 distinct input values to the same output.[1]

Figure 5: XSLT Template for Nonspecific Data Types

                        
<xsl:template match="@as-type" mode="assign-json-type"
  as="attribute(in-json)">
  <xsl:attribute name="in-json">string</xsl:attribute>
</xsl:template>
<!-- Other template rules match @as-type with specific values -->

What are some ways to test 11 miniature mappings? The following simple XSpec scenario for one of the 11 mappings uses eight lines:

Figure 6: Scenario that Verifies One Mapping, No Reuse

                        
<x:scenario label="as-type='uuid'">
  <x:context mode="assign-json-type" select="/*/@as-type">
    <value as-type="uuid"/>
  </x:context>
  <x:expect label="maps to in-json='string'" select="/*/@in-json">
    <any-element in-json="string" xmlns=""/>
  </x:expect>
</x:scenario>
<!-- Add 10 analogous scenarios -->

Doing the same thing 11 times uses 88 lines, which seems like a lot of code for something so simple. XSpec supports reusing scenarios either verbatim or with variable substitutions, and using those techniques can reduce the code from 88 lines to 71 or 52 lines, respectively. Here’s how the variable-substitution idea would look, where you would have 11 scenarios like the first one below, and they would all reuse the second scenario. (Assume the prefix v is bound to some user-defined namespace URI for test-specific variables.)

Figure 7: Scenario that Verifies One Mapping, Reuse with Variable Substitutions

                        
<x:scenario label="as-type='uuid'">
  <x:variable name="v:this-type" select="'uuid'"/>
  <x:like label="adaptable mapping"/>
</x:scenario>
<!-- Add 10 scenarios analogous to the one above -->

<x:scenario label="adaptable mapping" shared="yes">
  <x:context mode="assign-json-type" select="/*/@as-type">
    <value as-type="{$v:this-type}"/>
  </x:context>
  <x:expect label="maps to in-json='string'" select="/*/@in-json">
    <any-element in-json="string" xmlns=""/>
  </x:expect>
</x:scenario>

Even 52 lines might seem too long.

Compact, Scalable Test with Neighboring Contexts

We can reduce the test code to one 19-line scenario, by putting all 11 input items in a single <x:context> element, as follows:

Figure 8: Scenario that Verifies All 11 Mappings

                           
<x:scenario label="as-type attribute with 11 values listed">
  <x:context mode="assign-json-type" select="/*/@as-type">
    <value as-type="uuid"/>
    <value as-type="markup-line"/>
    <value as-type="dateTime-with-timezone"/>
    <value as-type="string"/>
    <value as-type="token"/>
    <value as-type="uri"/>
    <value as-type="markup-multiline"/>
    <value as-type="uri-reference"/>
    <value as-type="email"/>
    <value as-type="empty"/>
    <value as-type="base64Binary"/>
  </x:context>
  <x:expect label="Each attribute of the context maps to in-json='string'"
    select="for $i in ($x:context) return /*/@in-json">
    <any-element in-json="string" xmlns=""/>
  </x:expect>
</x:scenario>

In this scenario, the context is a sequence of 11 as-type attribute nodes, because the select="/*/@as-type" attribute is selecting the attribute nodes from the child elements of <x:context>. XSpec loops over the 11 nodes and applies templates to each. XSpec gathers the actual results together into a sequence of 11 in-json attributes, and that sequence is what the <x:expect> element uses for verification. If verification fails, the report shows the full sequences of actual and expected results, and we need to determine which of the 11 comparisons failed; that extra labor is one reason I do not use multiple-item contexts in many situations.

In this case, the expected value is identical for all the context nodes; it’s an in-json="string" attribute node. The <x:expect> provides this node in one child element and uses select="for $i in ($x:context) return /*/@in-json" to replicate the attribute node once per context item. The notation $x:context refers to a variable that XSpec populates with the 11-item context. (In XPath 4.0, an alternative could be select="replicate(/*/@in-json,count($x:context))".)
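As a rough illustration of the replication idiom, the XQuery sketch below builds a three-item context by hand and repeats one expected attribute once per context item. The variable names and the three sample values are assumptions for this sketch; in XSpec, $x:context is populated for you.

```xquery
(: Hypothetical three-item context, standing in for $x:context :)
let $inputs := (
  <value as-type="uuid"/>,
  <value as-type="token"/>,
  <value as-type="uri"/>
)
let $context := $inputs/@as-type

(: The single expected attribute, like the child of <x:expect> :)
let $holder := <any-element in-json="string"/>

(: $i is never used; the loop only controls how many copies come back :)
let $expected := for $i in $context return $holder/@in-json

return (count($expected), distinct-values($expected))
```

The query returns 3 and the value string: the expected sequence has one in-json attribute per context item, and they all carry the same value.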

Now the scenario is fairly compact and would scale well if we needed to add a few more input values. This tale isn’t complete, though, because of an aspect of the scenario that might be problematic or seem philosophically questionable: the potential for the different items in the context to interfere with each other as the XSLT code accesses parts of the tree. Having the different items mingling in the XSpec markup makes the test code concise, but having them mingle when the XSLT runs is not desirable.

What makes interference during the XSLT execution a possibility is that the context is a sequence of 11 attribute nodes within a tree that includes all 11 elements as siblings of each other. To see evidence of this tree relationship, add the following message to the XSLT template and watch the console output from running the test.

Figure 9: Message to Show Contexts Have Tree Relationship

                           
<xsl:message expand-text="1">Preceding {
  local-name(..)} element has as-type={
  ../preceding-sibling::*[1]/@as-type/string()
  }</xsl:message>

Figure 10: Console Output Showing Tree Relationship

Preceding value element has as-type=
Preceding value element has as-type=uuid
Preceding value element has as-type=markup-line
Preceding value element has as-type=dateTime-with-timezone
Preceding value element has as-type=string
Preceding value element has as-type=token
Preceding value element has as-type=uri
Preceding value element has as-type=markup-multiline
Preceding value element has as-type=uri-reference
Preceding value element has as-type=email
Preceding value element has as-type=empty

Preventing Interference Among Contexts

How might a test make the attributes isolated, if they don’t start that way? XSpec lacks syntax for constructing new attribute nodes. XSpec does support helper functionality, such as an XSLT function that takes an attribute within a tree and creates a new, isolated attribute node.

However, a solution is even easier than that! In the XSLT 3.0 specification, the Streaming section lists functions named copy-of and snapshot that isolate parts of trees. While snapshot preserves the subtree’s ancestors and their attributes, copy-of returns the subtree only. Neither function includes siblings in its output. These functions are useful during stream processing, which imposes restrictions on access to nodes of a tree. Buffering a copy of a subtree in memory, with or without the ancestry of the subtree, enables freer access to it. The specification notes that each of these two functions “is available for use (and is primarily intended for use) when a source document is processed using streaming. It can also be used when not streaming” [K3].

This XSpec situation does not involve stream processing, and the two functions are useful not because of access restrictions or memory usage but for isolation and test cleanliness. If the <x:context> element in Figure 8 changes select="/*/@as-type" to select="/*/@as-type/copy-of()", the XSLT code sees isolated attributes instead of attributes of elements in a tree. The console messages from running the test no longer show the name of the element or its preceding sibling’s attribute value, because there is no element and hence no sibling element.

Figure 11: Console Output Showing Isolated Attribute Nodes

Preceding  element has as-type=
Preceding  element has as-type=
Preceding  element has as-type=
Preceding  element has as-type=
Preceding  element has as-type=
Preceding  element has as-type=
Preceding  element has as-type=
Preceding  element has as-type=
Preceding  element has as-type=
Preceding  element has as-type=
Preceding  element has as-type=

As a variation, changing select="/*/@as-type" to select="/*/@as-type/snapshot()" causes the XSLT code to see attributes that are attached to elements, but each element has no siblings. The console messages from running the test show the name of the element but not a preceding sibling’s attribute value, because there is no sibling element.

Figure 12: Console Output Showing Attribute Nodes with Ancestry

Preceding value element has as-type=
Preceding value element has as-type=
Preceding value element has as-type=
Preceding value element has as-type=
Preceding value element has as-type=
Preceding value element has as-type=
Preceding value element has as-type=
Preceding value element has as-type=
Preceding value element has as-type=
Preceding value element has as-type=
Preceding value element has as-type=

The point is that if siblings affected the XSLT template’s behavior in a way that could affect the test, using copy-of or snapshot in the XSpec code would take those siblings out of the view of the XSLT template. As a result, a test author would be able to write a compact scenario having a multiple-item context while preventing interference among the different items.
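The difference between an attribute inside a tree and an isolated attribute can also be seen in a small XQuery sketch. XQuery does not provide the XSLT copy-of function, so this sketch uses a computed attribute constructor as a stand-in that likewise yields parentless attribute nodes; it illustrates the isolation idea, not the copy-of function itself. The two-item context is made up for the example.

```xquery
(: Hypothetical wrapper playing the role of the <x:context> content :)
let $wrapper :=
  <context>
    <value as-type="uuid"/>
    <value as-type="token"/>
  </context>

(: Attributes still attached to sibling elements :)
let $in-tree := $wrapper/*/@as-type

(: Stand-in for copy-of: rebuild each attribute with no parent :)
let $isolated :=
  for $a in $in-tree
  return attribute { node-name($a) } { string($a) }

return (
  (: The second in-tree attribute can reach its neighbor: "uuid" :)
  $in-tree[2]/../preceding-sibling::*[1]/@as-type/string(),
  (: The second isolated attribute has no parent to navigate from: true :)
  empty($isolated[2]/..)
)
```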

Tale 3: Schema and Variations

The third tale is about testing a schema (say, Schematron, RelaxNG, or XSD) with easier maintenance of the documents that support the tests. One way to test a schema is to validate a series of valid and invalid documents, and check that the validation results are what you expect. The valid and invalid documents might be related to each other. I like to test with invalid documents that are invalid for exactly one reason, which means each invalid document has a lot in common with a valid one.

If you follow that approach, you have a set of documents and a set of variations that make invalid documents into valid ones or vice versa. To ease maintenance, it would be nice to derive the variation documents programmatically from the originals, instead of doing manual copy-and-modify operations and then maintaining all the documents independently. Of course, the code that derives variations programmatically is something to maintain, so you want that code to be easier to maintain than the variations as independent documents.

You can certainly write some XSLT or XQuery code to create minor variations of documents. For instance, you can start with an identity transform and implement a system for specifying and then creating the variations you want. However, you don’t have to start from scratch, because there is already a standard way to create minor variations of documents. The XQuery Update Facility standard provides expressions that insert, delete, replace, and rename nodes. In addition to supporting that standard, the BaseX XQuery processor offers its own convenience operator for making updates with a streamlined syntax.

In the XQuery Update Facility 1.0 Requirements document [C], the first usage scenario is about updating persistent storage like a database. Other usage scenarios describe updates in the literal sense of bringing something up to date by adding new information or refreshing a status. While the requirements are not limited to time-oriented updates, newness is prominent in the descriptions. The testing usage in this tale originates from a different mindset. When creating variants of a document for schema testing, the point is not that a variant has fresher content but rather that it serves a different testing purpose compared to the original document. The variant is not necessarily better, and it’s up to you whether to programmatically produce an invalid variant from a valid document or a valid variant from an invalid document. You might pick a consistent direction of operation for your entire test suite or decide per document which direction is simpler to code.

Creating the Variations

Here are two examples of XQuery Update code for creating variants as persistent files, where the file:write and file:base-dir functions are specific to BaseX.

Figure 13: Renaming code as literal Makes a Valid Document Invalid

                           
copy $s := doc('original/valid-123.xml')
modify (
  rename node $s/d:article//d:code
    as 'd:literal'
)
return file:write(
  file:base-dir() || "gen/invalid-123-literal.xml", $s
)

Figure 14: Moving abstract Makes an Invalid Document Valid

                           
copy $s := doc('original/invalid-456.xml')
modify (
  insert node ($s/d:article/d:abstract)
    before $s/d:article/d:info/d:author,
  delete node $s/d:article/d:abstract
)
return file:write(
  file:base-dir() || "gen/valid-456-abstract-moved.xml", $s
)

Both examples use this three-step procedure:

  1. Read the original document using doc(), and keep a copy in memory.

  2. Modify the copy in memory, using a sequence of one or more expressions from the XQuery Update vocabulary. If the invalid documents are nearly valid by design, these expressions are likely to be simple and few. The first example renames <d:code> elements as <d:literal>, assuming the query contains a namespace declaration that binds the d prefix to the namespace URI that the document from step 1 also uses. The second example moves a <d:abstract> element, by copying it to some location and deleting the original.

  3. Write the result to a file different from the original file. Unlike some applications of XQuery Update, this situation does not modify the original file in place.
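To experiment with steps 1 and 2 without touching the file system, you can run a copy/modify expression on an inline document and inspect the result. The sketch below is an assumption-laden illustration: the tiny article is invented, and it needs a processor that supports XQuery Update. It also loops over the <d:code> elements, because the rename node expression requires its target to be exactly one node; the loop is only needed when a document might contain several.

```xquery
declare namespace d = "http://docbook.org/ns/docbook";

(: Hypothetical inline document with two <d:code> elements :)
let $original :=
  <d:article>
    <d:para>Call <d:code>f()</d:code> and then <d:code>g()</d:code>.</d:para>
  </d:article>
return
  copy $s := $original
  modify (
    (: One rename per target node :)
    for $c in $s//d:code
    return rename node $c as 'd:literal'
  )
  return (
    count($original//d:code),  (: 2: the original is untouched :)
    count($s//d:code),         (: 0: every code element was renamed :)
    count($s//d:literal)       (: 2 :)
  )
```

Because the modifications apply only to the copy, the original remains available for its own validation test.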

If you don’t mind using even more BaseX-specific functionality, you can streamline the syntax using the BaseX convenience operator, update [XB]. The three steps are the same, but they look a bit different in this syntax. Here is how the expressions above look using the update operator:

Figure 15: BaseX-Specific: Renaming code as literal Makes a Valid Document Invalid

                           
file:write(
  file:base-dir() || "gen/invalid-123-literal.xml",
  doc('original/valid-123.xml') update {
    rename node /d:article//d:code
      as 'd:literal'
  }
)

Figure 16: BaseX-Specific: Moving abstract Makes an Invalid Document Valid

                           
file:write(
  file:base-dir() || "gen/valid-456-abstract-moved.xml",
  doc('original/invalid-456.xml') update {
    insert node (/d:article/d:abstract)
      before /d:article/d:info/d:author,
    delete node /d:article/d:abstract
  }
)

Using the Variations

If you have a module of functions that each use this three-step procedure, you can run all the functions in the module to create all the variations you need. With the variation files in hand, you are ready to run validation tests against the set of original documents and generated documents. The validation tests themselves can use whatever testing functionality you like, such as the Schematron support in XSpec or the BaseX modules for validation and unit testing.

Advantages of separating the variant creation from the validation testing include:

  • Variant files in the file system are easy to inspect while you are fine-tuning your XQuery Update expressions and easy to validate manually while troubleshooting test failures.

  • You can create variant documents using one processor and run validation tests using a different one. For example, you can use BaseX to create variant documents and then use Saxon to run XSpec tests for a Schematron schema. (Saxon supports XQuery Update but requires an enterprise license.)

On the other hand, disadvantages include:

  • There are two processes to manage, and they must run sequentially.

  • The variant creation process needs write permission in whatever location it creates the files.

If you think the disadvantages outweigh the advantages, you can combine the XQuery Update expressions with the validation tests themselves. Here, we show an example that illustrates the combination approach, using two BaseX modules: validation and unit testing.

First, we declare namespaces and a global variable that stores the path to a RelaxNG schema.

Figure 17: Initial Declarations

                           
module namespace mytest = "my-module-namespace-uri";
declare namespace d = "http://docbook.org/ns/docbook";
declare variable $mytest:rng-schema := 'balisage-1-5.rng';

Although it is not required, we find it useful to define helper functions that perform repeated tasks. These helper functions validate documents using the validate:rng-info function [V] and make test assertions using the unit:assert function [UM]. The validate:rng-info function returns a string sequence containing warnings and errors, if any. The test assertion hinges on whether this string sequence is empty (that is, the document is valid) or non-empty. In the unit:assert function call, the second argument is output in case of a test failure, so it should be something that helps you investigate the failure.

Figure 18: Helper Functions for Validation and Test Assertion

                           
declare function mytest:expect-valid(
$to-validate as item()
) as empty-sequence() {
  let $val := $to-validate => validate:rng-info($mytest:rng-schema)
  return
    unit:assert(empty($val), $val)
};

declare function mytest:expect-invalid(
$to-validate as item()
) as empty-sequence() {
  let $val := $to-validate => validate:rng-info($mytest:rng-schema)
  return
    unit:assert(exists($val), 'Valid when it should have been invalid')
};

If you start with a valid document and use XQuery Update to make it invalid, your test functions look like the following pair. The second function in the pair uses XQuery Update code identical to Figure 13. Each function uses one of the helper functions defined above.

Figure 19: Unit Tests for Valid Doc with Invalid Variation

                           
declare %unit:test function mytest:valid-sample-doc() {
  (: As is, the file is valid :)
  mytest:expect-valid('original/valid-123.xml')
};

declare %unit:test function mytest:xqupdate-makes-invalid-literal() {
  copy $s := doc('original/valid-123.xml')
    modify (
    (: Renaming <code> as <literal> makes the file invalid :)
    rename node $s/d:article//d:code
      as 'd:literal'
    )
    
    return
      mytest:expect-invalid($s)
};

Going in the other direction, if you start with an invalid document and use XQuery Update to make it valid, your test functions look like the following. The second function in the pair uses XQuery Update code identical to Figure 14.

Figure 20: Unit Tests for Invalid Doc with Valid Variation

                           
declare %unit:test function mytest:invalid-sample-doc-abstract() {
  (: As is, the file is invalid :)
  mytest:expect-invalid('original/invalid-456.xml')
};

declare %unit:test function mytest:xqupdate-makes-valid-abstract() {
  copy $s := doc('original/invalid-456.xml')
    modify (
    (: Moving <abstract> makes the file valid :)
    insert node ($s/d:article/d:abstract)
      before $s/d:article/d:info/d:author,
    delete node $s/d:article/d:abstract
    )
    
    return
      mytest:expect-valid($s)
};

If you run the test module in BaseX, the result looks like this:

Figure 21: BaseX Test Output for Passing Tests

<testsuites time="PT0.049S">
  <testsuite name="file:/.../generate-in-memory-and-then-test.xqm"
    time="PT0.049S" tests="4" failures="0" errors="0" skipped="0">
    <testcase name="valid-sample-doc" time="PT0.007S"/>
    <testcase name="xqupdate-makes-invalid-literal" time="PT0.01S"/>
    <testcase name="invalid-sample-doc-abstract" time="PT0.007S"/>
    <testcase name="xqupdate-makes-valid-abstract" time="PT0.01S"/>
  </testsuite>
</testsuites>

If you cause deliberate test failures by interchanging mytest:expect-invalid with mytest:expect-valid in the two XQuery Update test functions, the failures produce <failure> elements like the following. The line and column numbers point to the helper functions, which is not especially useful when all the test functions share the same helpers. The child <info> elements are more informative: they contain the second argument of unit:assert, which is either the validator's messages or the fixed message from mytest:expect-invalid.

Figure 22: BaseX Test Output for Failing Tests

<testsuites time="PT0.048S">
  <testsuite name="file:/.../generate-in-memory-and-then-test.xqm"
    time="PT0.048S" tests="4" failures="2" errors="0" skipped="0">
    <testcase name="valid-sample-doc" time="PT0.007S"/>
    <testcase name="xqupdate-makes-invalid-literal" time="PT0.01S">
      <failure line="13" column="16">
        <info>file:/.../original/valid-123.xml, 20:98:
          element "d:literal" not allowed anywhere; expected the
          element end-tag, text or element "blockquote", "citation",
          "code", "email", "emphasis", "equation", "figure", "footnote",
          "informaltable", "inlinemediaobject", "itemizedlist", "link",
          "mediaobject", "note", "orderedlist", "phrase", "programlisting",
          "quote", "subscript", "superscript", "table", "trademark",
          "variablelist" or "xref"
        </info>
      </failure>
    </testcase>
    <testcase name="invalid-sample-doc-abstract" time="PT0.007S"/>
    <testcase name="xqupdate-makes-valid-abstract" time="PT0.009S">
      <failure line="19" column="16">
        <info>Valid when it should have been invalid</info>
      </failure>
    </testcase>
  </testsuite>
</testsuites>
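
Because the line and column numbers point into the shared helpers, one possible refinement is to let each caller pass a short label that appears in the <info> text. This could help when a single test function makes several assertions. The following is a minimal sketch and not part of the test module described in this paper: the function names and the $label parameter are hypothetical, and the code assumes the same module prolog as the helpers in Figure 18 (the mytest namespace and the $mytest:rng-schema variable).

```xquery
(: Hypothetical variants of the Figure 18 helpers. The function names and
   the $label parameter are assumptions made for illustration. :)
declare function mytest:expect-valid-labeled(
  $to-validate as item(),
  $label as xs:string
) as empty-sequence() {
  let $val := $to-validate => validate:rng-info($mytest:rng-schema)
  return
    (: On failure, report the label followed by the validator messages :)
    unit:assert(empty($val), $label || ': ' || string-join($val, '; '))
};

declare function mytest:expect-invalid-labeled(
  $to-validate as item(),
  $label as xs:string
) as empty-sequence() {
  let $val := $to-validate => validate:rng-info($mytest:rng-schema)
  return
    (: On failure, report the label followed by a fixed explanation :)
    unit:assert(exists($val), $label || ': valid when it should have been invalid')
};
```

A test function would then call, for example, mytest:expect-invalid-labeled($s, 'code renamed as literal'), so that a failure's <info> element names the variation that produced it.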

Conclusion

Testing your code helps you keep it in a high-quality state, but test files and their supporting files also require care and maintenance. These three tales illustrated specific ways to guard against a test passing for the wrong reason, to prevent interference among test cases that you place near each other for compactness, and to reduce the manual copy-and-modify operations you perform on the documents you use for schema testing. At a higher level, the specific techniques illustrate how beneficial it is that the testing frameworks in the XML space rest upon the standards in that space. Even if your testing task is unrelated to static type checking, streaming, or data refreshing, your tests can benefit from functionality that those usage scenarios motivated.

References

[B] BaseX, https://basex.org/

[C] Chamberlin, Don and Jonathan Robie, Eds. XQuery Update Facility 1.0 Requirements, W3C Working Group Note 25 January 2011, https://www.w3.org/TR/xquery-update-10-requirements/

[K2] Kay, Michael. XSLT 2.0 and XPath 2.0 Programmer’s Reference, 4th Edition. Wiley: Indianapolis, IN, 2008.

[K3] Kay, Michael, Editor. XSL Transformations (XSLT) Version 3.0, https://www.w3.org/TR/xslt-30/

[KXP] Kay, Michael, Editor. XPath and XQuery Functions and Operators 3.1, https://www.w3.org/TR/xpath-functions-31/

[UM] “Unit Functions,” BaseX documentation, https://docs.basex.org/main/Unit_Functions

[US] US NIST metaschema-xslt repository, pull request 87, https://github.com/usnistgov/metaschema-xslt/pull/87/files. License: https://creativecommons.org/publicdomain/zero/1.0/

[XB] “Updates,” BaseX documentation, https://docs.basex.org/main/Updates

[V] “Validation Functions,” BaseX documentation, https://docs.basex.org/main/Validation_Functions

[W] Walmsley, Priscilla. XQuery, 2nd Edition. O’Reilly: Sebastopol, CA, 2015.

[XS] XSpec, https://github.com/xspec/xspec



[1] The sample code in this section is adapted from the NIST repository containing [US]. That repository is in the worldwide public domain.

Author's keywords for this paper:
XSpec; BaseX; XSLT; XQuery; Schematron; Software testing; Schema testing