Introduction

Given the amount of XML in the world, and the amount of JSON, it is hardly surprising that there should be a high demand for good quality XML to JSON conversion tools. We needn’t go into great detail to understand why: XML is widely used for intra-enterprise and inter-enterprise dataflows, and has many strengths that make it well suited to that role; but JSON is easier to consume in conventional programming languages, and most especially in Javascript.

It’s also not difficult to see why most existing libraries do the job rather badly. There isn’t a good one-to-one fit between the data models. There are real tensions between the requirements that different libraries are trying to satisfy, and they have made different design compromises as a result, with the effect that the JSON they produce tends to please no-one.

We’re not the first to suggest that the key to doing a better job is to make the conversion schema-aware. In the XSLT/XPath world we have a real opportunity to deliver that, because we already have all the infrastructure for processing schema-aware XML. A function that performs schema-aware conversion has recently been specified for inclusion in XPath 4.0, and we have written a prototype implementation. This paper describes how it works, and compares its results to those produced by other established tools.

At the time of writing, the specification of the proposed function, xdm-to-json(), is available only as a GitHub pull request[1].

Why do people convert XML to JSON?

It may seem like a naive question, but unless we understand what people are trying to achieve, we’re going to produce a poor design.

I[2] think there are two main scenarios we need to consider.

The first is when you’re sitting on a lot of XML data, and you need to provide that information to an established interface that requires JSON. You’re going to have to produce exactly the JSON that this interface expects, and it might look nothing like the XML that you start with. This is essentially the requirement that the xml-to-json() function in XSLT 3.0 was designed to satisfy. It’s likely that you’re not just changing the syntax of the way the data is represented, you’re probably doing a semantic transformation as well. You may well be filtering, grouping, sorting, and aggregating the data at the same time as you’re converting it from XML to JSON.

So what we did in XSLT 3.0 was to design an XML vocabulary that’s a literal translation of the JSON data model. If you want a JSON array, you create a <j:array> element; if you want a JSON object, you create a <j:map>, and so on. You then use all the XSLT transformation machinery at your disposal to create this XML-ified JSON, and call xml-to-json() to turn it into the real thing.
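
For example, this XML-ified JSON (the element names are defined by the XSLT 3.0 specification, in the namespace http://www.w3.org/2005/xpath-functions, here bound to the prefix j; the data is illustrative):

<j:map xmlns:j="http://www.w3.org/2005/xpath-functions">
   <j:string key="name">John</j:string>
   <j:array key="scores">
      <j:number>7</j:number>
      <j:number>9</j:number>
   </j:array>
</j:map>

when passed to xml-to-json() yields:

{"name":"John","scores":[7,9]}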

The assumption here, and I think it’s probably correct, is that if you’re working to a specification or schema of the JSON that you need to produce, you need a tool that gives you precise control, and xml-to-json() is such a tool.

It’s worth remembering though that there is an alternative: you can produce arrays and maps directly from your XSLT code, and then use the JSON serialization method to turn them into JSON. That’s probably a more attractive option in XSLT 4.0, which brings improvements in the facilities for manipulating maps and arrays.
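
A minimal sketch of that alternative, using the fn:serialize function with an options map (the data is illustrative):

(: construct maps and arrays directly, then serialize them as JSON :)
serialize(
   map { "name": "John", "scores": [ 7, 9 ] },
   map { "method": "json" }
)
(: returns {"name":"John","scores":[7,9]} :)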

In fact, the XSLT 3.0 xml-to-json() function is probably most useful in conjunction with its counterpart, json-to-xml(). As I tried to show in [Kay 2016] and [Kay 2022], doing JSON-to-JSON transformations in XSLT 3.0 can be surprisingly tricky. We’re aiming to address this in XSLT 4.0[3], but in the meantime, converting the JSON to XML, transforming the XML, and then converting back to JSON is sometimes the easiest way.
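
The round trip is mechanical at both ends; a minimal sketch, with the interesting transformation step omitted from the middle:

(: parse the JSON into the standard XML vocabulary, transform the XML
   (not shown), then serialize it back to JSON :)
xml-to-json(json-to-xml('{"n": 1}'))
(: returns '{"n":1}' :)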

There’s a second scenario where XML to JSON conversion is needed, and that’s what I want to talk about in this paper. Here the reason for doing the conversion is not that you’re feeding the data to an existing system that requires a data feed in JSON, but simply because you (or your team-mates) find it more convenient to work with JSON data. The reasons for that convenience may be many and varied, and they may be real or perceived; the fact is, there are many people who want to do this, for a variety of valid reasons.

I’ll give one example. In Saxonica we developed an XML representation of compiled stylesheets (called stylesheet export files — SEFs), that could be exported from an XSLT compiler on one platform, and delivered to a run-time engine on another. One of those run-time platforms is the web browser, where we need to consume the data in Javascript[4]. There are two ways to consume XML in Javascript, and both are a bit of a nightmare. One is to build a DOM tree in memory, and navigate the DOM every time you want to access some data. The other is to build a DOM tree in memory, and then bulk convert it into native Javascript data structures (objects and arrays) for subsequent manipulation.

We started with the first approach, but it’s very clumsy. To get a property of an object, you don’t want to write x.getAttribute("y"), you want to write x.y. The Javascript code for doing complex navigation is verbose, and the XPath equivalent is slow. On top of that, attribute values can only be strings, not integers or booleans, so there’s a lot of conversion involved every time you access an attribute.

Converting the XML to Javascript data structures on receipt confines the complexity of the DOM code to one small part of the application. But why do it at all? If we convert the XML to JSON server-side, we don’t need any code at all on the client to get it into native Javascript form; it just happens.

So we decided to convert the XML to JSON (in fact, we do this on-the-fly in the back end of the compiler; the XML is there, but it never sees the light of day); and we had complete freedom of choice over what the JSON should look like. At the risk of going off at a bit of a tangent, it may be worth showing the mapping that we use.

Here’s an example. If we’re compiling the XPath expression count($x) + 1, the XML representation of the compiled code in SEF format looks like this:

<arith role='action' baseUri='file:/Users/mike/.../test.xsl' 
       ns='xsl=~ xml=~' line='4' op='+' calc='i+i'>
   <fn name='count'>
      <gVarRef name='Q{}x' bSlot='0'/>
   </fn>
   <int val='1'/>
</arith>

Let’s explain this briefly.

The outermost element <arith> says we’ve got an arithmetic expression. op="+" says it’s an addition; calc="i+i" says it’s adding two integers. The role attribute indicates where this expression fits in as an operand of the enclosing construct, which we haven’t shown in this specimen. The ns attribute is namespace information, and the rest is for diagnostics.

The <arith> expression has two operands, represented by child elements. The first operand is a function call on fn:count, and the second is the integer literal 1 (one); both of these should be fairly self-evident. (The bSlot attribute identifies the slot allocated to the global variable.)

When we output the same compiled expression as JSON, this is what we get:

{
   "N": "arith",
   "role": "action",
   "baseUri": "file:/Users/mike/.../test.xsl",
   "ns": "xsl=~ xml=~",
   "line": "4",
   "op": "+",
   "calc": "i+i",
   "C": [
      {
         "N": "fn",
         "name": "count",
         "C": [
            {
               "N": "gVarRef",
               "name": "Q{}x",
               "bSlot": "0"
            }
         ]
      },
      {
         "N": "int",
         "val": "1"
      }
   ]
}

It’s a very straightforward and mechanistic mapping of the XML. Every element becomes a JSON object (map). The attributes of the element become properties in that map. The element name becomes a property named "N", and the children of the element turn into a property named "C", whose value is an array of objects corresponding to the child elements.[5]

For our particular purposes, that mapping works well. But if we try to apply the same mapping to other XML vocabularies, it doesn’t work so well at all, and I’ll try to explain why in the following sections.

Before we do that, however, let’s see what some other XML-to-JSON converters do with this XML. There are plenty of free online converters to choose from, and most of the half-dozen that I tried produced very similar results. I won’t mention one or two whose output was excruciatingly bad.

Here’s the result from one such converter[6]:

{
  "@role": "action",
  "@baseUri": "file:/Users/mike/Desktop/temp/test.xsl",
  "@ns": "xsl=~ xml=~",
  "@line": "4",
  "@op": "+",
  "@calc": "i+i",
  "fn": {
    "@name": "count",
    "gVarRef": {
      "@name": "Q{}x",
      "@bSlot": "0"
    }
  },
  "int": {
    "@val": "1"
  }
}

A couple of points to note here. Firstly, it’s dropped the outermost element name, "arith" (some of the converters produce an outer wrapping map with a single property that retains the element name). But more importantly, the conversion has lost information about the order of the operands of the arithmetic expression (because once the JSON has been loaded into a Javascript object, the order of properties has no significance). This might not matter for addition, but it would certainly matter for subtraction.

If we do a little experiment to change the expression to count($x)+count($y), it gives us a completely different structure:

"fn": [
    {
      "@name": "count",
      "gVarRef": {...}
    },
    {
      "@name": "count",
      "gVarRef": {...}
    }
]

What has happened here is that because both the operands of the arithmetic expression are function calls (<fn> elements), it has assumed the structure is homogeneous, and has given us an array.

If we try it on an expression with three operands such as concat(name(), 1, name()), it puts the two function-call operands into an array in one property ("fn"), and the integer argument into a second property ("int"), completely losing information about the order of elements in the XML.
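
Schematically, the result has this shape (an illustration of the structure just described, not the converter’s verbatim output):

{
  "fn":  [ {...}, {...} ],
  "int": {...}
}

Two of the operands end up in one property and the third in another, and nothing records that the <int> element came between the two <fn> elements.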

We can summarise the problem like this: it’s guessing what the semantics of the object model are that lie behind the lexical XML, and it’s guessing wrong.

And the reason that our actual mapping for SEF files works better is that we didn’t have to guess what the semantics were: we knew the semantics when we designed the mapping.

I tried half-a-dozen other free converters and most of them produced very similar output, with the same structural problems. Indeed, I’m sorry to say that our friends at Oxygen have an online converter with exactly the same faults. It seems to be a common assumption that if you want to convert your XML to JSON, then you don’t care what order your elements appear in.

Perhaps the reasoning behind this is that people who want their XML data converted to JSON are likely to be using the kind of data that JSON is well suited to? Well, perhaps that theory works for some data, but it doesn’t work for ours.

A naive response to these problems would be to say that in XML, the order of elements is always significant, so it should always be retained. But that’s simply not the case. If you’ve got XML that looks like this:

<location>
   <longitude>03° 19′ 20″ W</longitude>
   <latitude>50° 37′ 45″ N</latitude>
</location>

then the order of children really doesn’t matter, and it’s fine to convert it to

{
  "location": {
    "longitude": "03° 19′ 20″ W",
    "latitude":  "50° 37′ 45″ N"
  }
}

rather than to

{
  "location": [
    {"longitude": "03° 19′ 20″ W"},
    {"latitude": "50° 37′ 45″ N"}
  ]
}

Clearly to achieve the best possible conversion, we need a converter that can be given more information about the data model: for example, whether or not the order of elements is significant.

Goessner’s Seven Flavors

Stefan Goessner, at [Goessner 2006], published an interesting analysis identifying seven "flavors" of XML element, each of which requires a different mapping to JSON. Putting my own gloss on his seven flavors, and giving them my own names, they are:

  • Empty elements ("empty").

    <br/> becomes {"br":null} (or, if you prefer, {"br":""} or {"br":[]} or {"br":{}}).

  • Elements with simple text content ("simple").

    <name>John</name> becomes {"name":"John"}.

  • Empty elements with attributes ("empty-plus").

    <range min="0" max="10"/> becomes {"range":{"@min":0,"@max":10}. (Do we need to retain the @ sign for properties derived from attributes? Probably not, but it has become a convention.)

  • Elements with text content plus attributes ("simple-plus").

    <length unit="cm">20</length> becomes {"length": {"@unit":"cm", "#text":"20"}}.

  • Elements whose children have distinct names ("record").

    <loc><lat>50° 37′ 45″ N</lat><long>03° 19′ 20″ W</long></loc> becomes {"loc":{"lat":"50° 37′ 45″ N", "long":"03° 19′ 20″ W"}}.

  • Elements whose children all have the same name ("list").

    <ol><li>x</li><li>y</li></ol> becomes {"ol":{"li":["x","y"]}}.

  • Elements with mixed content ("mixed").

    <p>H<sub>2</sub>O</p> becomes (perhaps) {"p":["H", {"sub":"2"}, "O"]}. Goessner initially suggests {"p":{#text":["H", "O"], {"sub":"2"}]} and then points out that this is obviously inadequate; unfortunately some of the online converters appear to have followed his first suggestion without reading to the end of the article. Goessner ends up suggesting retaining the mixed content as XML, so the result becomes {"p":"H<sub>2</sub>O"}

My first observation on this analysis is that it’s useful but clearly incomplete: it doesn’t include the case I presented earlier, where we have element-only content, in which the order of elements is significant. We can treat that as a special case of mixed content, but the semantics are rather different.

What these categories reveal is that there are different ways of using XML to represent the data in an object model, and it’s not in general possible to reverse-engineer the object model from the XML representation. But to produce the most usable representation of the data in JSON, we do need an understanding of the data model that the XML represents.

The analysis also prompts questions about the role of element and attribute names. If we think in terms of entity-relationship modelling, then in JSON, ERM entities clearly correspond to JSON objects, and ERM attributes correspond to JSON properties. In XML, ERM entities translate to elements and ERM attributes might be represented either as XML attributes or as child elements. So what role do XML element names play?

It seems that XML element names usually fall into one of two categories. Sometimes the XML element name denotes the real-world class of thing that it is representing: <p> represents a paragraph, <address> represents an address. Other times, the XML element name identifies a property of the parent element: <longitude> is a property of a geographic location, not a type of entity in its own right. Sometimes an element name can serve both purposes at once (we can think of <city> both as a property of an address, and as a type of entity in its own right). Sometimes we need to name both the class and the property (or relationship), and there are two popular conventions for doing this: in the SEF tree described earlier, we represent an arithmetic expression as <arith role="action">..</arith>, but other vocabularies for similar data use <action type="arith">...</action>. Both are perfectly legitimate in XML, but to generate good JSON, you need to know which convention is being used.

Sometimes element names are completely redundant. It really doesn’t matter whether the child elements of <list> are called <item> or <li> or <_>; the name tells us nothing, and is only there because XML requires elements to be named. (Nevertheless, XML element names are useful handles when it comes to writing XSLT patterns or XSD type declarations.) This is one reason that XML-to-JSON converters have trouble knowing what to do with the outermost element name in the document: names in JSON are always property names, and there is no natural place in JSON to put a name that is being used primarily as a type name.

Choosing a Flavor

If we decide that every element should be classified into one of Goessner’s seven flavors, and that this should determine the JSON representation used, we immediately hit problems. Consider for example an element that has exactly one element child: <x><a>23</a></x>. Is this an example of flavor 5 ("record"), or flavor 6 ("list"), or is it perhaps a degenerate case of flavor 7 ("mixed")?

To make the JSON as usable as possible, we want to ensure that all <x> elements are converted in the same way. If every <x> element fits the "list" flavor, then we should use the "list" flavor consistently. It gets very inconvenient for downstream processing if lists of two or more items are represented using arrays, but lists of length 0 or 1 are represented in some other way. But this implies that we make the decision based not on one individual <x> element, but on a large sample of <x> elements. We could consider examining all the <x> elements in the document being converted; but we could also go further, and look at the general rules for <x> elements that might be found in a schema.
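
To illustrate the kind of analysis involved, here is a rough XPath sketch of a "uniform" test for whether every <x> element in a document could safely use the "list" flavor (my own sketch; the draft expresses its rules differently):

(: true if every x element has no attributes, no significant text,
   and children that all share a single name :)
every $x in //x satisfies (
   empty($x/@*)
   and empty($x/text()[normalize-space()])
   and count(distinct-values($x/*/local-name())) le 1
)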

One way of forcing a particular flavor for particular elements is to use data binding technology to map the XML to data structures in a programming language such as Java or C#, and then to serialize these data structures into JSON. The mapping from XML to Java or C# (often called "unmarshalling") can be controlled using an XML schema, augmented with annotations either in the XML schema, or in the Java/C# code, or both. Some tools, indeed, allow elements in the input document to be annotated with attributes in a special namespace to steer the conversion.

One of the most popular tools for serious XML to JSON conversion is the Java Jackson library[7], which adopts this approach. It undoubtedly gives a great deal of control, but the downside is that it’s a two-stage approach: first the XML has to be mapped to Java classes, then the Java classes need to be serialized (or "marshalled") to JSON. It also inherits the biggest drawback of XML data binding technology, which is that it becomes very inflexible if the schema for the data evolves over time.

The Newtonsoft Json.NET library[8] offers similar functionality for the C# world.

A new XML-to-JSON converter for XSLT 4.0

We’ve seen that the existing xml-to-json() conversion function in XPath 3.1 essentially relies on a two-stage process: first the XML is transformed "by hand" into an XML vocabulary that explicitly represents the desired target JSON using elements such as <j:array>, <j:map>, <j:string>, and <j:number>; and then the xml-to-json() function is called to serialize this structure as lexical JSON.

For 4.0 we want to provide something that’s rather less effort to use, even if it doesn’t offer the same level of control. In this section I will present details of the function that the XSLT 4.0 Community Group is working on (the working name is "xdm-to-json", but it may well change). This is work in progress and the specification is subject to change.

This mapping is designed with a number of objectives:

  • It should be possible to represent any XML content (including mixed content) in JSON.

  • The resulting JSON should be intuitive and easy to use.

  • It should be possible to get good results when all options are defaulted, and perfect results by manual tweaking of options.

  • The JSON should be consistent and stable: small changes in the input should not result in large changes in the output.

Achieving all these objectives requires design compromises. It also imposes constraints. In consequence:

  • The conversion is not lossless.

  • The conversion is not streamable.

  • The results are not necessarily compatible with those produced by other popular libraries.

I’ll consider here only the most complex part of the function, which is the conversion of element nodes to JSON maps.

The conversion selects one of 12 "layouts" for converting each element. These are based on Goessner’s seven flavors, with some refinements and additions:

  • For handling mixed content, we provide the option of serializing the content as lexical XML, XHTML, or HTML, wrapped inside a JSON string (giving three additional flavors). This is often a useful way of handling small quantities of rich text appearing within structured data, such as product descriptions in a product catalog.

  • We provide two variations on list content (elements whose children are all elements with the same name): "list" doesn’t allow attributes, and can therefore drop the child element names, while "list-plus" allows attributes and retains them. In "list" layout <list><a>alpha</a><a>beta</a><a>gamma</a></list> becomes {"list":["alpha", "beta", "gamma"]}, while in "list-plus" layout, <list class="c"><a>alpha</a><a>beta</a><a>gamma</a></list> becomes {"list":{"@class":"c","a":["alpha", "beta", "gamma"]}}.

  • We add "sequence" layout for elements with element-only content, where the order of children is significant and duplicates are allowed. An example might be sections comprising a heading plus an unbounded sequence of paragraphs. Another example would be the SEF syntax tree we presented at the start of this paper.

The proposed function provides four approaches to selecting the most appropriate layout for each element.

  • Automatic selection based on the form of each element (for example, how many children it has, whether they are uniformly or uniquely named, whether there are any attributes). Specifically, we define an XSLT pattern for each layout, and the first layout whose pattern matches the particular element node is selected (see the illustrative sketch after this list).

  • "Uniform" selection: this is similar, but relies on examining all the elements with a given name, and choosing a single layout that works for all of them. This approach ensures that a list with zero or one members is formatted in the same way as a list with two or more members. The decision is still made automatically, but it involves rather deeper analysis, with a corresponding performance penalty. It’s worth noting a limitation: while this approach will choose a consistent JSON representation for all elements within one input document, it won’t ensure that two different input documents will be converted in the same way.

  • Schema-aware selection. Here the selection is based not on the actual structure of element instances in the input document, but rather on their XSD schema definition. This approach therefore relies on schema-aware XPath processing. There are two main advantages over "uniform" selection: firstly, using the schema ensures that the same JSON representation is used consistently across all documents being converted. Secondly, the schema provides some hints as to the underlying object model; for example, it’s a reasonable assumption (though not guaranteed) that if the schema requires elements to appear in a particular order, then order is significant, while if it allows any order, then it isn’t. I’ll talk more about schema-aware layout selection in the next section of the paper.

  • Explicit manual selection. The proposed function provides a parameter that allows a specific layout to be chosen for a given element name, overriding any automatic choice based on either the instance structure or the schema.
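
As promised above, here is a sketch of the kind of ordered pattern list that automatic selection might use, trying each pattern in turn until one matches (the patterns are my own illustration of the mechanism, not the actual rules in the draft):

(: empty       :) *[not(@*)][not(node())]
(: empty-plus  :) *[@*][not(node())]
(: simple      :) *[not(@*)][not(*)]
(: simple-plus :) *[@*][not(*)]
(: list        :) *[not(@*)][*][not(text()[normalize-space()])]
                     [count(distinct-values(*/local-name())) eq 1]
(: record      :) *[*][not(text()[normalize-space()])]
                     [count(*) eq count(distinct-values(*/local-name()))]
(: mixed       :) *[*][text()[normalize-space()]]

Order matters: the element <x><a>23</a></x> discussed earlier matches both the "list" and the "record" patterns above, and simply takes whichever comes first.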

The detailed rules will appear in the language specification in due course; interested readers can track the evolving drafts, and can try out the experimental implementation in Saxon version 12.3; but both the specification and the implementation are, at the time of writing, subject to change.

Schema-Aware Layout Selection

An XSD schema does not provide complete information about the semantics of particular elements; for example, it does not indicate whether the order of the element’s children is significant or not. Nevertheless, the choices made in writing a schema give useful clues. And even where the choices made are imperfect, at least they will be consistent, not only for different elements in the same document, but also across documents.

The decision over which layout option to use for each element is based on the properties of the element’s type annotation. This is a property added to each element during the process of schema validation, which essentially records which schema type the element was validated against. Most of the time, all elements with a particular name will have the same type annotation, but there are exceptions, for example when the schema includes local element declarations, or when xsi:type attributes are used in the instance document.

The JSON layout is chosen as follows:

  • empty - chosen when the schema type has an empty content model and allows no attributes.

  • empty-plus - chosen when the schema type has an empty content model and allows one or more attributes.

  • simple - chosen when the schema type is a simple type (that is, no child elements or attributes are allowed).

  • simple-plus - chosen when the schema type is a complex type with simple content (that is, attributes are allowed but element children are not).

  • list - chosen when the schema type has an element-only content model allowing only one child element name, where attributes are not allowed.

  • list-plus - as with list, but attributes are allowed.

  • record - chosen when the schema type has an xs:all content model, suggesting that the order of child elements is not significant. Attributes do not affect this choice.

  • sequence - chosen when the schema type has an element-only content model allowing multiple element names, with multiple occurrences of each. Again, attributes do not affect this choice.

  • mixed - chosen when the schema type has mixed content.

Names and Namespaces

Like all the popular libraries we examined, the proposed xdm-to-json function represents an element as a name-value pair within a JSON object - often a singleton object. I considered using the SEF file mapping introduced earlier in this paper, whereby the element name is treated as if it were the value of an attribute (so <e x="1"/> becomes {"N":"e","x":"1"}), but rejected this largely because the convention of mapping element names to JSON property names seems solidly established and anything else would be considered unusual.

For namespaces I felt that it was important to retain information about namespace URIs (but not namespace prefixes). But I wanted to make namespace URIs as non-intrusive as possible. The solution adopted is that if an element is in a namespace, the corresponding JSON key is in the form "Q{uri}local", except when the element is in the same namespace as its parent, in which case the URI is omitted. For most documents this means that a namespace URI will appear only (if at all) on the outermost JSON property name.
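
For example, applying that rule (a worked example constructed from the rule as stated, not taken from the implementation’s output):

<a:doc xmlns:a="http://example.com/a">
   <a:title>T</a:title>
</a:doc>

becomes

{ "Q{http://example.com/a}doc": {
    "title": "T"
  } }

The URI is dropped from the child’s name because the child is in the same namespace as its parent.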

Appendix A. Examples with no Schema

The following examples illustrate the result of converting some simple XML inputs using the experimental implementation of the xdm-to-json function issued in SaxonJ 12.3. These examples deduce the mapping to use from a standalone XML instance, with default settings and with no help from a schema.

Table I

Example XML to JSON mappings

XML:
<e/>
JSON:
{"e":""}

XML:
<e foo="1" bar="2"/>
JSON:
{ "e":{
    "@foo": "1",
    "@bar": "2"
  } }

XML:
<e><foo/><bar/></e>
JSON:
{ "e":{
    "foo": "",
    "bar": ""
  } }

XML:
<e xmlns="foo.com"><f/></e>
JSON:
{ "Q{foo.com}e":{
    "f": ""
  } }

XML:
<e foo="1">bar</e>
JSON:
{ "e":{
    "@foo": "1",
    "#content": "bar"
  } }

XML:
<e foo="1"><f>A</f><f>B</f><f>C</f></e>
JSON:
{ "e":{
    "@foo": "1",
    "f": [
      "A",
      "B",
      "C"
    ]
  } }

XML:
<e foo="1"><f>A</f><g>B</g><f>C</f></e>
JSON:
{ "e":[
    { "@foo":"1" },
    { "f":"A" },
    { "g":"B" },
    { "f":"C" }
  ] }

XML:
<e>H<sub>2</sub>O</e>
JSON:
{ "e":[
    "H",
    { "sub":"2" },
    "O"
  ] }

Appendix B. Examples using a Schema

The following examples perform a schema-aware conversion, again using the experimental implementation of the xdm-to-json function issued in SaxonJ 12.3.

Example 1: A Complex Type with Simple Content

In these examples, the input data is validated against the schema declaration:

<xs:element name="e">
    <xs:complexType>
        <xs:simpleContent>
            <xs:extension base="xs:decimal">
                <xs:attribute name="currency" use="optional"/>
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>
</xs:element>

Table II

Schema-Aware XML to JSON mappings for a Complex Type with Simple Content

XML:
<e currency="USD">12.34</e>
JSON:
{ "e":{
    "@currency": "USD",
    "#content": 12.34
  } }

XML:
<e>12.34</e>
JSON:
{ "e":{
    "#content": 12.34
  } }

The key point here is that the mapping is consistent in the two cases. Without the schema, the second example would be converted to {"e":"12.34"}, meaning that the receiving application would need different logic to access the content depending on whether or not the attribute is present. In addition, the content is output as a JSON number rather than a string, based on the fact that the schema type is xs:decimal.

Example 2: A Complex Type with Repeated Element Content

In these examples, the input data is validated against the schema declaration:

<xs:element name="list">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="item" type="xs:string" 
                        minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

Table III

Schema-Aware XML to JSON mappings for a Complex Type with Repeated Element Content

XML:
<list>
   <item>Alpha</item>
   <item>Beta</item>
   <item>Gamma</item>
</list>
JSON:
{ "list":[
    "Alpha",
    "Beta",
    "Gamma"
  ] }

XML:
<list>
   <item>Alpha</item>
   <item>Beta</item>
</list>
JSON:
{ "list":[
    "Alpha",
    "Beta"
  ] }

XML:
<list>
   <item>Alpha</item>
</list>
JSON:
{ "list":[
    "Alpha"
  ] }

XML:
<list/>
JSON:
{ "list":[] }

The key point again is that the mapping is consistent, regardless of the number of items. Without the schema, different mappings would be chosen in the case where the list is empty or contains a single item; with a schema, the same mapping is chosen in each case. In addition, note that a mapping has been chosen that drops the element name for the child elements (because it is entirely predictable), and that leaves no room for attributes at either the list level or the item level (because neither element can have attributes).

Appendix C. Achieving the SEF mapping in XSLT 4.0

At the start of this paper I presented the XML to JSON mapping used by Saxon’s SEF file format. This is a good example of a custom mapping, illustrating that if you know exactly what JSON format you want, an off-the-shelf converter is unlikely to deliver it.

The SEF mapping translates an element name into a property, so <e a="1"><f a="2"/></e> becomes {"N":"e", "a":"1", "C":[{"N":"f","a":"2","C":[]}]}. Here "N" is short for "Name", and "C" is short for "Content" or "Children", whichever you prefer.

This mapping cannot be achieved using the proposed xdm-to-json function. Instead, it can be achieved with the following XSLT 4.0 code:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="4.0">
  <xsl:output method="json"/>

  <xsl:template match="*">
    <xsl:map>
      <!-- the element name becomes the "N" property -->
      <xsl:map-entry key="'N'" select="local-name()"/>
      <!-- each attribute becomes a string-valued property -->
      <xsl:for-each select="@*">
        <xsl:map-entry key="local-name()" select="string(.)"/>
      </xsl:for-each>
      <!-- the child elements become an array in the "C" property -->
      <xsl:map-entry key="'C'">
        <xsl:array>
          <xsl:apply-templates select="*"/>
        </xsl:array>
      </xsl:map-entry>
    </xsl:map>
  </xsl:template>
</xsl:stylesheet>

It may be worth mentioning something we learned from working with this JSON format: if you want to produce human-readable JSON, you need to be able to control the order in which map properties are serialized. Specifically, the result is unreadable unless the deeply-nested C property comes last. We introduced an extension to the specification for this purpose, the saxon:property-order serialization parameter. At the time of writing, this has not yet found its way into the draft XSLT 4.0 specification.
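
For illustration, the parameter can be set on xsl:output; a sketch of its use (this is a Saxon extension, so consult the current Saxon documentation for the exact syntax):

<!-- properties named first are serialized first; "*" stands for any
     properties not named; putting C last keeps the output readable -->
<xsl:output method="json" xmlns:saxon="http://saxon.sf.net/"
            saxon:property-order="N * C"/>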

References

[Goessner 2006] Stefan Goessner. Converting Between XML and JSON. https://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html. May 31, 2006.

[Kay 2013] Michael Kay. The FtanML Markup Language. Presented at Balisage: The Markup Conference 2013, Montréal, Canada, August 6 - 9, 2013. In Proceedings of Balisage: The Markup Conference 2013. Balisage Series on Markup Technologies, vol. 10 (2013). doi:https://doi.org/10.4242/BalisageVol10.Kay01.

[Kay 2016] Michael Kay. Transforming JSON using XSLT 3.0. XML Prague, 2016. http://archive.xmlprague.cz/2016/files/xmlprague-2016-proceedings.pdf.

[Kay 2022] Michael Kay. XSLT Extensions for JSON Processing. Presented at Balisage: The Markup Conference 2022, Washington, DC, August 1 - 5, 2022. In Proceedings of Balisage: The Markup Conference 2022. Balisage Series on Markup Technologies, vol. 27 (2022). doi:https://doi.org/10.4242/BalisageVol27.Kay01.



[1] See https://qt4cg.org/pr/529/xpath-functions-40/autodiff.html; but this link is not likely to persist. For the current location of the draft specifications, visit https://qt4cg.org/ or search online for "XSLT 4.0".

[2] I have tried to consistently use "I" for my personal opinions, ideas and rationales, and "we" for work done as part of a team. If I have unduly claimed credit for ideas originated by others, or conversely, if I have implied a group consensus when there is none, I apologize.

[3] I’m using "XSLT 4.0" throughout as a convenient shorthand for the family of specifications including XPath and XQuery 4.0, and the associated documents defining the data model, the function library, and the serialization module. Most of the features described will be available across the whole family.

[4] Saxonica offers an XSLT processor, SaxonJS, that is written in Javascript and runs in the browser. SaxonJS allows client-side execution of stylesheets that have been pre-compiled on the web server.

[5] The mapping was partly inspired by earlier work on FtanML, a markup language designed to combine the best features of XML and JSON: see [Kay 2013].

Michael Kay

Founder and Director

Saxonica

Michael Kay is the lead developer of the Saxon XSLT and XQuery processor, and was the editor of the XSLT 2.0 and 3.0 specifications. More recently he has been instrumental in establishing a W3C community group to create 4.0 versions of the XSLT, XQuery, and XPath languages. His company, Saxonica, was founded in 2004 and continues the development of products implementing these specifications. He is based in Reading, UK.