How to cite this paper
Sperberg-McQueen, C. M. “Visible / invisible.” Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August 2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). https://doi.org/10.4242/BalisageVol29.Sperberg-McQueen02.
Balisage: The Markup Conference 2024
July 29 - August 2, 2024
Balisage Paper: Visible / invisible
C. M. Sperberg-McQueen
Founder and principal
Black Mesa Technologies LLC
C. M. Sperberg-McQueen is the founder and principal of
Black Mesa Technologies, a consultancy specializing in helping
people use XML technologies.
He served as editor in chief of the TEI Guidelines from
1988 to 2000, and has also served as co-editor of the World
Wide Web Consortium’s XML 1.0 and XML Schema 1.1
specifications.
Copyright © 2024 by the author.
Abstract
Invisible XML seems to be attracting attention. What can we learn from it?
Table of Contents
- Introduction
- Invisibility as insignificance
  - Broun
  - Trough of disappointment?
  - Will someone eat our lunch?
  - Is markup unnecessary, because it can be automated?
  - Segregation
- Invisibility as perfect infrastructure
  - Tunnels
  - Infrastructure
  - Unbundling
- Invisibility as intangibility
  - Power relations (what they don’t want you to know)
  - Other absences
  - Making things available for manipulation
  - Seeing new things (visualization)
Introduction
Thank you all for coming. I have a request. Normally it doesn’t really matter to
me whether people have their video turned on during this closing or not, but today
I have the luxury of two screens, so I can look at my notes on one screen and see
you on the other. So if your bandwidth will support it and you are willing to be
seen, thank you.
As Jim [Mason] has hinted, it is my habit to try to talk about the conference that
has just ended. This year seems very clearly to be a year marked by invisible XML. Paper after
paper has talked about it, so it is perhaps not surprising that I have spent a lot
of time in the last few months thinking about invisibility and visibility.
Invisibility as insignificance
Broun
And a nagging thought kept coming to me about a book I once read about which, eventually,
I was able to remember enough so that I could find it and reread it. In 1919, an
American author named Heywood Broun wrote a children’s book. Heywood Broun was mostly
a sports writer, but he wrote at least one children’s book, The Fifty-first Dragon [Broun]. And the book is about a large, robust, but not over-bright boy named Gawaine le
Coeur-Hardy whose teachers are really not quite sure what to do with him and whose
headmaster eventually decides that maybe Gawaine could make a dragon slayer. So they
train Gawaine in the technique of slaying dragons. And the night before Gawaine is
to go out and face his first dragon, he has an interview with the headmaster in which
the headmaster, to calm his nerves, says, “I will give you something that will protect you.”
And Gawaine says, “I’d like a magic cap.”
And the headmaster said, “What? A magic cap? What’s that?”
“A cap to make me invisible.”
And the headmaster says, “A cap to make you invisible. What would you do with it? Boy, you could walk from
here to London, and no one would so much as look at you. You couldn’t be more invisible than that.”
That’s an important kind of invisibility; it’s the invisibility of insignificance.
You’re invisible because you don’t make any difference.
Trough of disappointment?
And I sometimes suspect that I may not be the only one who, at dark moments in my
life, occasionally worries about the diagram of the Gartner Hype Cycle that Michael
Kay talked about on Sunday in the pre-conference talk [Kay 2024], in which you start with a rising curve of hype and excitement about a new technology
in which people develop and encourage inflated expectations, and then you have a trough
that follows when people realize that the inflated expectations will not be met and
things have been over-promised, and so there’s disappointment. And Michael [Kay]
described a curve in which eventually the technology is applied in places where it’s
appropriate and it’s useful and used, and it gradually recovers, so its notional Y
value eventually rises again. But you know, not every technology does that. Some
just crash and disappear.
Will someone eat our lunch?
And in dark moments when I near despair, I sometimes worry, “Are we invisible because we’re in that trough, and we’ll never get out of it?”
Well, obviously, it’s not a dead technology; it’s a living and useful technology.
All you have to do is look at the talks — think about the talks you’ve heard here.
But, you know, we can temper that with the observation that there are lots of technologies
that are effectively dead that are still in use. There are working ALGOL 68 compilers,
and there are, therefore, people who currently are writing ALGOL 68 programs. But
I don’t think anyone expects ALGOL 68 to have a long future life or to make more difference
to the world of IT than it already has. And in those moments of worry, I sometimes
also worry, well, as the subtitle to Alex Miłowski’s talk suggested [Miłowski 2024], will someone steal our lunch?
Will people be able to use GQL to do all of the things that we can do with XSLT
and XQuery and more? Will GQL kill XQuery and XSLT?
Well, having experience of all the things I can do with XQuery and XSLT, I kinda doubt
it. But I also note that, if we are able to do with GQL everything we can do with
XQuery and XSLT and more, then those of us who, years ago, got the religion of application-independent
data representations will be in a much better place to exploit those capabilities
of GQL than people who locked their data into specific applications which no longer
run or into data with no established serialization format. It’s not enough to have
a good, coherent data model. The relational model is great and very clearly — very
explicitly — described, but when the database people showed up in the XML Working
Groups in the late 1990s, it wasn’t because XML was being widely hyped (although it
was at the time). They showed up because it met a need that they had: there is no
standard form for exchanging data exported from SQL databases. And I said, “Wait a second. What about comma-separated values? Don’t they work?”
And they said, “They don’t have a character encoding declaration. We need XML because problems loading
CSV files are the single greatest cost to our customer support, and the usual cause
of problems is character set issues.”
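The character-set point can be made concrete with a small sketch. The sample data and variable names below are invented for illustration, and the code assumes nothing beyond Python's standard library: the same Latin-1 bytes are handed to a receiver once as CSV, which has no place to state its encoding, and once as XML, which does.

```python
import xml.etree.ElementTree as ET

# Hypothetical export from a system that writes ISO-8859-1 (Latin-1).
raw_csv = "id,city\n1,Zürich\n".encode("iso-8859-1")

# Nothing in the CSV bytes names their encoding. A receiver that guesses
# UTF-8 fails outright on this input; with other byte sequences a wrong
# guess can garble the text silently instead.
try:
    raw_csv.decode("utf-8")
    csv_guess_ok = True
except UnicodeDecodeError:
    csv_guess_ok = False

# The same value as XML: the declaration states the encoding, so the
# parser does not have to guess.
raw_xml = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<row id="1"><city>Zürich</city></row>'
).encode("iso-8859-1")
city = ET.fromstring(raw_xml).findtext("city")

print(csv_guess_ok, city)
```

The XML version costs a few more bytes, but the information the receiver needs travels with the data rather than in someone's head.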
Is markup unnecessary, because it can be automated?
So maybe that’s not going to be an issue. But, then, as long as I’m worrying, maybe
markup will become unnecessary? Will markup become unnecessary because it can be
automated and AI can take care of it all? Paul Prescod and Phill Tornroth showed
us a system that transcribes a medical interview and then generates a report based
on it [Prescod and Tornroth 2024]. And no human hands touch it until the physician reviews and edits the report.
That’s remarkably close to what an enthusiast of AI — a researcher in AI — once told
me was the reason that markup was irrelevant — AI can handle it all. Not today, certainly
not when we had that conversation which was 10-15 years ago, but soon, and there will
be no need for markup. Well, I think he was wrong for reasons I’ll get to in a moment,
but maybe he’s right. Steve DeRose showed us a way to use AI to produce markup [DeRose 2024]. Uche [Ogbuji] showed us ways to make LLMs useful and try to limit their hallucination
by giving toolboxes to the parrots [Ogbuji 2024].
But I don’t think AIs are likely to make markup irrelevant for a couple of reasons.
Maybe what will happen with AIs is what has happened with automatic translation.
Automatic translation, as you will have noticed if you’ve ever used it for a text
that you understand and care about, is not really ready for prime time. And there
is no organization that I’m aware of that actually relies on having documents — especially
normative legal documents — available in multiple languages that would dare rely on
automatic translation for the final form of those documents. They all use it for
the first draft because it’s a lot faster to have an automatic translation program
produce the first draft and then have a translator edit it than it is to have the
translator type the thing in from scratch and then edit it. So maybe AIs will produce
the first draft of markup, and humans will tweak it and improve it. And then markup
will serve as a communication medium between the automatic pass and the human pass,
and it will serve as a record for the hand-improved result.
Or maybe AIs will continue to improve, and they’ll be able to do markup perfectly.
And in that case, markup will still serve as a communications medium between software
and human beings, or between organizations, or between us and the future. And markup
will be an optimization because, yes, markup inflates the size of many texts and I/O
is slow and expensive, but I/O is not nearly as expensive as buying a new power station
to allow an AI to re-recognize markup that it already recognized last week. Doing
markup by AI is going to be computationally expensive because doing anything with
an AI is computationally expensive, and it will make sense to cache the results.
Also, we have reason to believe that markup will make LLMs perform better; Patrick
Durusau gave that as a conjecture yesterday during the Open Mic
session [Durusau 2024]. And the other day, Jean Paoli and Greg Renard made it as an observation: you
load data, stripping out the markup, and ask a question; you load the same data with
the markup, ask the same question, and get a dramatically better answer [Paoli et al. 2024].
So even in my dark moments, I don’t really fear that AI is going to eat my lunch
although I am extremely glad that there are people like Paul Prescod and his colleagues,
and Steve DeRose, and Uche [Ogbuji], who are trying to make sure that we have ways
of keeping the AIs honest, as it were, and checking their work and making sure that
we are not required to take it on faith.
Segregation
If XML sometimes seems invisible to the larger IT community, maybe there’s another
reason. Maybe it’s because sometimes we behave like a rather stand-offish group.
As Joe Gollner put it at yesterday’s Open Mic
session [Gollner 2024], Silicon Valley bros said in a meeting, ‘Well, we hate XML.’
And he said, “It’s okay for you to hate XML because XML hates you, too.”
And sometimes we act that way: we act as though we really don’t want to have anything
to do with non-XML data. And we will become more visible in the general IT community
if we engage more effectively with non-XML data, and that is part of the core story
of invisible XML as demonstrated by several talks. Joseph Courtney and Michael Gryk
talking about how to use iXML to parse magnetic resonance data scripts [Courtney and Gryk 2024]. Ari Nordström talking about how to use iXML to parse and edit and then re-serialize
S2000M messages in the aerospace industry [Nordström 2024]. Joe Gollner in his talk the other day showing examples of how XML can play a role
even in a larger production where there are many roles to be played and there are
many other actors, not just XML [Gollner 2024].
We have to get good at this because, otherwise, that quip about XML hating you too
will be a little too true to be comfortable. So I’m very glad that we have people
like Norm Tovey-Walsh and Debbie Lockett who showed us the other day in their sponsor
presentation on SaxonJS not only major, new functionality but dramatic improvements
to the foreign function interface [Tovey-Walsh and Lockett 2024]. Now, if you’re like me, you have seen the term foreign function interface before, and your eyes have always glazed over just a little bit, ’cause it has never
seemed particularly relevant. And it’s not particularly relevant to me because I
don’t tend to write multilingual applications; I tend to like being able to write
an application in a single language — either XSLT or XQuery — but I am not typical
of the IT industry as a whole in that way. And if we want to be visible and play
a role in the IT industry in general, we need to play better with others, and foreign
function interfaces are the way that that happens. They are the ground rules of the
kindergarten playground.
We’re not a church, even though we may get religion
and we may experience some kinds of euphoria that resemble those of religious conversion,
so we need to deal calmly and constructively with the fact that not everyone uses
XML. And to do that, it’s important not just to be able to move non-XML data into
XML so that we can use the XML stack, it is also important to be able to move it back. Now, several of the papers we saw this week have used manually constructed re-serializations,
and that’s always possible: XSLT is great for that; so is XQuery. But it would be
really nice to be able to automate it or automatically detect cases where it can be
automated and where it can’t, and for that, the work that John Lumley was talking
about is important [Lumley 2024], and the work that Steven Pemberton reported (I cannot remember now whether
it was at Prague or Amsterdam) on round-tripping is important [Pemberton 2024].
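A minimal sketch of the round trip, under stated assumptions: the regular expression below is a hand-written stand-in for an iXML processor (no real iXML implementation is used), the date format and function names are invented for illustration, and the re-serialization works only because every input character is kept somewhere in the XML. When a grammar discards characters, this simple approach no longer recovers the input, which is where automatic detection becomes interesting.

```python
import re
import xml.etree.ElementTree as ET

DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

def to_xml(text: str) -> ET.Element:
    """Parse an ISO-style date into XML, keeping every input character."""
    match = DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a date: {text!r}")
    date = ET.Element("date")
    names = ("year", "month", "day")
    separators = ("-", "-", "")
    for name, value, sep in zip(names, match.groups(), separators):
        part = ET.SubElement(date, name)
        part.text = value
        part.tail = sep  # the separator survives as text after the element
    return date

def to_text(element: ET.Element) -> str:
    """Re-serialize by concatenating all text nodes in document order."""
    return "".join(element.itertext())

original = "2024-08-02"
tree = to_xml(original)
xml_form = ET.tostring(tree, encoding="unicode")
round_tripped = to_text(tree)

# Edit with ordinary XML tools, then move the data back out of XML.
tree.find("month").text = "07"
edited = to_text(tree)

print(xml_form, round_tripped, edited)
```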
Invisibility as perfect infrastructure
But there’s another kind of invisibility that has a slightly more positive valence,
and I want to talk about that. Really great athletes, it is often said, are characterized
by the fact that they make things look easy. Really great dancers are characterized
by the fact that they make it look natural, what they do. Typography, at least in
some schools of thought, is held to be best when it is least visible, or at least
non-irritating, as Tony Graham put it the other day [Graham 2024]. And a lot of people say about any infrastructure that they like: “It just works.
I don’t have to think about it; it just works.” Tony Graham talked about it. Mary
Holstege showed an example of making an infrastructure that will just work, by experimenting
with the opportunities offered by both your back-end library and by language construction
and by iXML as a tool for mediating between the two [Holstege 2024].
Tunnels
Perhaps the best example that I know of this kind of invisibility (and one of a certain
amount of symbolic importance, I guess, for the markup community) is the tunnels
under Disney World. Not everyone is aware of this, but beneath Disney World in Florida,
there is a network of tunnels, standardized size, large enough for electric vehicles
to pass each other in lanes, well-signed, and invisible to most visitors to the park.
It’s of symbolic importance to the markup community in part because those of you who
have been here long enough will have heard Old Timers talk about the time that Yuri
Rubinsky talked about the tunnels under Disneyland. So it’s part of our lore.
It is said that Walt Disney got the idea for tunnels when he was inspecting Disneyland
in Anaheim and he was offended to see a man in a buckskin jacket with fringes and
a coonskin hat and so on, walking through the streets of Tomorrowland, where he had
no business. And I’m not sure whether I’ve made this part of the story up, but the
way I think of it is he talks to the guy and says, “What are you doing here?”,
and he says, “My shift starts in Frontierland in five minutes; this is the only way to get there.”
And so Walt Disney conceives the idea that he wanted the guy from Frontierland to
be able to get to his shift without being seen, so he needed an invisible tunnel.
Well, too late for Disneyland, but they were about to build Disney World, so he explains
to the engineers and the architects, “I want a network of tunnels underneath the park.”
And they said, “Sorry, boss, not gonna happen: the water table here is about two feet deep, and there’s
really no way to keep tunnels dry. We can’t have tunnels.”
Well, they do have tunnels. The way they have tunnels is they built the tunnels at
ground level and then they filled in dirt over them, the spoils from the lakes they
were digging and other water features. So the park — I have read — is twelve or fourteen
feet above the natural ground level at that area, so the tunnels remain dry. And
they’re standardized. They provide services: there’s always a garbage chute. Because
one of the things that’s invisible in Disney World at the surface level is garbage
collection. There are no garbage trucks; there are no over-full garbage bins waiting
to be emptied because there are no conventional garbage bins at all. There are little
slots through which you can toss garbage, and it goes into a pneumatic tube and it’s
carried to a collection point some miles away. There is standardized signage in these
tunnels so that people who aren’t terribly familiar with them can get to where they’re
going: they can reach Frontierland just by following the right colors. And there’s
medical attention and staff and all of the other kinds of infrastructure that you
need when you have that many people in that small a space over an extended period
and don’t want them to be distracted from the things you want to call to their attention.
So, invisibility can be good; it can be a sign of perfection. I think of it as a
kind of instantiation of the maxim you sometimes hear, that there is no limit to what
you can achieve if you don’t mind other people getting the credit.
Infrastructure
And if you want the kind of infrastructure about which people say “It just works,”
then you need to make it reliable. So you need people like Amanda Galtman working
to maintain testing infrastructures like XSpec — working to explain to those of us
who are not quite as ingenious how to use constructs that we wouldn’t have thought
of using to make tests that we would otherwise not have been able to make [Galtman 2024]. We need people like Mark Gross building industrial-scale systems to deal with
hundreds of millions of legal citations and to ensure the quality of large collections of XML data [Gross 2024]. And performance is crucial, so we need people like Alan Paxton and Adam Retter
building test beds to allow customers to evaluate XML databases and allow developers
of XML databases to test and improve their own performance [Paxton and Retter 2024].
Unbundling
On Monday, Tommie [Usdin] talked about the virtues of unbundling things — breaking
up bundles and selling the components [Usdin 2024]. And, of course, if you take something that’s been bundled and you break it up
into its pieces, then one of the immediate consequences is that, in principle at least,
it becomes possible to pick and choose. So I can pick these components from what
used to be this bundle and that component from that bundle. That works, of course,
only if they will interoperate — only if the corresponding pieces are interchangeable.
So there’s a certain irony here, I think. You have to expose the boundaries. You
have to make the boundaries between the parts of what was the bundle visible and standardize
them so that they can be used as the interface between what are now independent components.
That turns out to be more powerful and more useful than it may seem at first glance.
I don’t know whether anybody predicted that a specification whose grammatical requirements
can be summarized as: “You delimit every tag with delimiters so that it’s easy to tell the difference between
tags and content, and you delimit every element with tags so that you can see where
the element begins and the element ends.” And that’s basically all of the syntactic
rules.
I don’t know if anybody predicted that that would be a large enough — a wide enough
— space to stand to support any infrastructure at all. I would have thought, That’s almost vacuous, right?
But on the basis of that really thin kind of standardization, it turns out to be
possible to build information retrieval systems, document display systems, editors,
and all sorts of things without people having to agree on what the tags mean or anything
like that, so that we have a really surpri… what to me was, at least, a very surprising
combination of the ability to build a common infrastructure while still allowing different
people to be interested in and working with very different models of the world and
models of their data.
There’s also a certain irony in Tommie’s [Usdin] urging us to unbundle things because
it turns out that people don’t always want that [Usdin 2024]. At least in some cultures, and for some variants of human psychology, people want
a simple, careless experience. I want to pay the money and then not worry anymore.
I would really like to buy a ticket and have a seat and be able to check a bag and
be greeted at the gate by someone friendly and so forth, and the ability of the
airlines to break those all out and charge separately for each one — well, even if
the total cost is not higher, I find that I tend to resent feeling as if I’m being
nickel-and-dimed to death, and other people apparently feel the same way.
I was once at a web services conference, and I asked the attendees, “How many of you use a server from one vendor and client software developed from another
vendor?”
And there may or may not have been one or two hands. But almost everybody in the
room used server software and client software developed with the same toolkit,
which meant not that they were in danger of being locked in, but that they had already
been locked in by their vendors. The standardization of web services was almost completely
irrelevant to them.
If you pump gas at a convenience store in North America, I am told that the chances
that XML is going through the cables underneath your feet are directly proportional
to the likelihood that the cash register and gas pump were sold to the store by different
vendors. If they’re from the same vendor, then what’s passing through the cables
underneath your feet are proprietary messages, and they may or may not be XML. If
they’re from different vendors, what’s going through is XML, standardized by the Petroleum
Convenience Alliance for Technology Standards (PCATS). And when people talk about schemas
as contracts, in that environment they are not speaking metaphorically, because when
the cash register throws an error on a message it receives from the gas pump, the
validity of that message against the schema is what determines whose field engineer
gets to go out and figure out what the problem is and fix it.
Invisibility as intangibility
Now, one consequence of our saying, of infrastructure that we like, that it just works
and it’s invisible is that we tend to think of infrastructure as invisible. Or rather,
we tend not to think about infrastructure at all. Karl Marx had a term for this;
he called it Verdinglichung, which is usually rendered in English as reification, which I find confusing for reasons I’ll get back to [Wikipedia, “Reification (Marxism)”]. It means: regarding a state of affairs as just the way things are. It’s just a thing; it has a nature. We cannot change the nature of that any more
than we can change the nature of this rock or that tree. It is what it is.
Power relations (what they don’t want you to know)
Marx notes that this point of view is often encouraged by the powerful as a way of
thinking by the powerless, as a way of preventing the powerless from thinking about
social relations as something that could be different — something that could be changed.
So it’s a way of protecting a power advantage.
Now, the power relations between programmers and users — especially data users or
data owners — are perhaps not as dramatic and maybe not as clear as the relations
of economic power that Marx was interested in. But I find it very difficult to hear
programmers say, “Oh, we don’t really like XML”
without seeing a power dynamic at work because, of course, programmers gain power
from non-reusability of data. If my program is the only way that you can interact
with the data, then you’re dependent on my program; I’m one up. And if the data representation
is standardized so that other programs can also be used on it, then I’m one down,
as a programmer. I have less control. Also, if I don’t expose my data representations,
then I have sole control of them, and nobody can second guess me, which is appealing
for psychological as well as power reasons. Programmers gain from non-reusability
and non-standard data representations; users gain from exposed and standardized data
representations, and that is one reason, I think, that declarative markup and descriptive
markup so frequently feel to users like tools of liberation — like tools of enlightenment.
Now, the invisibility of things is not always so obvious. Hidden bias is hard to
see precisely because it’s hidden. What that means is, if we enter a world in which
we rely on programs, but there is no way to easily trace a direct line between a program
structure and its data, on the one hand, and its behavior, on the other, then any
bias in the construction of the program, or in the construction of the program’s data
store, is going to be very difficult to detect. I do not have enough linear algebra
to understand the bias involved in the data representations of any large language
model, and as far as I can tell, neither does anybody else, including the people who
build the large language models.
So this effect is very real. Things we don’t want to look at sometimes become effectively
invisible. We may sometimes be able to detect them indirectly. Take, as an example,
orchestra musicians. Symphony musicians in North America are typically well-educated,
open to the world, friendly, not the kind of people we typically associate with ethnic
or gender bias. And so there was no particular reason to suspect that North American
symphony orchestras suffered from, or exercised, bias in hiring musicians. After
all, we hire musicians for the sound they make, not for the way they look or anything
else. But some years ago, orchestras changed the way they ran auditions by putting
a screen in the room with the player auditioning for a position with the orchestra
on one side of the screen, and the players’ committee charged with choosing whom to
hire on the other side of the screen. So they could hear the auditioner playing,
but they couldn’t see the job applicant. And although there was no reason to believe
that orchestras suffered from hiring bias, when that happened, suddenly, you started
seeing female French horn players. Suddenly, you started seeing Black percussionists.
Now, I am sure that there is still a way to go before orchestras reach perfect equity
because different cultural backgrounds lead to tensions that cause problems and so
forth, but introducing that screen allowed orchestras to detect that they had been
suffering from bias and allowed them a way, partially at least, to correct that bias.
What I wonder is: Where can we find the screens we need to help us fight biases that
we don’t yet know we have, or even the biases that we suspect we might have, just
because we see them elsewhere in the world? It’s not hard to look at the screen in
front of you and see that by IT standards we have medium-good representation of women
in our markup community, and, of course, the star power of people like Tommie Usdin
and Debbie Lapeyre and Mary Holstege and Bethan Tovey-Walsh and so on may make us
think, “Well, women are fully equal in this community.”
But purely statistically, count the faces and names in front of you, and you will
see that we are some ways away from 50/50 gender equity.
And for other kinds of diversity (ethnic, cultural, neurodivergent), it is not hard
to see — it is hard not to see — that we are very, very far from achieving the kind of diversity and balance
that we ought to have. We would like our communities, our conferences, and our working
groups to be welcoming to everybody, and I’ve never seen anybody engaged in any overt
acts that I interpreted as being intended in a hostile way. But we can see from who’s
there and who’s not there that not everybody feels welcome. I have no idea what to
do about this. There is one thing, I think, we should work at not doing though, and that is pretending that we don’t have a problem. We have to think
about it; we have to work on it.
Other absences
And there are other people who aren’t here today, who are invisible on our screens:
people who have been part of the community of those working on descriptive markup and
working to make the web a better place for years, whom we have lost. We lost a number
of people this year. I won’t attempt to list them, but I think it’s important that
we remember them. Fortunately, although their faces may be invisible in the Zoom
window, we can see their traces in the technology all around us.
Making things available for manipulation
I don’t know about Marx, but Marxists tend, in my experience, to attribute every kind
of reification in the Marxian sense (every kind of Verdinglichung) to a willful attempt by someone to hide things from other people. I’m not sure
that’s always true; sometimes I think it’s just hard to see things. And exposing
them — making them visible — makes them available for us to think about them, whether
they were consciously and intentionally hidden before or not. So, Liam Quin’s work
making Document Type Definitions accessible in XML is a perfect example of the kind
of enlightenment that we can achieve by making things seeable [Quin 2024].
Now, one of the reasons they haven’t been visible in the past is, I think, just the
timing issue. I think there may have been a very small window during which it would
have been possible to create an XML representation of Document Type Definitions.
Sorry, I got into that a little early; I need to say, they haven’t been visible because
they’re not exposed in the SAX and DOM interfaces. And the reason they’re not exposed
in the SAX and DOM interfaces, it says here, is it’s not obvious exactly how you would
want them to be exposed.
Document Type Definitions are different, and you would have had to have a whole new branch of the DOM and a
whole new branch of SAX, just the way the grammar of XML doubles in size when you
start talking about DTDs (as did the grammar of SGML). But if you had an XML representation
for Document Type Definitions as Liam [Quin] has provided and other people have provided
over time, then you have an obvious way to expose that information. It’s structured
information, and you expose it the way you expose any structured information: you
have XML elements and attributes; you expose them. And then you can argue about what the element names should be and what the structures
should be. But your APIs become simpler.
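A minimal sketch of what that buys you. The element and attribute names below (`doctype`, `element`, `attribute`, `model`, `default`) are invented here for illustration and are not the vocabulary of any particular proposal; the point is only that once declarations are ordinary XML, an ordinary XML API (here Python's standard `xml.etree.ElementTree`) is all you need to ask questions about them.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML rendering of these declarations:
#   <!ELEMENT book (title, chapter+)>
#   <!ATTLIST book id ID #REQUIRED>
#   <!ELEMENT title (#PCDATA)>
#   <!ELEMENT chapter (#PCDATA)>
DTD_AS_XML = """
<doctype>
  <element name="book" model="(title, chapter+)">
    <attribute name="id" type="ID" default="#REQUIRED"/>
  </element>
  <element name="title" model="(#PCDATA)"/>
  <element name="chapter" model="(#PCDATA)"/>
</doctype>
"""

doctype = ET.fromstring(DTD_AS_XML)

# Which element types are declared?
declared = [decl.get("name") for decl in doctype.findall("element")]

# Which attributes are required on each element type?
required = {
    decl.get("name"): [
        att.get("name")
        for att in decl.findall("attribute[@default='#REQUIRED']")
    ]
    for decl in doctype.findall("element")
}

print(declared)
print(required)
```

No separate branch of the API is involved: the same `findall` calls that work on any document work on the declarations.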
If we had done that right at the beginning, they might have been built into SAX and
DOM. But right at the beginning, although there were proposals for XML representations
of DTDs, they were — how can I say this politely? — politically divisive. And the
chair of the XML Working Group, Jon Bosak, made the Solomon-like decision that he
was going to try to avoid having the entire thing explode by not putting that question
to the Working Group. He didn’t give that to the Working Group as a choice, so, by
default, we continued with the existing syntax.
And by the time the XML Schema Working Group was formed a year or so later, it was
probably a little too late because at one of the early meetings of the Schema Working
Group, we asked, trying desperately to limit the scope of the Working Group so that
it would finish its work in the foreseeable future: “One possibility would be just an XML version of DTDs. Another would be that, plus
object-oriented inheritance and other stuff. And another would be something else
even more ambitious. Who would be happy with just an XML version of DTDs as our first
deliverable? We can do the others later.”
And there may have been two or three people who raised their hands. The problem
is there were fifty or sixty people in the room, so three people was really not going
to do it. And it didn’t happen, so DTDs were not exposed in the APIs. So they have
been invisible, but Liam [Quin] has shown us a way to make them visible [Quin 2024].
And the same thing is true of John Lumley’s emphasis on the fact that since there
is an iXML grammar for iXML grammars, we can work on iXML grammars in XML [Lumley 2024]. We can read an iXML grammar, and we can apply XML processing to do things with
it: to display it; to modify it; to make it define a related, but different, language;
and so on. We can achieve greater clarity just by making things visible.
Now, sometimes greater clarity is just as simple as dividing things in two. Syd Bauman
talked to us today about why two XPaths are better than one, and a lot of what he
said seems to me to have to do with the fact that having two XPaths allows you to
think of one as setting the context and the other as dealing with details [Bauman 2024]. Or, you know, it’s the setup, and then the serve. Or, however you want to
describe it, dividing it into two pieces gives you a point of purchase at which you
can help control the complexity of what you’re trying to do. And controlling complexity
and understanding better what we’re doing is extremely useful.
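The general shape of that division can be sketched as follows. This is only an illustration of the split between a context path and a detail path, not a reconstruction of Syd Bauman's technique. It uses the limited XPath subset supported by Python's `xml.etree.ElementTree`, and the sample document and the function name `extract` are assumptions.

```python
import xml.etree.ElementTree as ET

# Sample document invented for this sketch.
DOC = (
    '<doc>'
    '<section id="s1"><title>Intro</title><p>a</p></section>'
    '<section id="s2"><title>Body</title></section>'
    '</doc>'
)


def extract(root, context_path, detail_path):
    """Apply one path to choose context nodes, a second to pull out details.

    The first path answers "which things are we talking about?";
    the second, evaluated relative to each of those, answers
    "what do we want to know about each one?".
    """
    return [
        (ctx.get("id"), ctx.findtext(detail_path))
        for ctx in root.findall(context_path)
    ]


root = ET.fromstring(DOC)
print(extract(root, ".//section", "title"))
```

Each path stays short, and either one can be changed without touching the other.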
Seeing new things (visualization)
I tend to conflate liberation with enlightenment, partly, I guess, because Kant defined
enlightenment as the exit of the human being out of a self-imposed minority or tutelage
— dependency [Kant]. Becoming self-responsible — becoming responsible for one’s self — entails being
self-aware, and that requires knowledge. So, anything that helps us improve our understanding
seems important to me, like the work reported by Allen Renear the other day on deviant
causal chains [Wei and Renear 2024]. If you’re like me, you may have found that large parts of that talk went right
over your head; it seemed very rarified. Logic chopping is the affectionate term that philosophers use for that style of analysis, and logic chopping is really important if you want to understand a concept. You know, it’s simpler
and more comfortable if we just say, “X knows Y,”
and we all know what “knows” means. It becomes, however, a richer concept if we say, “No, knowledge can be divided into parts; there are sub-properties that we can attribute
in cases of knowledge.”
And of course, when you do that, you may find, as Allen [Renear] illustrated, that
your analysis isn’t quite right, and clarifying that can feel tedious if clarifying
it is not on the short path between you and your immediate goals. But I think it’s
important because clarity is the basis of our progress.
Of course, sometimes the easiest way to make something visible is to visualize it.
And sometimes things are in principle already visible, but the patterns in them are
hard to perceive. You will have noticed that this tie has characters on it. If you’re
really perceptive, you will have noticed that this is an encoding of an XML declaration.
And if you are really, really attentive, you’ll know that there’s a typo, but not
everybody sees that at first glance. And sometimes a good way to make patterns visible
is to transpose the information in which they’re embedded into a different form, as
illustrated by Bethan Tovey-Walsh today in her talk about crocheting as a way of representing
patterns in a series of arbitrary input data [Tovey-Walsh 2024].
And the same is true of the pair of talks with which we began the conference after
Tommie’s [Usdin] introduction on Monday morning [Usdin 2024]. In literary studies and indeed in all text-based historical studies, there is
a huge invisibility. Textual variation — one of the most basic facts of texts — is
virtually invisible in literary criticism, or philosophy, or historical work with
documents. Main-line scholars really don’t like to think about it because they feel
the ground shifting under their feet. And so they don’t think about it, and they
don’t talk about it.
Textual editors work to make the variation visible by putting textual apparatus into
their editions, but of course, other editors immediately come along and make other
editions in which the variation is hidden by omitting the apparatus or by banishing
it to the back of the book where no one will consult it. And textual editing itself
is invisible, in the sense of being under-appreciated. Editors don’t get credit.
They don’t get tenure. There are all sorts of problems for textual editors.
And that is why I think the work that Ronald Haentjens Dekker and David Birnbaum reported
on Monday is of tremendous import for all of us who care about the preservation of
our cultural heritage [Birnbaum and Haentjens Dekker 2024]. How can we make textual variation and the patterns it makes easier to see, easier
to name, easier to think about, and easier to work with? They have a dragon well
worth slaying, and they didn’t need an invisibility cap. What they needed and what
they had was a way to make things visible. Thank you all for Balisage 2024. Come back next year.
References
[Bauman 2024] Bauman, Syd. Two Paths are Better than One.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Bauman01.
[Birnbaum and Haentjens Dekker 2024] Birnbaum, David J., and Ronald Haentjens Dekker. Visualizing textual collation: Exploring structured representations of textual alignment.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Birnbaum01.
[Broun] Broun, Heywood. The Fifty-first Dragon. Reprint. Englewood Cliffs, NJ: Prentice-Hall, 1968; Mankato, MN: Creative Education,
1985 [Acknowledgement: Heywood Hale Broun for The Fifty-first Dragon,
by Heywood Broun; adapted from The Collected Edition of Heywood Broun, © 1921, 1941, by Heywood Hale Broun. Used by permission of Bill Cooper Associates
Agency, Inc.].
[Courtney and Gryk 2024] Courtney, Joseph Michael, and Michael Robert Gryk. Pulse, Parse, and Ponder: Using Invisible XML to Dissect a Scientific Domain Specific
Language.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Courtney01.
[DeRose 2024] DeRose, Steven J. Can LLMs help with XML?
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.DeRose01.
[Durusau 2024] Durusau, Patrick. Fixing Mamba (at scale).
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. Open Microphone
presentation.
[Galtman 2024] Galtman, Amanda. Stretching XPath: Three Testing Tales: Beyond Primary Use Cases of Certain XML Functions
and Standards.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Galtman01.
[Gollner 2024] Gollner, Joe. The Donut of Equivalence.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. Open Microphone
presentation.
[Graham 2024] Graham, Tony. Printing Should ̶B̶e̶ ̶I̶n̶v̶i̶s̶i̶b̶l̶e̶ Not Be Irritating.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. Antenna House sponsor presentation.
[Gross 2024] Gross, Mark. Ensuring XML quality and compatibility in large collections that span decades of content.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Gross01.
[Holstege 2024] Holstege, Mary. Invisible Fish: API Experimentation with Invisible XML.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Holstege01.
[Kant] Kant, Immanuel. An answer to the question: What is enlightenment?
In Practical Philosophy. Mary J. Gregor (ed). The Cambridge Edition of the Works of Immanuel Kant. Cambridge,
UK; New York: Cambridge University Press, 1999. pp. 11–22. doi:https://doi.org/10.1017/CBO9780511813306.005. ISBN 9780521654081. [English translation and commentary; the essay first appeared
in the Berlinische Monatsschrift, December, 1784.]
[Kay 2024] Kay, Michael. Why are some technologies more successful than others? And why are my predictions
usually wrong?
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Kay01.
[Lumley 2024] Lumley, John. Variations on an Invisible Theme: Using iXML to produce XML to produce iXML to produce
….
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Lumley01.
[Miłowski 2024] Miłowski, Alex. Graph Query Language - the new kid on the block!
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Milowski01.
[Nordström 2024] Nordström, Ari. Adventures in Mainframes, Text-based Messaging, and iXML.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Nordstrom01.
[Ogbuji 2024] Ogbuji, Uche. Give a Parrot a Toolbox and You Might Just Get an Ape.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. Open Microphone
presentation.
[Paoli et al. 2024] Paoli, Jean, Zubin Rustom Wadia, and Gregory Rendard. KG-RAG, A Document Foundation Model Generating the Core XML Data Model and enabling
higher-quality RAG.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. Docugami sponsor presentation.
[Paxton and Retter 2024] Paxton, Alan, and Adam Retter. Using a testbed to assess XML Database Performance: Integrating a NoSQL testbed into
the XML testing universe.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Paxton01.
[Pemberton 2024] Pemberton, Steven. Roundtripping Invisible XML.
Presented at XML Prague 2024, June 6-8, 2024. In Proceedings of XML Prague 2024. https://archive.xmlprague.cz/2024/files/xmlprague-2024-proceedings.pdf#page=163
[Prescod and Tornroth 2024] Prescod, Paul, and Phill Tornroth. Clean SOAP: Evaluating AI-based Structured Document Generation in a Medical Context.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Prescod01.
[Quin 2024] Quin, Liam. DTD (document type definition) declarations exposed in XSLT: Parsing DTD files in
XSLT to expose the definitions they contain.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Quin01.
[Tovey-Walsh 2024] Tovey-Walsh, Bethan. When women do algorithms: a semi-generative approach to overlay crochet with iXML
and XSLT.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Tovey-Walsh01.
[Tovey-Walsh and Lockett 2024] Tovey-Walsh, Norm, and Debbie Lockett. SaxonJS 3.0: Major new functionality!
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. Saxonica sponsor presentation.
[Usdin 2024] Usdin, B. Tommie. Break up the Bundle; Sell the Components.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Usdin01.
[Wei and Renear 2024] Wei, Jingzhu, and Allen H. Renear. Deviant Causal Chains: A Problem for the Conceptual Modeling of Influence.
Presented at Balisage: The Markup Conference 2024, Washington, DC, July 29 - August
2, 2024. In Proceedings of Balisage: The Markup Conference 2024. Balisage Series on Markup Technologies, vol. 29 (2024). doi:https://doi.org/10.4242/BalisageVol29.Wei01.
[Wikipedia, “Reification (Marxism)”] Wikipedia contributors. Reification (Marxism).
Wikipedia, The Free Encyclopedia. https://en.wikipedia.org/wiki/Reification_(Marxism) (accessed August 29, 2024). (Text is available under the Creative Commons Attribution-ShareAlike License 4.0.)