How to cite this paper
Usdin, B. Tommie. “The (unspoken) XML gotcha.” Presented at Balisage: The Markup Conference 2021, Washington, DC, August 2 - 6, 2021. In Proceedings of Balisage: The Markup Conference 2021. Balisage Series on Markup Technologies, vol. 26 (2021). https://doi.org/10.4242/BalisageVol26.Usdin01.
Balisage: The Markup Conference 2021
August 2 - 6, 2021
Balisage Paper: The (unspoken) XML gotcha
B. Tommie Usdin
B. Tommie Usdin is President of Mulberry Technologies, Inc., a consultancy specializing
in XML for textual documents. Ms. Usdin has been working with SGML since 1985 and
has been a supporter of XML since 1996. She chairs the Balisage conference. Ms. Usdin has developed DTDs, Schemas, and XML/SGML application frameworks
for applications in government and industry. Projects include reference materials
in medicine, science, engineering, and law; semiconductor documentation; historical
and archival materials. Distribution formats have included print books, magazines,
and journals, and both web- and media-based electronic publications. She is co-chair
of the NISO Z39-96, JATS: Journal Article Tag Suite Working Group and a member of
the BITS Working Group and the NISO STS Standing Committee. You can read more about
her at http://www.mulberrytech.com/people/usdin/index.html.
Copyright ©2021, Mulberry Technologies, Inc. Used with permission.
Abstract
XML is a platform-neutral way to exchange, share, and manipulate information. But
what persuades many to use XML is the claim that XML provides a long-term way to store
information, independent of tools (both hardware and software) with their short life
spans. Projects spend significant resources on XML setup and then settle into doing
the real work, using that XML infrastructure to compile, write, analyze, or whatever
it is they do. Until, one day — something doesn’t work. Hardware is retired; software
is upgraded; specifications go into new releases. Users get stuck. And when they complain,
we respond, “Of course that doesn’t work any more; you have been accumulating technical debt for
years! It is time to reinvest.”
They thought they had committed to a one-time cost, and now we tell them that it
is an ongoing expense. If the user had put documents into their favorite spreadsheet,
they complain, they could still import them into the current version. How do we answer
that complaint? We (the XMLers) think we described the values of XML plainly and fairly.
We (the XML users) think that the claim that XML documents last a long time is relying
on a specious technicality, and we have been trapped dishonestly. I live on both sides
of this: as a user I want to invest in infrastructure once and have it last; as a
developer I want to be able to improve my product without the limitations imposed
by backwards compatibility. We as a community often complain that not enough people
are using XML. If we really want XML use to grow, we need to address the gotcha that
too many XML users are feeling.
Table of Contents
- Appendix A. Excerpts of Introduction to XML for XYZ (February, 2001)
Like many of us in the markup community — possibly most of the people here — I want
the users of declarative markup, which these days really means users of XML, to feel
comfortable. I want them to think that their investments in XML are appropriate and
that their XML applications at least meet their needs, if not exceed them. I love
it when we hear success stories. Conference papers of the genre “How We Did It Good At My Place”
are amongst my favorites, although I know that there are at least some of you here
who say, “Another case study? Uh, time to check my email.”
Well, not me — I love them.
And I cringe when I hear about XML applications that fail. There are a lot of reasons
for XML applications to fail; the most common one, in my opinion, is inappropriate
expectations. The users don’t ever put it that way. What the users say is: “The technology let them down; it doesn’t work.”
But what that usually means is they expected magic in some form, and they didn’t
get it. And they blame XML. And in a way they’re right in that. Well, not XML per
se, but the XML community. We, the people who know descriptive markup, usually are
the ones who helped them select and/or create their XML application: Either in person
or through our publications, we helped them design or select a vocabulary, we helped
them build their cool applications, and we helped them convert their existing information
into XML. We got them started doing business in a new way, which is very, very cool.
We — we as a community and I as an individual — promote XML. I have done this many,
many times. In fact, let me share some slides from a presentation I gave in 2001.
The appendix to this paper has excerpts from a presentation I gave in February, 2001.
I’m showing it to you, not because I think it is particularly interesting or particularly
unusual, but quite to the contrary because I think it is absolutely typical of not
only presentations I gave hundreds of times but also of presentations given by many,
many other people.
This particular presentation came about because the techie people at the organization
that I am calling XYZ said, “We want to move our publication process to XML, and we need somebody to come in and
tell our manager and money people what this stuff is that we want to spend money on
and why it is a good idea for us.”
So, this presentation is essentially “What is XML and Why Should You Care”
for managers. It starts with what XML is trying to achieve: one set of data for
many publishing formats; communication of information; reuse of information; platform
independence; vendor independence; one data format, many presentation formats; get
away from the typesetting file trap; make a whole bunch of things from the same source.
It goes on to: What is XML?
It’s a data format; it’s generic markup. XML looks at things as documents; it’s
divided into elements and attributes. It’s a data format; you can make stuff that
looks like this, and stuff that looks like that, and oh, stuff that looks like this.
You can reuse your XML for print and voice synthesis and braille, and you can make
electronic things, including HTML, out of your XML.
This is a really familiar song. You’ve all sung it, right? It can and should be
generic markup, we say. We publish from XML; XML separates content from format and
behavior. It uses an output specification to get there — we call these stylesheets.
I’m not telling anybody at Balisage anything you haven’t already heard a whole lot of times. You can use and reuse your
XML; perhaps that’s the most important thing. You can reuse and re-purpose your content;
you can make subsets and spin-offs. You can do all kinds of cool things with your
XML. It is long-term, software-independent, archivable; your XML will last forever.
You can use it for workflow. You can use it for large datasets, especially from disparate
sources. You can maintain consistency. It’s easy to learn and use over the long-term.
XML is wonderful.
We tell them that it’s going to change the way you work and you’re going to have to
learn some stuff and you’re going to need a little expertise. You need training.
You need schemas. You need to make your XML part of your production process. You
probably have to convert your backfiles. The bad news, at least as I ended my typical
presentations, is there is no free lunch. Just because it’s XML doesn’t mean it’s
good. You’re going to have to do more work. The good news is that you can do XML.
There are long-term benefits; it will work for you.
There was more, of course. This is a four- or five-minute summary of a 90-minute
presentation. As I said, nothing there that we haven’t all heard and probably said
a dozen times at least. But I don’t think what I think I was saying is what my audience
heard.
I had a similar experience dealing with a technology that I know very little about.
I have in a suitcase under a bed a very expensive piece of junk. It’s a custom-made
wetsuit that I had made about 20 years ago when I was learning how to dive. Why did
I have a custom wetsuit made? Because there is a reason they don’t show Poppin’ Fresh
getting into that can.
Putting on a wetsuit that doesn’t fit, especially for someone with my general physique,
is not pretty and not comfortable. So, I was measured and re-measured, and a skilled
seamstress made me a wetsuit that I could put on and take off easily and that made
me a lot more comfortable for long periods of time in the water. And they convinced
me that I would go diving more often because it was comfortable. I was told that
it would last forever because they had designed it to be adjustable. Should I change
shape, they would be able to easily modify the wetsuit; they would be able to add
to it or take away from it if my girth changed. This was going to be just wonderful.
What didn’t they tell me? Well, first of all, apparently you have to apply stuff
on a regular basis to a wetsuit to keep it pliable, and even if you do, neoprene has
a limited lifespan. After somewhere between four and ten years, it becomes brittle
and cracks. So, now there is an expensive piece of junk in a suitcase under a bed.
Did I know to ask how to maintain it? No. And even if I did, did I know to ask,
how long, even if I maintained it, the thing would last? No. Does everybody who
works in the wet environment know all of these things? Yes. Was I being foolish
for not knowing to ask? Yes. But I was new to this.
When we who know about markup tell a story about a platform-neutral way to exchange,
share, and manipulate information, users don’t hear “a platform-neutral way to exchange
the information content of your documents, but not the applications you built around
them.” They hear “You can move your stuff around.”
Yeah.
We tell them XML provides a long-term way to store information independent of tools,
both hardware and software with their short lifespans. And this is true. But it
doesn’t occur to them that they’re spending a lot of time, energy, and money on tools
we just told them have very short lifespans. They spend a lot of money and resources
getting set up, then settle in to doing real work with XML with the expectation that
they can now focus on their subject matter.
You know how it works: They bring in a team of outside experts, and we get them started.
We do a little training, we write some documentation, we help them buy tools, and
we help customize the tools. Once we get everything working, we do a little training.
Life is good, and we go away to do the same thing for somebody else.
They keep working with their documents. And it’s working, and it’s working, and it’s
working. And then one day, it’s not. They didn’t expect that. They don’t know what
happened. And they are very, very unhappy about it.
Actually, I know exactly how they feel. I write a lot of slides — the ones in the
appendix, for example — using the Mulberry slideshow XML tool chain. We write slides
in fairly complex XML because from this XML source we can make slide decks, we can
make our handouts, and we can make exercise books for classes. There is a bunch of
stuff in the XML from which we make our slides.
A few months ago, I went to make some slides, and it didn’t work. Not only didn’t
it work, but the error message I got pointed to the last line of my input file. (You
know what it means when an error message points to the last line of your input file?
It means some tool is saying, “I don’t know!”
The application got lost.) There is nothing helpful about an error message that
points to the last line of your input file because it’s a really good bet that is
not where the problem is.
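One common way a tool ends up pointing at the last line can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide markup is invented, the helper name `first_parse_error_line` is hypothetical, and the parser is Python's standard `xml.etree.ElementTree`, not the Mulberry tool chain. It may or may not resemble what went wrong in my case. The mistake is on line 2, but the parser only notices when it reaches the closing tag on the final line.

```python
import xml.etree.ElementTree as ET

# Hypothetical slide source: the <slide> opened on line 2 is never closed.
SOURCE = "\n".join([
    "<slides>",                                    # line 1
    "  <slide>",                                   # line 2: the real mistake
    "    <title>Why XML?</title>",                 # line 3
    "    <para>One source, many outputs.</para>",  # line 4
    "</slides>",                                   # line 5: where the parser gives up
])


def first_parse_error_line(text):
    """Return the line number the parser reports, or None if the text parses."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, _column = err.position  # (line, column) of the reported error
        return line
    return None


reported = first_parse_error_line(SOURCE)
last_line = SOURCE.count("\n") + 1
print(f"parser reports line {reported}; the file has {last_line} lines")
```

The parser is being truthful: the final line is the first place where the input became impossible. That is exactly why the message is of so little use to the person who has to find line 2.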
What had happened? It’s a long story that will be familiar to you all. My favorite
photo editor put out a new version with some features I wanted. But the photo editor
required a newer version of the operating system than I was running. My ten-year-old
machine wouldn’t run the new version of the operating system. So, I got a new machine
which ran the new operating system which could run the new photo editor, but which
meant I had to get new copies of everything else that I was using. So, I needed a
new copy of my favorite XML editor, a new copy of my formatter, and new copies of
a lot of tools.
Actually, the XML editor itself worked. I could write the new slides; I just couldn’t
convert my XML into anything else because it turned out that the version of XProc
that I was expecting to use wasn’t available in the framework I was using. And the
version of XSLT in the framework I was using didn’t support the proprietary extensions
that had been in the ten-year-old version because there were functional equivalents
in newer versions of XSLT that weren’t available then; we didn’t need the proprietary
extensions because we had better ways to do it.
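A small inventory of this kind of dependency is the sort of thing that could be run before an upgrade. The sketch below is a minimal one, assuming Python's standard library; the function name `extension_namespaces`, the sample stylesheet, and the short table of namespace URIs are all illustrative choices, not a record of what my own stylesheets used. It reports which known extension namespaces a stylesheet declares. It does not check whether they are actually used, and the table is far from exhaustive.

```python
import io
import xml.etree.ElementTree as ET

# Namespace URIs of a few well-known processor or community extensions.
# Illustrative only; a real inventory would need a longer, maintained list.
KNOWN_EXTENSION_NAMESPACES = {
    "http://saxon.sf.net/": "Saxon",
    "http://icl.com/saxon": "Saxon 6.x",
    "http://xml.apache.org/xalan": "Xalan",
    "urn:schemas-microsoft-com:xslt": "MSXML",
    "http://exslt.org/common": "EXSLT common",
}

# Invented stylesheet that declares one processor-specific namespace.
SAMPLE_STYLESHEET = """
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/">
  <xsl:template match="/">
    <xsl:value-of select="saxon:evaluate('count(//slide)')"/>
  </xsl:template>
</xsl:stylesheet>
""".strip()


def extension_namespaces(stylesheet_text):
    """List (prefix, uri, label) for each known extension namespace declared."""
    source = io.BytesIO(stylesheet_text.encode("utf-8"))
    hits = []
    # "start-ns" events yield a (prefix, uri) pair for every namespace declaration.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        label = KNOWN_EXTENSION_NAMESPACES.get(uri)
        if label is not None:
            hits.append((prefix, uri, label))
    return hits


for prefix, uri, label in extension_namespaces(SAMPLE_STYLESHEET):
    print(f"declares {label} extensions as '{prefix}:' ({uri})")
```

A report like this does not fix anything, but it turns "it stopped working" into a list of specific things to look for in the new environment.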
Sigh. I know. This wasn’t XML breaking; this was normal technological change. It
also brought me to an absolute full stop, and I was so frustrated trying to chase
it down that I actually considered writing slides in PowerPoint. I despise PowerPoint.
I know as much about this stuff as any of our users — maybe more — and it was making
me crazy trying to figure out how to deal with it. A lot of our users when they hit
that wall the first time think we broke our promises. We said their XML stuff would
work for the long-term, and it doesn’t. If they had done this in PowerPoint and they
had had to buy a new version of Office, there would have been a smooth upgrade path.
There would have been a button they could push that said “Make your old stuff work in the new one,”
and it would have. That doesn’t happen in XML, and it drives them crazy.
So why am I talking about this? I’m talking about this because we, as an XML community,
often complain that not enough people are using XML, and we’re indignant about it.
They should be. XML is wonderful. Why aren’t they? They must be stupid.
Well, they are probably not stupid, and they’re not using XML for reasons. The people
we want are people who are running successful projects and successful businesses,
and who do know the stuff they know. The stuff they know just isn’t necessarily the
stuff we know.
If we want the use of XML to grow — if we want XML users to be successful — we need
to address the gotcha that they’re feeling. They’re feeling that we’re letting them
down because we are. We need to make it clear that platform-neutral means they can re-invest in application
development at any time they want, but they shouldn’t think the investments that they
made in creating their environments will last forever. They think we’re telling them
that XML applications are self-maintaining. They are not. And we need to be clear
about that.
It would also be nice if we made it a little easier for them to detect what it is
that is broken when something does go wrong. I hate error messages that point to
the last line of an input file. That is the tool saying, “Nah, nah, I’m not going to help”
or perhaps “I can’t help.”
But let’s see if we can do a little bit better.
Why am I talking about this at Balisage? Why am I talking about this at the beginning of Balisage? Because I want to remind us all that what we’re talking about is important. It
is important not just to us, the people who understand markup; it is or could be,
perhaps should be, important to the world in general. We are capable of making it
so.
I want to start a discussion of what we can do to nurture the understanding of the
concepts that we at Balisage find clear, important, and in many cases, obvious; but that the rest of the world
does not. They’re not ignoring us because they understand it and dismiss us; they’re
ignoring us because they have no idea what it is we’re talking about. We fill our
language with jargon, and we, all too often, skip over stuff that we think is obvious
that they don’t know.
We, as a community, want XML to thrive and to grow. It is not, at least not in the
ways we want. Part of that is because nothing solves all problems, nothing appeals
to everyone, and truthfully, there are situations in which there are more appropriate
approaches. It would be good if we admitted that. And there are users who will always
go with the newest, shiny thing; and we aren’t the newest, shiny thing anymore. Declarative
markup, and XML in particular, is something that just works.
But we’re also letting our community down by not communicating as well as we could.
We can get better at helping people set reasonable expectations and navigate the process
of changing our tools. Also, I encourage tool suppliers in particular, but also consultants,
to try harder to make this comprehensible to user people, to subject matter experts.
The next time one of us is inspired to jump on a horse and charge into some hapless
project and move them from some inappropriate technology (“Are they really storing all of their texts in Excel?”)
to a much more appropriate technology (“This belongs in XML; this is long-term important.”),
I hope you’ll stop and think: Are you moving them from something they know how to use
and know how to maintain to something completely foreign to them? Are you going to
get them there and leave them in the lurch when you leave? Don’t just get them started;
make a long-term plan for sustainability not just of their XML documents but of the
XML ecosystem that you are helping them set up. If they want declarative markup for
long-term stability of their content, set them up for long-term success. You are
a false hero if you set them up with a short-term, shiny toy that they can’t maintain.
I want us to stop talking about XML as a document format and start talking and thinking
about it as part of an environment. Fortunately, we have talks at Balisage that help us think about why this is important and how we might start making those
changes.
Appendix A. Excerpts of Introduction to XML for XYZ
(February, 2001)