Here at Balisage we talk about markup, most often declarative markup. We talk about XML and
the use of what we call the XML stack.
We talk about how powerful explicit markup is, and we whine
about people who should
be using it but aren’t. We lament the lack of respect we get and moan about
not understanding why XML hasn’t taken over the world of data interchange.
In my opinion, the XML market has been moderately successful selling a package that includes: some philosophies (declarative and generic markup), a syntax, some programming languages (XSLT, iXML), some associated specifications, and some tools. We sell the proposition that there are significant advantages to creating, managing, and deploying content our way, and if users cannot create content in generic markup (XML) they should up-convert their content as soon as possible. This way they will be able to do multiple things with it, use many tools, be vendor independent, and we hint that they may find that their documents can suddenly play the piano and tap dance. We tell them that they should use explicit, generic markup. AND they must, we tell them, use pointy brackets. AND they must leave Perl and Python in the dust and commit to XSLT (or iXML and XSLT). AND all of their code must be declarative and side-effect free. We tell them that their documents are trash, the programs they have worked for years to master are useless, and that they are once again beginners. We push this as an all or nothing proposition.
Not only is this approach arrogant and off-putting, it is wrong. There is no essential link between generic markup and pointy brackets. Similarly, the power of declarative markup is distinct from the power of descriptive markup, both of which are significant. XSLT and much of the rest of the XML tool stack CAN be used in other environments. I think it is time we stopped insulting our would-be users, customers, colleagues, and their (often highly successful) documents and environments. I think it is time we unbundled this package and helped people use the parts that work for them in their contexts.
Bundling is a tried and true marketing technique, developed for the convenience and
profit of
the seller. It is, often, an irritation to the customer/buyer/user. I’m talking
about bundling and selling things in bundles rather than individual components. And
I started putting this talk together, and I said, “Okay, I think I know what bundling is.
What is bundling?” And I went to that font of all knowledge about popular culture and common wisdom, that being Wikipedia. And Wikipedia starts its article about product bundling with:
In marketing, product bundling is offering several products or services for sale as one combined product or service package. It is a common feature in many imperfectly competitive product and service markets.[1] Industries engaged in the practice include telecommunication services, financial services, health care, information, and consumer electronics. A software bundle might include a word processor, spreadsheet, and presentation program into a single office suite. The cable television industry often bundles many TV and movie channels into a single tier or package. The fast food industry combines separate food items in a
“meal deal” or “value meal.”
I recently encountered an irritating example of bundling. A friend mentioned that
their
child’s school was using cardboard microscopes which were wonderful because they allowed
each student to have their own microscope and have enough time with it. And students
who might be considered too young or too irresponsible to use expensive equipment
were given these cardboard microscopes because, if kids did what they do, well, you
know, they were cardboard. And I said, “You know, I might enjoy playing with that. I don’t have a microscope (and I love
toys), so what is this thing called?”
“They’re Foldscopes.”
So I looked up Foldscopes. I learned they have a paper frame; they’re available
in two versions; and they have 50, 140, and 340x lenses. Sounds good. How much will
the basic frame and a couple of lenses cost me?
That’s not easy. There are kits. The basic Classroom Kit costs $50 for 20 of them with the 140x lens. Oh, I could start with one of those and get other lenses if I like this thing. But the basic Classroom Kit has 20 of them, and there’s a bunch of other stuff in the kit that I really don’t want.
That’s irritating. There’s just one of me. I’m not even sure if I want this thing. I want to poke at it; I want to see if I like it. I can buy the bigger Classroom Kit with 50 of them, or I can buy one with, I think, 500 of them in it. I can buy kits that only have one lens, and if I want more lenses, I can buy 20 each of the additional lenses. I don’t want 20 of them — I want one.
Apparently, my only purchase option is the Explorer Kit, and there’s a lot of stuff in there I don’t want. There are slides and pre-made slides and little cards you can only read with a microscope (teeny tiny print). Well, actually that’s kind of cool, and I might want to look at that. And a fancy box and tweezers. I have tweezers. I don’t need tweezers from them. We’ll come back to this, but it was irritating. I couldn’t just buy the bits I wanted.
We’ve been selling XML in a bundle too. We’ve been selling XML — and previously SGML — as a bundle for so long and so earnestly that when I learned it, I believed it and preached it that way myself for years. I learned SGML as a bundle, and XML did nothing to change that story. The bundle included explicit inline markup, generic and descriptive markup, explicit grammar rules, validation, and an extensive and ever-improving toolset. And that was one package.
It’s more detailed than that. The package included explicit markup in one particular syntax. Well, XML did. SGML didn’t — SGML had a thing called the SGML Declaration which was really complicated and really cool. The SGML Declaration let you change a whole lot of things in the syntax, for example, what the markup delimiters were. You didn’t have to use pointy brackets; you could use curly brackets, or smiley and frowny faces, or anything you darn pleased.
I remember some people who did some enormously clever things with one document that had one set of markup and one set of meanings with one SGML Declaration — it was a technical document — and when you swapped out the SGML Declaration, what was markup changed, and it became a document about how to write that kind of technical document. It was a challenge to write; it was probably a challenge to maintain, but it really worked. Actually, these days you could probably do something similar with iXML. I haven’t quite worked all the details out, but that would probably work. That was explicit markup. It was in the document; it was visible.
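To make the delimiter point concrete, here is a minimal sketch in Python. It is not SGML, and it is nothing like a full SGML Declaration; the function name `tokenize` and the tag-name pattern are made up for this illustration. All it shows is that the same generic markup can be recognized whether the delimiters are pointy brackets or curly ones.

```python
import re

def tokenize(text, open_delim="<", close_delim=">"):
    """Split text into ('tag', name) and ('text', chars) tokens.

    The delimiters are parameters, loosely in the spirit of (but far
    simpler than) changing the delimiters in an SGML Declaration.
    """
    pattern = re.compile(
        re.escape(open_delim) + r"(/?[A-Za-z][\w.-]*)" + re.escape(close_delim)
    )
    tokens, pos = [], 0
    for match in pattern.finditer(text):
        if match.start() > pos:
            # Characters between two tags are ordinary text.
            tokens.append(("text", text[pos:match.start()]))
        tokens.append(("tag", match.group(1)))
        pos = match.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens

pointy = tokenize("<title>Hamlet</title>")
curly = tokenize("{title}Hamlet{/title}", open_delim="{", close_delim="}")
print(pointy == curly)
```

Both calls yield the same token list, so anything downstream that works on tokens never has to know which brackets the author typed.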
That was one of the packages of the rules. Another package of the rules was generic
descriptive markup. You had to have generic markup; it was better. It is
better! We have been told, I have been told, and I have said
many, many times that it is better to identify a part of text as a <title>
than to identify it as <italic>. So, today I ask:
better in what way? Is it better generic markup? Yeah, it’s more generic … if
you’re correct and if you know that, <title> is probably better generic
markup. What if you’re in an environment in which there will be many titles, and
some of
them will be displayed in italic, and some of them will be displayed in another way
for some
good reason or for no good reason at all except that that’s the way
the person who wrote it 700 years ago (or 200 years ago, or last week) wrote it,
and you can’t interview them and ask them why? You don’t know why! And the person
who wrote it is long gone or unavailable.
And is it better — or even more specifically, better generic markup — to identify
something as <emphasis type="italic"> than to tag it as <italic>?
I’ve certainly been told that many times. I’m not sure I buy it. What if this is the
only emphasis type you have in this environment? What about if you have people creating
content who are impatient with the extra level of indirection and your tool doesn’t
completely hide the tagging from them? If you can accurately and reversibly convert
one form to the other, why aren’t we using the one that people find most convenient
at any particular time? Why are we being prissy about insisting on a bulkier format
for the same concept?
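Here is a minimal sketch in Python of what “accurately and reversibly convert” could look like, using the standard library’s `xml.etree.ElementTree`. The function names `to_generic` and `to_short` are hypothetical, and the round trip is only lossless under one assumption: a given document uses one of the two conventions at a time, not a mixture of both.

```python
import xml.etree.ElementTree as ET

def to_generic(root):
    """Rewrite every <italic> as <emphasis type="italic">, in place."""
    for el in list(root.iter("italic")):
        el.tag = "emphasis"
        el.set("type", "italic")
    return root

def to_short(root):
    """Rewrite every <emphasis type="italic"> back to <italic>, in place."""
    for el in list(root.iter("emphasis")):
        if el.get("type") == "italic":
            el.tag = "italic"
            del el.attrib["type"]
    return root

source = '<p>See <italic>Hamlet</italic> and <emphasis type="bold">this</emphasis>.</p>'
generic = ET.tostring(to_generic(ET.fromstring(source)), encoding="unicode")
restored = ET.tostring(to_short(ET.fromstring(generic)), encoding="unicode")
print(generic)
print(restored == source)
```

Other emphasis types pass through untouched in both directions, so authors could type the short form while the archive stores the long one.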
Ooh, I’m calling people names. I’m saying we’re being prissy. You know what? I
believe it. And in any case, that makes it better generic markup, not better XML. The XML specification (and all the associated specifications of which there are
many, many, many, many) — remember when we were so proud of the XML specification that was a teeny little
pamphlet? Yeah, well, it’s not a teeny little pamphlet any more; it’s a bookshelf
— says nothing about generic markup. Some of the examples — maybe most of the examples
— are generic-ish.
Note I said -ish.
But that’s not XML. What links generic markup to XML and SGML is the history of SGML. It’s the group of people who invented it; their goal initially, as some of you may recall, in the GenCode Committee was to make a list of all the tags anybody was ever going to need. And when they came to the conclusion that they couldn’t do that because that was making a list of everything anybody would ever be interested in, they instead came up with a way to identify the tags you’re going to use this time. But the specification is about how to handle the tags, not about what to tag.
I’m not saying I don’t like generic markup. I like generic markup a lot. When I
used a word processor back in the bad, old days when you really didn’t want to look
at
those files — you really, really didn’t — I generally invested a significant
amount of time creating a set of styles for my document or document collection, and
those styles
were generally things like “recommendation,” “applicability,” “context,” “pros,” and “cons,”
although there were also styles that said things like “footer.”
To my mind, because I have generic tagging
bias trained into me after years, those tags were more tractable and easier to use
and to use
consistently than using the bold and bold-underline tags that came built-in with the
word processor. So I was doing generic markup in WordPerfect — I might have been
doing it in WordStar. I’m not sure. I’m old.
But-but-but-but-but you can’t validate that, and validation is important.
Yeah, well, first of all, actually you can; there are a whole lot of tools — not necessarily XML tools, or not necessarily tools
that start in XML but that turn things into XML — that make it possible to validate
that stuff. These days that might mean iXML — which we’re going to hear a lot about
this week — but there have been tools that read various word processors and checked
them against a rule set for a really long time. They’re not our tools. That doesn’t
mean they don’t exist.
Validation is a great and glorious thing. It can significantly reduce the amount of time needed to tidy a document collection in order to use it, significantly reduce the amount of error handling code needed in any application that receives these documents, and prevent content creators from making errors they don’t want to make. Overzealous validation can also make content creators angry, hateful, and itchy, and possibly sabotage your systems. I’ve seen more than one really cool new authoring system brought to its knees because it was forcing authors to adhere to rules that either didn’t apply or didn’t apply at that time. The tools, the validation that was provided for the authors — or they would say imposed on them — was harmful.
We (the XML fanciers) did not invent data validation, we don’t have the only validators, and our validators are not restricted to validating generic markup. But (we say, and we probably believe) OUR validation is better because it is based on explicit grammar rules! We have DTDs, and XSDs, and RNGs. And most of the tools that read these grammars and check documents against them agree on what the rules mean and on what documents do and don’t follow the rules. Yes! This is a good thing. Encouraging people to make explicit validation criteria and to check their documents against them is helpful. My point is that in order to take advantage of the benefits of validation we are not restricted to generic markup or pointy-bracket syntax. What? Pay attention to iXML! My point is that in order to take advantage of the benefits of validation we are not restricted to generic markup or to pointy-bracket syntax. You don’t have to have the whole package.
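As an illustration only, here is a minimal sketch in Python of explicit validation rules applied to something that is not XML: a list of (style name, text) pairs such as some tool might export from a word processor. The style names are the ones I mentioned using; the required order in `CONTENT_MODEL`, the names `KNOWN_STYLES` and `validate`, and the input shape are all assumptions invented for this example.

```python
import re

# Hypothetical content model over paragraph style names, not XML elements:
# context, optional applicability, recommendation, then pros, then cons.
CONTENT_MODEL = re.compile(
    r"context (applicability )?recommendation (pros )+(cons )+"
)
KNOWN_STYLES = {"context", "applicability", "recommendation", "pros", "cons"}

def validate(paragraphs):
    """Return a list of problems; an empty list means the document is valid.

    `paragraphs` is a list of (style_name, text) pairs.
    """
    problems = [
        f"paragraph {i}: unknown style {style!r}"
        for i, (style, _) in enumerate(paragraphs, start=1)
        if style not in KNOWN_STYLES
    ]
    # Only check the order once every style name is one we recognize.
    sequence = "".join(style + " " for style, _ in paragraphs)
    if not problems and not CONTENT_MODEL.fullmatch(sequence):
        problems.append("styles do not appear in the order the rules require")
    return problems

document = [
    ("context", "Authors submit papers in many formats."),
    ("recommendation", "Accept the formats authors already use."),
    ("pros", "Fewer angry authors."),
    ("cons", "More conversion work for us."),
]
print(validate(document))
```

The rules are explicit and machine-checkable, and there is not a pointy bracket in sight.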
What about the XML stack? Wow, it’s amazing. We have this large and ever-growing,
constantly-improving set of tools for the creation, management, manipulation, and
display of XML documents. It’s amazing. It’s impressive. It’s powerful. I don’t
want to imply that it isn’t. It is, and I like it. But I’m not aware of any users
who use everything in the XML stack. I am aware of users who do not have and do not want XML, but who use parts of the XML stack. And I’m aware of more users who probably
would use some of our cool tools if we didn’t put so many conditions on their use.
You should,
we say to our users, change the way you think about your documents; you should change the way you encode your documents. You should change all of the tools you use. You must realize that, although you think you are really successful in looking for ways to
improve on-going, thriving publishing operations, you’re doing everything wrong.
Insulting your would-be customers is an odd sort of a way to make friends — or sales.
Some of you know who I’m talking about. Once upon a time, there was a tool vendor who had what was, at the time, clearly the best tool in its class. It was years ahead of the competition. It was beautiful, and with it, users could produce publications that were more beautiful, more powerful, more impressive, and more graceful than using any other tools of the time. So why don’t you all know what tool I’m talking about? The main voice for this tool — the head sales person — made it a practice to tell would-be customers that their current products were … terrible. Well, actually he was a lot more graphic than that, and he generally did it in public, like at trade shows. He told people in public that the products they worked hard to create and that were large revenue-generating services for their organizations and that provided the money that they could use to buy this new tool were worthless and would remain so unless they used his product. These were successful businesses with successful products. They generally did not take kindly to being insulted in public, and although that tool would have been useful to them, they decided not to buy it, and I can’t blame them.
The people who would no doubt benefit from adopting some of our approaches, techniques, and tools are not enthusiastic about being told that they know nothing. They are not really receptive to strangers walking in and telling them they’re doing everything wrong. And they shouldn’t be. They are not doing everything wrong.
Let’s get less historical and a little closer to home. We in this virtual conference
room are an astonishingly arrogant group. I see it putting together the proceedings
for this little conference. These proceedings, a very small annual publication, should (should
has become my current least favorite word) — these proceedings should be easy to assemble: we ask for papers in XML, use a bunch of XSLT to make HTML,
and put it on the web.
Yeah, it’s a little more complicated than that. We make anonymized versions of submissions for peer review. We add some metadata, such as index terms, and we make one version of the HTML for regular display and another one that is simpler for screen readers, small devices, and Google Scholar™. We extract files from the XML to send to Crossref to register the DOIs. We make another version to send to Portico for their dark archive. Hey, that’s what we do — our internal processes.
Let’s talk about our authors. We have very smart XML people writing Balisage papers. We regularly hear that they glanced at our tag set and saw that it resembled
DocBook, so they used the version of DocBook that shipped with their favorite editor
and assumed it would be okay. A few of them have been adding structures to DocBook
to enable them to encode something the way they prefer, and they sent the file in
… without mentioning the creative tagging. One was very annoyed that this improvement
was not gratefully accepted. Several write in their vocabulary of habit and convert
it to ours, but they don’t, for example, validate. They convert the 80% that is easy
to convert with global changes (using XSLT, of course), and leave the rest of it to
us to fix. Every year, I have several conversations that can best be summarized as:
“We can’t handle the XXX in your paper; will you please replace it with something we
have described in the documentation on the conference website?”
“I don’t need to replace it; it works like this (long explanation).”
It works like this? Friend, it may work like that in your system, and it may be that you think it should work like that in mine. But if I tell you it doesn’t work in my existing system,
your choices are: use a structure we already support, offer to update and test the
entire conference proceedings infrastructure that has been growing and accumulating
cruft for over 20 years, or we leave your paper out of the proceedings. We are not
going to add a feature every time an author tells us we should have it.
It could be worse. I once worked with a publisher of technical computery journals who published instructions for authors that specified, among other things, which word processor formats were acceptable for electronic submission of manuscripts. They frequently (as in many more than once) received “better” word processors, written by the author, along with the manuscript in the internal format of that new word processor. These authors genuinely expected the publisher to install this word processor and to publish the manuscript using it. Guess again!
I have yet to attend an XML-related event where — I started to say someone, but it’s usually several people — people didn’t whine about how slow adoption is and how underappreciated XML is and how underappreciated we are. And why don’t people just see the light and do it the way they should? And it’s not that we can’t sell the whole generic pointy-bracket XML stack package. We can. We have. XML is successful. XML use is growing. But not as much as we would like it to, and people aren’t stamping out hero badges for us. The XML market is moderately successful selling a package that includes some philosophies (declarative markup and generic markup), a syntax, some programming languages, some associated specifications, and some tools.
We tell would-be users that there are significant advantages to creating, managing, and deploying their content our way. And if they can’t do that, they should up-convert their content as soon as possible. This way, they’ll be able to do multiple things with it, use many tools, be vendor independent, and they may find their documents can suddenly play the piano and tap dance (or at least we kinda imply that). And they must leave Perl and Python in the dust and commit to XSLT (or iXML and XSLT). And all their code must be side-effect free. We tell them their documents are trash, the programs they have worked for years to master are useless, and they are once again beginners. And we push this as an all or nothing proposition.
Not only is this approach arrogant and off-putting, it is just plain wrong. And we’re beginning to see this. There is no essential link between generic markup and pointy brackets. The power of declarative markup is distinct from descriptive markup. XSLT and much of the rest of the XML tool stack can be used in other environments. It’s time we stopped insulting our would-be users, customers, colleagues, and their often-highly-successful-without-us documents and environments. It’s time we unbundle this package and help people use the parts that work for them in their contexts.
I started talking about the Foldscope and the fact that I couldn’t just buy the parts I wanted. Well, I’m a sucker for toys, so here is the fancy box (holding up a box). Do I need a fancy metal box for a little paper microscope that I wanted to try? No, I don’t need a fancy metal box, but I have one. And in the fancy metal box, I have the paper microscope, which is a pretty cool idea. I have only spent about 15 minutes with it, and I couldn’t actually get it to work. I am confident that I will get it to work (if my friend’s nine-year-old can do it, I can do it … I think). I have a user guide which will help; I have a welcome letter which won’t help. I have some pre-made slides, including a fern rhizome, so I can make sure that I have a properly mounted thing that I can use with my microscope. I have a light I haven’t figured out how to use and a couple more lenses. And a crummy pair of plastic tweezers and, of course, the embroidered patch. Boy, do I need that.
Let’s think this week at Balisage, as we are talking about the things we are doing with markup, the ways we are doing it, and the tools we are using, about how much of what we are talking about is the fundamental thing we want and how much of it is embroidered patches. Welcome to Balisage.
References
[Wikipedia, “Product bundling”] Wikipedia contributors. Product bundling.
Wikipedia, The Free Encyclopedia. https://en.wikipedia.org/wiki/Product_bundling (accessed August 22, 2024). (Text is available under the Creative Commons Attribution-ShareAlike License 4.0.)
[1] Adams, W.; Yellen, J. (August 1976). Commodity bundling and the burden of monopoly.
Quarterly Journal of Economics. 90 (3): 475–498. doi:10.2307/1886045. JSTOR 1886045.