Murray Maloney just called my attention to a recently cast horoscope which covers the next couple of hours. It seems to be describing at least the title of my talk, Things stay the same, but is perhaps describing the conference as a whole. It says:

Attempts to change things fail. Others resist your suggestions and stand in your way.

Comforts are a bother.

I wasn’t quite sure at first what to make of that last bit, but now I think it’s telling us there’s no conference lunch today.

As the horoscope suggests, as people — as human beings — we seem to have trouble with change.

At one level, we run into difficulties when the meanings of technical terms shift. That’s part of language history, but it leads to complications, as Tommie Usdin told us in her opening on Tuesday Usdin. It can lead to resentment if people re-use old terms with new meanings; it can lead to resentment if they coin new terms when we think they could have used old ones or the terms that we coined before they coined theirs.

That low-level problem is matched by similar problems at higher levels of abstraction or system complexity. We’re all familiar with the problems of building systems to accommodate change and with the challenges that come with trying to accommodate the changes that actually happen even after you have built a system to accommodate change because you discover that you built it to accommodate this kind of change [gesture], and if people would only make that kind of change instead of that other kind of change [gesture], our system would be fine. But people don’t always change in the way that we want them to. Keeping track of things is complicated. Keeping track of how things got the way they are is an important topic; without that, we have no memories. Ashley Clark’s paper on meta-stylesheets and the provenance of XSL transformations addresses a key problem for every process undertaken by any institution that hopes to have any form of institutional memory Clark.

The difficulty of change isn’t peculiar to technical areas.

The challenge of dealing with change has been with us for a long time. We have a long history of efforts to deal with change. Centuries of stoic and neo-stoic philosophy teach us to discipline our emotions and not to allow ourselves to come to care for things that may change.

The neo-stoic approach somehow feels like an attempt to deal with the emotional side — the emotional difficulties — of change. One of the reasons we are uncomfortable with change is that it reminds us that time is passing. Not just time in general, but our time is passing; our mortality is nearing. Many of us have some difficulty, at least some of the time, contemplating our mortality with complete equanimity. But even in less emotional contexts, change and time are hard. A lot of us here — a lot of people in IT generally — began in other fields and got into IT through specific application domains, so I can’t be the only person here who has trained in another field but occasionally reads computer science textbooks. And so I doubt that I’m the only one who has been struck by the contortions that formal descriptions of Turing machines go through to explain the change of state in a Turing machine or even a finite state automaton. A completely informal introduction to Turing machines has no problem with this topic at all, but any skilled mathematician trying to provide a more careful description will bend over backwards to avoid talking about time or change of any kind. They will begin with an elaborate description, completely static, of the state of a Turing machine. They will then develop an elaborate rule for comparing two Turing machine states so as to determine where they differ and where they resemble each other. And finally they will describe the construction of sequences of Turing machine states, each adjacent pair of states related by a similarity relation based on the rule for comparison, to ensure that any two adjacent states are similar in all but exactly one way. They may spend pages and pages — or, worse, only half a page, in compressed and laconic form, with no explanation, motivation, or guidance — all in order to avoid saying The machine changes state. Why? Because our formal systems just don’t deal well with change.[1] I submit to you that at least some of the difficulties Claus Huitfeldt reported on, some of the objections philosophers have raised against viewing documents as timed abstract objects (or against timed abstract objects more generally), stem from this weakness in our formal systems: their inability to accept change Huitfeldt, Vitali and Peroni.
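
To see what I mean about the static style, here is a minimal sketch in Python (my own illustration, with an invented machine and transition table, not anything taken from a particular textbook): configurations are immutable values, a successor function constructs the next configuration, and a computation is just a sequence of configurations. Nothing ever changes; we only build new configurations alongside the old ones.

    # A minimal sketch of the "static" style of Turing-machine description.
    # The machine, alphabet, and transition table are invented for illustration.
    from typing import NamedTuple, Tuple

    class Config(NamedTuple):
        state: str              # current control state
        head: int               # position of the read/write head
        tape: Tuple[str, ...]   # tape contents, as an immutable tuple

    # (state, symbol read) -> (new state, symbol written, head movement)
    DELTA = {("q0", "1"): ("q0", "1", +1),
             ("q0", "_"): ("halt", "_", 0)}

    def successor(c: Config) -> Config:
        """Construct the configuration that follows c; c itself is untouched."""
        state, write, move = DELTA[(c.state, c.tape[c.head])]
        tape = c.tape[:c.head] + (write,) + c.tape[c.head + 1:]
        return Config(state, c.head + move, tape)

    def run(c: Config):
        """Yield the whole sequence of configurations, each a static object."""
        while c.state != "halt":
            yield c
            c = successor(c)
        yield c

    for conf in run(Config("q0", 0, ("1", "1", "_"))):
        print(conf)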

Unfortunately, the difference between change and stability is one we cannot get away with ignoring, because that difference, and the relation between the two, are in some sense at the heart of the idea of descriptive markup. Years ago, computer scientists at the University of Waterloo wrote a paper that was distributed in manuscript — distributed fairly widely but never published in this form — that offered, among other things, a critique of SGML and in particular of the ontological pretensions of SGML theorists. And in that draft, the author, Darrell Raymond, said:

Yes, descriptive markup rescues authors from the frying pan of typography only to hurl them headlong into the hellfire of ontology.

Raymond, Tompa and Wood

We might well ask how that happened. Why did ontology suddenly break out as an area of concern in the slightly unexpected province of technical publication? And I would like to suggest a possible answer to you.[2] Put yourself in the position of the tech publishing managers who were struggling to develop the concepts which we now know as descriptive markup and which became SGML — generic coding and then SGML and then XML. They are perfectly aware that when they print manuals this week or this month for this year’s release of the operating system, chapter titles will be in a particular font-face — they will be Garamond 16 point on 18 demi-bold with a 7 quad vertical jump and a horizontal rule and so forth. But they are also acutely aware that that’s not essential, that that’s not a permanent part of the document. They’re acutely aware because they are involved in the design decisions, and they know perfectly well that some people really hate Garamond, and it may change. And they are also acutely aware that changing that kind of thing manually is extremely expensive, so they want a way to encode the document that won’t require manual change every time the style changes. Because style is part of their professional identity, they know perfectly well that style changes. They know perfectly well that when the big new version of the operating system comes out eighteen months from now, they’re going to have a ground-up redesign of the entire technical library. There will be a new look; they don’t know what it’s going to be because the contract hasn’t been issued yet, but etc., etc.

So we want a way to represent a document that will work for the styles we’ve got now and for the styles we’re going to have in eighteen months with the new big release. How do we do that? Well, we need to find some aspect of the document to call out that can be related to the styling and that’s not going to change. The style is going to change, and the kind of processing we’re going to do with the document may change. (In the early 80’s not everybody had a well-developed expectation of being able to do full-text searching and build things other than printed pages from documents, but a lot of people knew it was possible even if they didn’t have software to do it themselves, and they had fond hopes.) The one thing that’s not going to change between now and eighteen months from now is the structure of the sentence: we want it to be 16 on 18 Garamond demi-bold because it’s a chapter title. Everything in that sentence is going to change except We want it to be [styled in a certain way] because it’s a chapter title. Our ideas about what we think this part of the document is are going to change more slowly, maybe not at all. But at least they’re going to change more slowly than what we do with this piece of the document. It is natural, then, that any effort to produce reusable document representations is going to push us a little bit toward ontology, toward a statement of what the thing is and not what we want to do with it.
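
A toy sketch may make the point concrete. Everything below is invented for illustration (the element names, the document, the style parameters), but the shape of the idea is this: the document records only what things are, and the styling lives in a separate, replaceable mapping, so the redesign eighteen months from now touches the mapping and not the document.

    # Illustrative only: a "document" that says what things are,
    # and two successive designs that say how to present them.
    document = [
        ("chapter-title", "Installing the Operating System"),
        ("para", "Before you begin, back up your data."),
    ]

    style_this_release = {"chapter-title": "Garamond 16/18 demi-bold, rule below",
                          "para": "Garamond 10/12"}

    # The ground-up redesign for the next release: the document is untouched.
    style_next_release = {"chapter-title": "Helvetica 18/20 bold",
                          "para": "Helvetica 9/11"}

    def render(doc, style):
        for element, text in doc:
            print(f"[{style[element]}] {text}")

    render(document, style_this_release)
    render(document, style_next_release)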

Now, it’s possible to over-emphasize the ontological pretensions of descriptive markup, and I’m sure that many of us who got enthusiastic about SGML in the 1980’s did so. But there are philosophers who tell us almost no statements are really permanent in the way I have just described. The American philosopher W.V.O. Quine spent much of his career attacking what is called the analytic/synthetic distinction. That is the common philosophical doctrine that some sentences are true by virtue of the definition of the terms Quine. For example, consider the sentences A bachelor is an unmarried man, or a spinster is an unmarried woman. The standard analysis in terms of the analytic/synthetic distinction says that these sentences say nothing at all about the real world, and so no empirical observation of the real world can possibly render them true or false. They are not about the real world; they are about the definitions of terms in our language. Sentences like these are true or false because of the meanings of their words and not because of any state of affairs in the real world. By contrast, the sentence Joe is a bachelor is an empirical or synthetic statement and is true or false depending not just on the meanings of the words Joe and is and bachelor but on whether in fact Joe is or is not married. Quine said, No, no, no, no. There’s no such bright line between analytic statements and synthetic statements; there is only a spectrum between sentences whose truth values we are ready to revise at a moment’s notice, and sentences whose truth values we are not so ready to revise. Our beliefs about the world, Quine argued, do not face the tribunal of experience singly, but as a corporate body. This is certainly true in every formalization of logic I’ve ever seen; if you have a contradiction in a set of statements, you don’t know which statement to remove in order to remove the contradiction, and quite often you have a choice. Just take the simple case: P and not P. Well, we can make that set consistent by dropping not P, or we can make it consistent by dropping P. There’s no a priori distinction.

Quine said what we call the truths of logic are just the sentences that we are going to throw to the wolves last. If, one fine morning in the physics lab, the data from our physical measurements don't match the expected values, we are prepared to accept the proposition that we must have misread the dial on the instrument. We resolve to be more careful. If it happens again the next day and the next, however, then it becomes harder and harder to believe that we have misread the dial so consistently, and we begin to think the machine may be out of calibration. And if the evidence mounts up, we may be willing to consider the possibility that the machine is actually not measuring what we thought it was measuring, it’s broken completely, or it’s based on a wrong physical theory, and it never measured anything. Those are progressively harder and harder to accept because they involve the revision of more and more of our world view, but on Quine's account nothing is sacred, and no part of our belief system is immune to revision. Perhaps that’s the level of ontological commitment that we can safely make for good descriptive markup: It’s not sacred, but it's one of the parts of our worldview we are least likely to change in the normal course of events. It’s useful to say Call things what they are, but it’s also easy to tie ourselves in knots over the philosophical questions of what things really are and what things we believe really exist. You can build a lot of good practical systems by dialing that back a little bit and asking instead What do we think we’re going to think this is, over the course of the next five years?

Steven Pemberton showed us the other day that such conceptual change is not unique to documents. The same problem arises in other areas. He told us, remember, about an abstraction error in early C and Unix — the conflation of the notion of character with the fundamental unit of storage Pemberton. Why is that an abstraction error? It’s an abstraction error because our concept of character changes at a different rate from our implementation of character representation. Implementations, we know, are going to change a lot, and our concepts change much more slowly. It’s not that our concepts never change, but they change much more slowly. It’s an abstraction error to conflate things that have different rates of change; ultimately, they shear apart, giving us the technological or conceptual equivalent of Africa on the one side and South America on the other, and like continents our concepts can drift pretty far apart as time goes by.

What won’t change in the short term are the things we care about. And what some of us care about may not be what other people care about. That’s one of the reasons we want user-definable vocabularies. Sometimes those of us who drank the descriptive markup Kool-aid in the 1980’s may find it extremely counterintuitive to see that what some people care about is not the logical structure of the document but the design. Dianne Kennedy talked to us early in the conference about the PRISM Source Vocabulary Kennedy, which demonstrates conclusively that a sufficiently general idea like descriptive markup can be applied in ways that go well beyond what the original designers may have had in mind.

Some papers at this conference provide striking illustrations of just how far we’ve come as a community towards being able to build whole systems based on standards and descriptive markup, like Anne Brüggemann-Klein’s paper yesterday about leveraging XML technology for web applications Brüggemann-Klein, Hahn and Sayih.

Even after we relativize the idea in the way I’m suggesting, any emphasis on saying what things are and on data longevity will lead directly to a consequent emphasis on getting things right in the first place. We have to get this right because these documents are going to be around for a long time. We’re constructing documents and document systems for longevity, and if we succeed we are going to have to live with what we say for a long time. Quality assurance takes on even more importance in this context than it might otherwise do. This year's program has had a remarkable emphasis on quality assurance — not just in the pre-conference symposium on Monday about quality assurance, but also throughout the conference proper. On Monday morning, Dale Waldt gave a general overview of the issues, stressing among other things that you don’t do quality assurance last if you want to have good quality assurance; you need to push it upstream as far as possible Waldt.

Wei Zhao and Jeff Beck and their colleagues provided wonderful descriptions of QA practices in large aggregators like the Ontario Scholars Portal or PubMed Central. Such aggregators are constrained to accept input from a wide variety of sources with wide variations in quality, but they try to produce unified interfaces. So the aggregators need to try to level the quality by bringing up the lower bound Zhao, Chengan and Bai, Kelly and Beck.

Keith Rose and Tamara Stoker provided an inspiring view of what can be done inside a publishing organization when it sets its mind to improve the quality of its data Stoker and Rose. They were talking about the American Chemical Society, but a lot of what they said could be applied anywhere.

Also on Monday, Charlie Halpern-Hamu gave us a wonderful synoptic overview of a whole sampler of techniques for quality assurance in document projects Halpern-Hamu. At least one person was overheard leaving the room saying, Well, that changes my schedule for next week; that project is going to be redesigned because those techniques will work better for us than what we are doing now.

If you want to take quality seriously, you need quality assurance not just on your data but on other parts of your system. Eric van der Vlist spoke to us Monday morning about applying quality assurance not to documents, but using documents to apply quality assurance to the schemas that we use to validate the documents van der Vlist.

Sheila Morrissey and her colleagues at Portico showed how they extend their focus from the data that they are preserving to the systems they are using to preserve the data Morrissey et al., and Jorge Williams, in the conference itself, pointed our tools in yet another direction, to validate not deployment packages, but whole RESTful services Williams and Cramer.

Historically, one of the most important tools in quality assurance for descriptive markup has been the document grammar and validation against a schema. The great promise of syntactic validation, both in SGML DTDs and before that in things like the invention of BNF in the definition of Algol 60, is that it enables the automatic detection of certain classes of errors. It’s easy to forget that careful people never promised that automatic detection based on clean syntax definitions would detect all errors. The value proposition has always been: when a large class of errors can be detected automatically, it becomes possible to concentrate expensive, human eyeball resources on the class of errors that cannot be detected automatically. It’s easy to think Oh, gosh, if validation doesn’t catch all errors and you still have to have eyeballs, then surely automatic validation is pointless. I think experience shows that it’s better to find things automatically if possible, because it’s cheaper, but that no organization that cares about data quality can plan to do without human eyeballs entirely. And, of course, if you don’t actually plan to have any humans looking at your data, ever, why are you working with it in the first place?
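
As a toy illustration of that value proposition (not any particular schema language, just the general shape of a content model), consider how a grammar catches structural errors automatically while remaining blind to errors of fact:

    # Illustrative only: a made-up content model saying a chapter is
    # one title followed by one or more paragraphs.
    import re

    CONTENT_MODEL = {"chapter": re.compile(r"title( para)+")}

    def valid(element, children):
        return CONTENT_MODEL[element].fullmatch(" ".join(children)) is not None

    print(valid("chapter", ["title", "para", "para"]))  # True: structure is fine
    print(valid("chapter", ["para", "title"]))          # False: caught automatically
    # A chapter whose title is factually wrong is still "valid";
    # only human eyeballs will catch that.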

When thinking about automatic validation, it’s always tempting to go for more power, but like all error detection, validation involves a trade-off between power or completeness of checking and cost. More power in a validation language is not always an improvement, because the more powerful the validation language is, the harder it is going to be to reason about the class of documents accepted as valid. Turing completeness is not necessarily a recommendation in a validation language, but it’s always interesting to see just how far you can go in validation while keeping things tractable. Jakub Malý’s talk about applying OCL constraints to documents by means of translation into Schematron shows an interesting approach in this area Malý and Nečaský.
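
To make the trade-off a little more tangible, here is a sketch of the kind of constraint that lies beyond a plain content model, stated as a Schematron-style assertion over a document; the element and attribute names are invented, and this is not the approach of the Malý and Nečaský paper, only the general flavor of rule-based checking:

    # Illustrative only: a co-constraint that a grammar alone cannot express.
    import xml.etree.ElementTree as ET

    doc = ET.fromstring('<range min="5" max="3"/>')

    def check(root):
        errors = []
        for r in root.iter("range"):
            if int(r.get("min")) > int(r.get("max")):
                errors.append("range: @min must not exceed @max")
        return errors

    print(check(doc))  # ['range: @min must not exceed @max']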

Sometimes the challenge is the complexity of the validity function that you’re calculating, and sometimes the complexity lies in figuring out just what the validation function is and where we are expected to be drawing the line between okay input and not-okay input. And sometimes the challenge is figuring out which of those is the problem that we face: is it hard because we have a complex validation function or because we can’t figure out what the validation function is, and how do we tell when we have solved that problem? Those not working in healthcare informatics might not be able to relate to or follow all the details in what Kate Hamilton and Lauren Wood were telling us the other day Hamilton and Wood, but almost everyone can look at the complexity of the situation they described and suddenly feel better about the degree of complexity in the problems we face in our own work.

As soon as we decide we want to check things not just for structural correctness but for veracity or at least verisimilitude, we find ourselves skating onto the sometimes thin ice of semantics and ontologies. As Kurt Cagle described in his paper, we will need mechanisms for managing controlled vocabularies and developing semantics Cagle.

Predefinition of ontologies is not the only way to achieve better semantic control of our data. Sometimes we can work bottom up; Steve DeRose’s talk on text analytics on Monday suggested a lot of opportunities for plausibility checking using probabilistic, or stochastic, and non-symbolic methods DeRose. I’m always nervous about purely stochastic methods because I’m never sure I understand what it is they’re telling me, and I’m always afraid I’m going to make a fool of myself by assuming that they’ve passed the Turing test when actually I’m just talking to a modern version of Eliza. But text analytics may help us with plausibility checking.

If you’re in the realm of stochastic methods, then you’re in the realm of sampling and exploratory data analysis and the techniques that Micah Dubinko talked about, using XQuery to get an overview of unfamiliar data Dubinko. Someone hands you a USB stick and says, I need a summary. Now. Quick! What do you do? Well, hopefully you’ve internalized the techniques Micah Dubinko described, because they will help you in that situation.
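
A small sketch of that overview-building idea (not Micah Dubinko’s actual queries, and in Python rather than XQuery) might be nothing more than a tally of element and attribute names, so that you can see what is on that unfamiliar stick before you promise anyone a summary:

    # Illustrative only: profile an unfamiliar XML file by counting names.
    import sys
    from collections import Counter
    import xml.etree.ElementTree as ET

    def profile(path):
        elements, attributes = Counter(), Counter()
        for _, elem in ET.iterparse(path, events=("start",)):
            elements[elem.tag] += 1
            attributes.update(elem.attrib.keys())
        for name, count in elements.most_common(20):
            print(f"{count:8d}  <{name}>")
        for name, count in attributes.most_common(20):
            print(f"{count:8d}  @{name}")

    profile(sys.argv[1])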

Charlie Halpern-Hamu gave a very illuminating talk on constructing a sample to get essentially similar kinds of results, but in the form of a single test document: a small sample to develop against, and a good sample to develop against, because it exercises as much of your code as can conveniently be managed, far more than you would otherwise get from any single document in the corpus Halpern-Hamu.

The topic of testing reminds us of the ongoing, never-ending discussion between those who wish to prove software correctness by testing (which cannot be done because tests can never prove the absence of bugs; they can only prove the presence of bugs) and those who would like to prove things correct by reasoning. You know, I’m always more comfortable with code if it can be proven correct, but I’m also uncomfortably aware that the literature is littered with articles about people who took code that had been proven correct, translated it to running code, ran test cases against it, and watched it break. In particular, the usual way to bring a program like that to its knees is to hand it input that doesn’t obey the contract you made. Yes, of course, Dijkstra’s algorithm for calculating the greatest common divisor of two integers is going to fail if you hand it two inputs that are not integers or not integers within the acceptable range. But, you know, that’s what our users do: they give us stuff that’s out of range. That’s one of the reasons why I got interested in validation to begin with.

Sometimes, as Michel Biezunski explained to us yesterday, having valid input and valid output is not enough because there is plenty of valid data that the software doesn’t actually support. And if you want to publish e-books, take a deep breath and be prepared for the fact that you’re going to have to test e-book readers one by one Biezunski. That’s a rude awakening for some of us, but it confronts us with the real world.

Sometimes the confrontation with the real world is not quite as depressing as it was during parts of Michel’s talk. I was very heartened by Liam Quin’s discovery that the percentage of ill-formed XML is not nearly as high as some people have suggested Quin. That made me feel better, but it’s still a reality check. 13%? Gee, I would have hoped for better than that, even in RSS.

And Betty Harvey’s talk — talk about rubber meeting the road! I hope that the specs we’re involved in writing now are as functional and implementable twenty, twenty-five, and thirty years from now as the specs she showed us being implemented today Harvey. Yes, they look kinda dated and quaint, but, by golly, they do still work. By golly, maybe all the work done in the 1980s for future-proofing data did effectively future-proof some data. That’s a really heartening thought; I’m very grateful to Betty Harvey for bringing us that message.

Whenever you have longevity — and we should think about this if we’re aiming at longevity — you have maintenance issues. I’ve never done the research myself, but I have been told many times, and it seems plausible — it seems consistent with everything I’ve ever seen in real-life organizations — that the main cost of maintenance is not fixing errors, even though fixing errors after a program is deployed is very expensive. The main cost of maintenance is adjusting the program to run in new environments.

Now, if my thumbnail sketch of history holds any water, one of the points of SGML and XML was to help information longevity by recording the properties of the information that don’t change or don’t change as fast as our processing needs. So we might expect that SGML and XML, by focusing attention on what doesn’t change, may themselves have a long life. But what we really care about is not, in the last analysis, a long life for our technology, but a long life for our information, and that may involve changing the technology as we go along. Several people have mentioned in other contexts the example of our grandfather’s ax or Theseus’ ship; is it still the same ship after we have replaced each plank? Is it the same ax if it's had three new handles and seven new heads? In a passage popular among philosophers, W.V.O. Quine describes with approval a suggestion of the philosopher Otto Neurath. Quine says:

Neurath has likened science to a boat which, if we are to rebuild it, we must rebuild plank by plank while staying afloat in it.

Quine

I haven’t thought of a better analogy for the situation of our technology adopters, or a better explanation for their aversion to change and risk. Many people have observed that users of technology (as opposed to technologists and evangelists, like many of us in this room) are famously reluctant to adopt new technology even if it is clearly better. Why? Partly because change is painful, and partly because change involves risk. They’re at sea in a boat, and we’re suggesting that they take out a plank and replace it with a better plank. That’s going to be risky. But those of us who are afloat in SGML or XML are in that situation; if we want the ship to stay afloat, we are going to have to replace some planks, or at least that’s a possibility we are going to have to face.

One way for a technology to grow in one dimension, of course, is to become smaller in another dimension. So, we must occasionally ask ourselves which dimensions we care about, which aspects of our technology we want to preserve, and which we’re willing to jettison. The logic of John Cowan’s MicroXML is that by making the syntax and the spec of XML smaller, we can appeal to a larger audience; that’s a trade-off we have to consider Cowan.

The more traditional way to grow a technology is to add functionality, and there the challenge is to add functionality in a way that feels like an organic development, feels like growth, and doesn’t feel to the technology adopters like a risky change. The organic development of XQuery in Mary Holstege’s talk on type introspection Holstege exemplifies the kind of new functionality that will feel intuitively right to most users of the technology. I was similarly heartened to hear Abel Braaksma’s discussion of higher-order functions and other functional technologies in XSLT 3.0 Braaksma. Hervé Ruellan’s discussion of XML entropy seems to suggest ways to develop and evaluate compression mechanisms for marked-up documents Ruellan.

Sometimes the way to find new ways to do things in a language is not just to add new things, but to push what you’ve got a little harder than some of us would otherwise have pushed it. That’s the lesson I took from Wendell Piez’s talk on using XSLT to parse not XML, but LMNL data Piez. If you are pushing your infrastructure hard enough, the accumulators of XSLT 3.0 will make some things more convenient, but you don’t always have to wait for the language developers; sometimes you can just do it yourself.

The most challenging growth path we are facing is the potential of modifying the XDM. And I think when we look back on this year’s Balisage, many of us will remember the sequence of talks by Eric van der Vlist, Jonathan Robie, and Hans-Jürgen Rennau on different ways of adapting XDM and, with it, the specs that rely on XDM to the advent of JSON [van der Vlist, Robie, Rennau]. One reason to do that is to say, Well, we’re in the situation of anybody doing maintenance. The context has changed. There are people out there who want to use JSON; we need to be able to interoperate with them.

But it’s not always a question of us versus them. Sometimes within the same institution, there will be those who want to work with XML or C++ or C-sharp, and others who want to work with different notations. So Ari Nordström’s example of finding ways to do things in the notation of your choice, as a way of helping keep the peace within an organization, is a message we can all take to heart Nordström. Remember: good fences make good neighbors.

The problem is not always even other people in the same institution. Sometimes it’s we ourselves. We ourselves will want sometimes to use one notation and sometimes to use another. There are a million opportunities and a million choices to make; we will need guidance in the wilderness of standards. Even if we restrict ourselves to standards and recommendations, we’ll need guidance of the kind that Maik Stührenberg and Oliver Schonefeld talked about the other day with their web-based information system about standards Stührenberg, Schonefeld and Witt.

Sometimes the right form for some of our information will use the RDF model. And we will have hybrid systems like the one described by Anna Jordanous, Alan Stanley, and Charlotte Tupman for their ancient documents Jordanous, Stanley and Tupman.

Sometimes the structure of information we want to manage fits neatly into a tree, and those of us with long memories are still so impressed with how much more powerful and interesting trees are than the one-damn-thing-after-another model of documents that preceded SGML that we are astonished that not everybody is happy with trees, and we think that everybody really ought to be content with trees. But, thank God, some of us are professional malcontents. So we should be grateful to those who continue pushing on the issue of overlap and finding the right way to represent our information, even when it has overlapping structures Marcoux, Huitfeldt and Sperberg-McQueen.

Some of the most challenging areas for us when we are looking for the right way to represent our information are those that involve information of multiple kinds for different audiences. Making information for different audiences co-exist is very difficult, often because the natural representation varies even if we’re fully aware that when we say natural representation, we mean the one we’re most familiar with for that kind of information.

That makes literate programming one of the most challenging areas that we face, but it is also one of the most important because, going back to that QA symposium on Monday, if we really want to preserve our information, it’s not just the documents we have to be able to understand and preserve, but the systems that they are built to interact with. And so literate programming as a method of making our programs easier to understand and keeping the documentation in sync with the executable code is crucial. So what Sam Wilmott said, yes Wilmott. What David Lee and Norm Walsh said, yes Lee, Walsh. What Matthew McCormick said, absolutely Flood, McCormick and Palmer. And Mario Blažević’s efforts to bring back SHORTREF and make it work in an environment very different from the SGML environment for which it was originally developed Blažević? If that makes it easier to have natural notations for information, then it will help information longevity.

Syntax is important. I’m always a little nervous when people say Syntax isn’t important; it’s only semantics that’s important because I’m acutely aware that if we don’t have agreement on syntax, it’s extremely unlikely that we will understand the semantics that the other person is trying to tell us about. So, I think syntax is important. But it’s not important for itself; it’s important because it enables the recording and the exchange of semantic information.

In the same way, technology is important, but it’s not important for itself; it’s important because it can help us (or, in some cases, hinder us) in the pursuit of our goals. For the technologies that we in this room care about, that often means it’s important because it helps us manage our information: the information that courses through the veins of our institutions and societies, the information that our organizations care about or that we care about as individuals. The right technology can help us ensure that what changes as time goes by is our technology and not the information we manage with the help of that technology.

Conferences like this one are important, but they’re not important only in and for themselves, and not only for the talks that constitute the official conference program. They’re important because they bring us together as people, and they give us the chance to engage with each other, both in the official program and in the hallways and afterwards. When things work right, conferences help us find solutions to our technical problems.

When everything goes well, conferences — or rather, the people we engage with when we attend conferences — can help us remember why we care about those technical problems: what we care about and why we care about it. They can help us achieve clarity in thinking about how to use our technical skills to serve particular ends, to serve the advancement or preservation of a particular technology, or the creation and management and preservation of the information that our institutions or our societies or our cultures care about, to serve the humanity that has created those institutions, those societies, that culture.

Every year I learn a lot by listening to the talks at Balisage, and every year I learn a lot — or sometimes even more — by engaging with the people who attend and make Balisage what it is. Thank you for being those people. Thank you for coming to Balisage 2012.

References

[Biezunski] Biezunski, Michel. Moving sands: Adventures in XML e-book-land. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Biezunski01.

[Blažević] Blažević, Mario. Extending XML with SHORTREFs specified in RELAX NG. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Blazevic01.

[Braaksma] Braaksma, Abel. Simplifying XSLT stylesheet development using higher order functions. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Braaksma01.

[Brüggemann-Klein, Hahn and Sayih] Brüggemann-Klein, Anne, Jose Tomas Robles Hahn and Marouane Sayih. Leveraging XML Technology for Web Applications. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Bruggemann-Klein01.

[Cagle] Cagle, Kurt. The Ontologist: Controlled Vocabularies and Semantic Wikis. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Cagle01.

[Clark] Clark, Ashley. Meta-stylesheets: Exploring the Provenance of XSL Transformations. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Clark01.

[Cowan] Cowan, John. MicroXML: Who, What, Where, When, Why. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Cowan01.

[DeRose] DeRose, Steven J. The structure of content. Presented at International Symposium on Quality Assurance and Quality Control in XML, Montréal, Canada, August 6, 2012. In Proceedings of the International Symposium on Quality Assurance and Quality Control in XML. Balisage Series on Markup Technologies, vol. 9 (2012). doi:https://doi.org/10.4242/BalisageVol9.DeRose01.

[Dubinko] Dubinko, Micah. Exploring the Unknown: Understanding and navigating large XML datasets. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Dubinko01.

[Flood, McCormick and Palmer] Flood, Mark D., Matthew McCormick and Nathan Palmer. Encoding Transparency: Literate Programming and Test Generation for Scientific Function Libraries. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Flood01.

[Halpern-Hamu] Halpern-Hamu, Charlie. Case study: Quality assurance and quality control techniques in an XML data conversion project. Presented at International Symposium on Quality Assurance and Quality Control in XML, Montréal, Canada, August 6, 2012. In Proceedings of the International Symposium on Quality Assurance and Quality Control in XML. Balisage Series on Markup Technologies, vol. 9 (2012). doi:https://doi.org/10.4242/BalisageVol9.Halpern-Hamu01.

[Halpern-Hamu] Halpern-Hamu, Charlie. Design considerations in the implementation of a boil-this-corpus-down-to-a-sample-document tool. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Halpern-Hamu02.

[Hamilton and Wood] Hamilton, Kate, and Lauren Wood. Schematron in the Context of the Clinical Document Architecture (CDA). Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Wood01.

[Harvey] Harvey, Betty. Developing Low-Cost Functional Class 3 IETM. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Harvey01.

[Holstege] Holstege, Mary. Type Introspection in XQuery. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Holstege01.

[Huitfeldt, Vitali and Peroni] Huitfeldt, Claus, Fabio Vitali and Silvio Peroni. Documents as Timed Abstract Objects. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Huitfeldt01.

[Jordanous, Stanley and Tupman] Jordanous, Anna, Alan Stanley and Charlotte Tupman. Contemporary transformation of ancient documents for recording and retrieving maximum information: when one form of markup is not enough. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Jordanous01.

[Kelly and Beck] Kelly, Christopher, and Jeff Beck. Quality Control of PMC Content: A Case Study. Presented at International Symposium on Quality Assurance and Quality Control in XML, Montréal, Canada, August 6, 2012. In Proceedings of the International Symposium on Quality Assurance and Quality Control in XML. Balisage Series on Markup Technologies, vol. 9 (2012). doi:https://doi.org/10.4242/BalisageVol9.Beck01.

[Kennedy] Kennedy, Dianne. Finally — an XML Markup Solution for Design-Based Publishers: Introducing the PRISM Source Vocabulary. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Kennedy01.

[Lee] Lee, David. CodeUp: Marking up Programming Languages and the winding road to an XML Syntax. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Lee01.

[Malý and Nečaský] Malý, Jakub, and Martin Nečaský. Utilizing new capabilities of XML languages to verify integrity constraints. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Maly01.

[Marcoux, Huitfeldt and Sperberg-McQueen] Marcoux, Yves, Claus Huitfeldt and C. M. Sperberg-McQueen. The MLCD Overlap Corpus (MOC): Project report. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Huitfeldt02.

[Morrissey et al.] Morrissey, Sheila M., John Meyer, Sushil Bhattarai, Gautham Kalwala, Sachin Kurdikar, Jie Ling, Matt Stoeffler and Umadevi Thanneeru. Beyond Well-Formed and Valid: QA for XML Configuration Files. Presented at International Symposium on Quality Assurance and Quality Control in XML, Montréal, Canada, August 6, 2012. In Proceedings of the International Symposium on Quality Assurance and Quality Control in XML. Balisage Series on Markup Technologies, vol. 9 (2012). doi:https://doi.org/10.4242/BalisageVol9.Morrissey01.

[Nordström] Nordström, Ari. Using XML to Implement XML: Or, Since XProc Is XML, Shouldn’t Everything Else Be, Too? Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Nordstrom01.

[Pemberton] Pemberton, Steven. Serialisation, abstraction, and XML applications. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Pemberton01.

[Piez] Piez, Wendell. Luminescent: parsing LMNL by XSLT upconversion. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Piez01.

[Quin] Quin, Liam R. E. Characterizing ill-formed XML on the web: An analysis of the Amsterdam Corpus by document type. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Quin01.

[Quine] Quine, Willard Van Orman. Two dogmas of empiricism. The Philosophical Review 60 (1951): 20-43. Reprinted in W.V.O. Quine. From a Logical Point of View. Harvard University Press, 1953; second, revised, edition 1961.

[Quine] Quine, Willard Van Orman. Word and Object. Cambridge, MA: MIT Press, 1960 [pages 3-5].

[Raymond, Tompa and Wood] Raymond, Darrell, Frank Tompa and Derick Wood. From Data Representation to Data Model: Meta-Semantic Issues in the Evolution of SGML. Computer Standards & Interfaces 18 (1996): 25-36. doi:https://doi.org/10.1016/0920-5489(96)00033-5

[Rennau] Rennau, Hans-Jürgen. From XML to UDL: a unified document language, supporting multiple markup languages. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Rennau01.

[Robie] Robie, Jonathan. XQuery, XSLT and JSON: Adapting the XML stack for a world of XML, HTML, JSON and JavaScript. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Robie01.

[Ruellan] Ruellan, Hervé. XML Entropy Study. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Ruellan01.

[Stoker and Rose] Stoker, Tamara, and Keith Rose. ACS Publications — Ensuring XML Quality. Presented at International Symposium on Quality Assurance and Quality Control in XML, Montréal, Canada, August 6, 2012. In Proceedings of the International Symposium on Quality Assurance and Quality Control in XML. Balisage Series on Markup Technologies, vol. 9 (2012). doi:https://doi.org/10.4242/BalisageVol9.Rose01.

[Stührenberg, Schonefeld and Witt] Stührenberg, Maik, Oliver Schonefeld and Andreas Witt. A standards-related web-based information system. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Stuhrenberg01.

[Usdin] Usdin, B. Tommie. Things change, or, the ‘real meaning’ of technical terms. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Usdin01.

[Waldt] Waldt, Dale. Quality assurance in the XML world: Beyond validation. Presented at International Symposium on Quality Assurance and Quality Control in XML, Montréal, Canada, August 6, 2012. In Proceedings of the International Symposium on Quality Assurance and Quality Control in XML. Balisage Series on Markup Technologies, vol. 9 (2012). doi:https://doi.org/10.4242/BalisageVol9.Waldt01.

[Walsh] Walsh, Norman. On XML Languages…. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Walsh01.

[Williams and Cramer] Williams, Jorge Luis, and David Cramer. Using XProc, XSLT 2.0, and XSD 1.1 to validate RESTful services. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Williams01.

[Wilmott] Wilmott, Sam. Literate Programming: A Case Study and Observations. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Wilmott01.

[van der Vlist] van der Vlist, Eric. Fleshing the XDM chimera. Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). doi:https://doi.org/10.4242/BalisageVol8.Vlist01.

[van der Vlist] van der Vlist, Eric. XML instances to validate XML schemas. Presented at International Symposium on Quality Assurance and Quality Control in XML, Montréal, Canada, August 6, 2012. In Proceedings of the International Symposium on Quality Assurance and Quality Control in XML. Balisage Series on Markup Technologies, vol. 9 (2012). doi:https://doi.org/10.4242/BalisageVol9.Vlist02.

[Zhao, Chengan and Bai] Zhao, Wei, Jayanthy Chengan and Agnes Bai. Quality Control Practice for Scholars Portal, an XML-based E-journals Repository. Presented at International Symposium on Quality Assurance and Quality Control in XML, Montréal, Canada, August 6, 2012. In Proceedings of the International Symposium on Quality Assurance and Quality Control in XML. Balisage Series on Markup Technologies, vol. 9 (2012). doi:https://doi.org/10.4242/BalisageVol9.Zhao01.



[1] Or, at least, some of our formal systems don't, logic and philosophy prominent among them. Oddly enough, mathematics has no particular trouble with this, or at least hasn’t since Newton showed how to describe change mathematically; before Newton, it was apparently just as bad for straight math as for logic today.

[2] I should point out that the discussion of change and stability in technical publishing offered here is a kind of Just-So story about the origins of descriptive markup, intended to illustrate the relevance of change and stability to descriptive markup, and vice versa. It is not intended and should not be taken as a full historical account of all the issues on the minds of those who developed the idea of descriptive markup or the SGML specification.

C. M. Sperberg-McQueen

Black Mesa Technologies LLC

C. M. Sperberg-McQueen is the founder of Black Mesa Technologies LLC, a consultancy specializing in the use of descriptive markup to help memory institutions preserve cultural heritage information for the long haul. He has served as co-editor of the XML 1.0 specification, the Guidelines of the Text Encoding Initiative, and the XML Schema Definition Language (XSD) 1.1 specification. He holds a doctorate in comparative literature.