Introduction

It is probably not a good idea to make historical figures into personal heroes.[1]

Professional historians frown on the practice. But the biographies that children read about figures of the past tend to make them into heroes and encourage the readers to emulate them; that may be part of why professional historians sometimes turn up their noses at any signs of unconditional approval (or disapproval) of any historical figure. The professional historians have a point: simple approval or disapproval is unrealistic. Human beings are mixed; they have flaws, they have virtues. And it’s probably a mistake to think of them in purely positive (or purely negative) terms.

And yet, it’s difficult for anybody — at least it is for me, and I suspect also for many people who like me enjoy reading popular books about the history of mathematics — not to think of some past figures as heroic. I’ve been thinking a lot lately about Gottlob Frege, for example: a mathematician with what we might call philosophical inclinations — I think he is probably read more in philosophy departments now than in mathematics departments — who worked throughout his career to clarify the foundations of mathematics and other important concepts, many with implications far beyond mathematics. It is to Frege that we owe, for example, the distinction between sense and reference which turns out to be crucial for addressing questions like the one Allen Renear mentioned the other day: Giorgione was so called because of his size [Renear 2019]. I am not entirely certain that anyone has solved the Giorgione problem to Allen’s satisfaction, but I suspect that if anyone were to do so they would have to rely at some level on that clarification that we owe to Frege’s work of over a century ago [Frege 1892].

Frege is also dear to my heart because he created a conceptual script, or notation, that was explicitly intended to be a partial realization of the philosophical language that Leibniz and many other people had imagined, a language in which what is now, for most of us, extremely difficult and error-prone reasoning could be reduced to rules, so that if you had a dispute with someone, you could just calculate, essentially do the math [Frege 1879].

And of course, there is the pathos of his correspondence with Bertrand Russell. You may know the story: Frege is completing a huge, two-volume work restating the essentials of mathematics using his conceptual notation [Frege 1903]; his aim is to clarify the logical underpinnings of every theorem and make sure that all of the inferences are clean and that there are no silent assumptions being smuggled in, either unawares or intentionally. Just as he is finishing, he receives a letter from Bertrand Russell which asks a simple question about a set whose members are all of the sets that do not contain themselves. The problem is that Frege’s axioms require that that set exist, and the definition of the set requires that it not exist [audience chuckling]. So volume two of Frege’s Basic Laws of Arithmetic has an afterword that explains that the foundations of the work have collapsed [Frege 1903].[2] [audience laughter] But he seems to have borne it with a certain amount of personal dignity.[3]

I am similarly prone to hero worship of the mathematician David Hilbert.[4] Like Frege, he was dedicated to clarifying and securing the logical foundations of mathematics. His book on geometry was regarded by his contemporaries as a sort of revelation because it made clearer than ever before how each theorem depended on other theorems and on the axioms, and what effect it would have if the axioms were changed in various ways; this provided a more integrated view of Euclidean and non-Euclidean geometry and how they are related [Hilbert 1899].

As an admirer of both Frege and Hilbert, I experienced a certain degree of personal discomfort when I realized that they disagreed on problems that they regarded as fundamental, and I have always had trouble understanding why they disagreed, because they seemed to me to be on the same side of the issue. They were both interested in cleaning up the formality of mathematics and logic. But Frege wrote Hilbert a letter, Hilbert responded in a way that didn’t satisfy Frege, Frege published an open letter to Hilbert, and the result was a rather acrid public dispute. It’s like having two of your friends suddenly start arguing, and you’re disturbed because you want them to like each other.

But in this case, it was even more disturbing that I had trouble understanding what they were disagreeing about, and even more trouble understanding why. I have come now to suspect that it had something to do with what students of logic distinguish as proof theory and model theory. Over-simplifying greatly (which I have to do because I cannot claim any deep understanding of either proof theory or model theory), proof theory is what you get when you focus on the mechanical process of proof: how to construct proofs, how to verify them mechanically — when you treat proof as a mechanical process. I associate this with Hilbert partly because Hilbert’s logical program amounted to saying that the only things that matter are the axioms and the rules of inference. The meanings of the words in ordinary language don’t matter. When we read the axioms and theorems of geometry that relate to points, lines, and planes, Hilbert is supposed to have said, we must be willing to understand them as if they referred instead to beer mugs, benches, and tables. As long as the beer mugs, benches, and tables obey the axioms, then all the theorems follow. The only meaning of words like point, line, or plane is given by the axioms and the properties and relations entailed by the axioms.

A later mathematician characterized Hilbert’s view as essentially reducing logic to a meaningless game played with empty symbols.[5] There seems to me to be a serious purpose to that emptying out of meaning: if we can manage to make mathematical proof a meaningless game, then the only thing that will matter will be whether a proof follows the rules of the game. If the symbols have been drained of all meaning beyond what is given in the axioms, then there is much less chance of unstated assumptions sneaking into the proof. The history of attempts at formalizing logic is full of cases where unstated assumptions did manage to sneak in. Euclid’s theorems, for example, don’t actually all follow from his axioms; Frege identified some unstated assumptions that must be made in order for the theorems to follow from the axioms.

But Frege was interested, as I mentioned, in the clarification of concepts. This entails careful consideration of the meaning of terms. It’s not hard to imagine that draining terms of all meaning, in the style of Hilbert, would not appeal much to Frege. Frege wanted logic to be clean and formal, but he was interested in reasoning about real things, not about empty symbols. And he appears to have been troubled by the suggestion that the axioms of geometry might just as well be thought of as applying to beer mugs, benches, and tables as to points, lines, and planes.

Now, like everything else, I see this tension between proof theory and model theory reflected in the world of markup. We all write specifications of various kinds: we specify vocabularies, we specify languages, we specify software tools. And specifications typically have conformance clauses.[6] When we use specifications in real systems we often need to talk about the question Is X a conformant implementation (or application, or whatever) of Y? We would like that question to have a crisp answer. And we would like it to be consistent among people; we would not like the answer to the question Is this a conforming application of TEI? to depend on the mental state of the person making the judgment.

One way to do that is to ensure that all conformance rules are empirically checkable. That way, assuming that we can agree on what the empirical tests show, we can agree on whether X is conforming to Y or not. So, for example, in the XSD specification version 1.1, there are explicit regular expressions and explicit context-free grammars — go figure — which define the lexical spaces of all of the simple types. This replaces some slightly informal, occasionally handwaving prose in version 1.0. Nobody really complained about the prose of 1.0, but version 1.1 is much cleaner because it is much easier to check mechanically whether a given string is or is not potentially a literal value in a given simple type. So the rules of 1.1 are more easily testable.
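To make that kind of mechanical testability concrete, here is a minimal sketch in Python. The patterns are my own rough paraphrases of the xs:integer and xs:boolean lexical spaces, not text quoted from the spec, and a real validator would read such patterns and grammars from the schema for the built-in types rather than hard-coding them.

    import re

    # Rough paraphrases of two lexical spaces; the spec's own patterns and
    # grammars are the authoritative definitions.
    LEXICAL_PATTERNS = {
        "integer": re.compile(r"[+-]?[0-9]+"),
        "boolean": re.compile(r"true|false|1|0"),
    }

    def in_lexical_space(type_name, literal):
        # A mechanically testable check: does the (finite) string match
        # the declared pattern for the simple type?
        return LEXICAL_PATTERNS[type_name].fullmatch(literal) is not None

    print(in_lexical_space("integer", "-42"))   # True
    print(in_lexical_space("integer", "4.2"))   # False

Two implementations that disagree about such a check can settle the matter by pointing at the pattern and the string, not at anyone's mental state.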

Some people say that all conformance rules should be testable assertions. I can see their point. And yet, I am not wholly convinced. XSD also says that for each simple type, the lexical space is the set of finite strings that match the specified grammar or pattern. There are public comments on the spec saying, Really, you shouldn’t say that. You shouldn’t say that they are finite strings because that is not a testable assertion. You can’t write a test to distinguish a processor that successfully excludes all infinite strings! [audience laughter] And therefore it is not a testable assertion, and it shouldn’t be a normative statement. I had to say at least once to the Working Group as a whole and more than once to some individuals in the Working Group, We do need that statement, even though I agree I don’t know how to write that test, because if the strings in question are not guaranteed finite, the regular expressions do not have a defined meaning. If we want the regular expressions to be a testable conformance statement for the lexical space of a simple type, we have to specify that the strings are finite, because if the string is infinite, the regular expression is untestable.

A similar problem comes up in vocabulary conformance. For any large vocabulary designed for wide use and local adaptation — like all the ones discussed on Monday — we often want, and behave as if we had, rules that allow reasonable customizations [Kimber 2019, Lapeyre 2019, Viglianti 2019, Vitali and Palmirani 2019, Walsh 2019]. We often have a sort of gut feeling of what constitutes a reasonable customization. It’s clear, for example, that HTML is modularized, and it’s intended to be modularized in a way that makes it easy for you to adapt it and use parts of HTML in other vocabularies. And it’s clear that the drafters of the HTML modularization spec thought that some adaptations should count as reasonable and conforming, and that others should count as unreasonable and non-conforming. But defining the boundary between what is a reasonable adaptation and what is an unreasonable one is very difficult. The closest anybody has come is vocabularies like DITA, which say the only thing you can do is restriction. This works because syntactic restriction, at least, is something we can mostly test. If we want to allow any syntactic extension at all, however, experience shows that it’s very difficult to distinguish reasonable changes from unreasonable changes, and the upshot is that some people end up saying, either out of conviction or because they can’t think of any way to get out of it, Well, the set of legal syntactic changes is anything you can do by redefining any of our parameter entities. And if you have a vocabulary like TEI or JATS or HTML, that essentially means you can gut the entire place, rebuild it from scratch, and call it an adaptation.

The meta-schema approach that Wendell Piez discussed this morning illustrates this problem as well, by the rules it imposes [Piez 2019]. Not only do the XML representations of particular data have to conform to an XSD schema — and the JSON files conform to a JSON schema — but the designers of the system have imposed upon themselves the rule that the two schemas have to say the same things in order to avoid having any rules for JSON files that cannot be checked mechanically by a JSON schema checker, and analogously for the XML side.

On the other hand, JATS people and TEI people and DITA people all say Whatever you do, don’t mess with the semantics, and in those communities we will find many people who want preservation of semantics to be part of the conformance rule. Because messing with the semantics is just beyond the pale; we don’t want it to happen. We want to discourage it, and to the extent that our definition of conformance has any effect in the real world, to the extent that we’re programmed to think of conformance as a positive thing, we don’t want to apply that badge of approval to something of which we disapprove. But we have a problem: nobody knows how to define semantics or the preservation of semantics in a mechanically-testable way.

This may be related to how well we understand what we’re talking about. If we understand something really, really well, we can often reduce it to a set of mechanically checkable rules. Indeed, we often take that as a measure of understanding. If you can’t reduce something to mechanically checkable rules, we sometimes think and say, then you don’t really understand it. I believe that this intuition may be part of the rationale for the rule that says normative statements should always be testable assertions. On this view, untestable assertions indicate that you haven’t really thought it through and don’t really understand your conformance requirements.

I should make explicit here that I’m associating the notion of mechanically performable tasks (like mechanically checkable proofs) with mechanically checkable validity and mechanically checkable conformance and proof theory and Hilbert all on one side, and I am associating everything else — in particular the idea that there may be things that we can’t say explicitly, or at least not in a mechanically-checkable way — with Frege and with model theory. Any listeners or readers who actually understand proof theory and model theory are asked to be kind; if necessary, please regard it as a rather fanciful poetic analogy. Perhaps it will be clearer just to talk in terms of things we can make explicit and do within a particular formal system, and things that are harder to handle that way — things that require that we go outside the formal system.

Working within the system

Now, there are occasional exceptions, but by and large in any system of logical proof what you can manage to prove depends in practice on how well you can use the system. If you’re good at it, you can manage proofs that other people cannot manage. The same thing is true for programming languages: all programming languages are essentially equally expressive (they’re all Turing complete), but what we can manage to write in any given programming language depends on how well we know the language and how good we are as programmers. And similarly, when we work in an XML environment or vocabulary, a lot depends on how well we use the system. So it makes sense that at a gathering like this of people who are concerned with constructing partially formalized, partially checkable systems, we spend a lot of time figuring out better ways to use what we have.

Using the system well

Ari Nordström started us off with a talk on eating our own dog food, in which he described the discovery that if you rely on XML technologies, you can — even if you are not a self-identified programmer — accomplish in declarative ways things that until recently you would have assumed would require a priest, or a programmer [Nordström 2019]. Liam Quin’s talk the other day on how to use the transform function and other new constructs in XPath 3.0 / XSLT 3.0 fits into the same general pattern; his paper is important to us because it helps us use our existing systems better [Quin 2019]. Sasha (Alexander) Schwarzman gave a paper the other day on using BITS, the Book Tag Set, for retro-conversion of conference proceedings; I was a little surprised by the approach because I would not have thought that BITS was the obvious way to go there, but he showed that if you look carefully, it is; and it works very well for his project [Schwarzman 2019].

John Boyer really opened my eyes the other day. I had seen the delay attribute in the XForms specification, but I had never realized that one could use it to do multi-tasking [Boyer 2019]. When I first saw the paper title, I assumed he was going to be talking about some new feature in XPath 2.0; only when I read it in full did I realize that the functionality in question has been available for some years. It was there all along, but I had never understood multi-tasking well enough to see the applicability.

Another good example of improving our command of our tools is given by Josh Lubell’s paper, in which he uses SCAP (the Security Content Automation Protocol) as a sort of case study of the way XML technologies can be used to make useful tools for very important and very specialized environments and applications [Lubell 2019]. Eliot Kimber gave us a brilliant discussion of how to use CSS and existing formatters [Kimber 2019] to solve problems that I remember thinking of (before the other day) as being essentially insoluble, except with the aid of multi-million-dollar formatting systems from forty years ago [audience chuckling]. This morning, David Birnbaum, Hugh Cayless, Joe Wicentowski, and their co-authors provided a master class in the exploration of how best to use the existing technology — the existing systems — to do things that need doing [Birnbaum et al. 2019].

Patrolling the boundaries

All systems have boundaries, and in the case of vocabularies those boundaries are often defined by a schema. So, validation is important, and I think Mary Holstege’s talk on the validation of JSON [Holstege 2019] is important and suggestive, and not only because of her elegant application of Brzozowski derivatives. It’s important even if it is just JSON, because as has been pointed out here, JSON is, like relational database systems, potentially a descriptive markup technology. We should be willing to use it and describe it as such. And even when JSON is not used in a particularly declarative or descriptive way, the fact that it has a simple, explicit, easy-to-parse syntax — well, one of several simple, easy-to-parse syntaxes, to be exact [audience laughter] — that is worth something.
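For readers who have not met Brzozowski derivatives before, the core idea is small enough to sketch. The Python fragment below is my own toy illustration over a tiny regular-expression algebra, nothing like the machinery of Holstege's actual JSON Schema work: the derivative of an expression with respect to a character is the expression that whatever comes next must match.

    from dataclasses import dataclass

    class Re: pass                        # base class for regular expressions

    @dataclass(frozen=True)
    class Empty(Re): pass                 # matches nothing at all

    @dataclass(frozen=True)
    class Eps(Re): pass                   # matches only the empty string

    @dataclass(frozen=True)
    class Char(Re):
        c: str                            # matches exactly one character

    @dataclass(frozen=True)
    class Seq(Re):
        a: Re
        b: Re

    @dataclass(frozen=True)
    class Alt(Re):
        a: Re
        b: Re

    @dataclass(frozen=True)
    class Star(Re):
        a: Re

    def nullable(r):
        # Does r match the empty string?
        if isinstance(r, (Eps, Star)): return True
        if isinstance(r, Seq): return nullable(r.a) and nullable(r.b)
        if isinstance(r, Alt): return nullable(r.a) or nullable(r.b)
        return False

    def deriv(r, c):
        # Brzozowski derivative of r with respect to c: what the rest of
        # the input must match once c has been consumed.
        if isinstance(r, (Empty, Eps)): return Empty()
        if isinstance(r, Char): return Eps() if r.c == c else Empty()
        if isinstance(r, Alt): return Alt(deriv(r.a, c), deriv(r.b, c))
        if isinstance(r, Star): return Seq(deriv(r.a, c), r)
        # Seq: derive the head; if the head can be empty, also derive the tail.
        d = Seq(deriv(r.a, c), r.b)
        return Alt(d, deriv(r.b, c)) if nullable(r.a) else d

    def matches(r, s):
        for c in s:
            r = deriv(r, c)
        return nullable(r)

    # (a|b)*abb
    r = Seq(Star(Alt(Char("a"), Char("b"))), Seq(Char("a"), Seq(Char("b"), Char("b"))))
    print(matches(r, "ababb"))   # True
    print(matches(r, "abba"))    # False

The attraction of the technique is that validation becomes a simple left-to-right walk over the input, one derivative per token, with no backtracking.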

If you hesitate to believe that easy parseability matters, think back. Many people in this community were taken aback a bit when the XML format used by Microsoft Office first became visible, fifteen or twenty years ago. I remember people saying, Oh, my gosh, we hadn’t realized you could use XML that way! Many people were not immediately convinced that it was a particularly good way to use XML, but over time it has become clear that even if many of us would never design a vocabulary that way[7], the world is a much, much better place because that material is accessible in XML. I can work seriously with Word documents in ways that I could not do before because I wasn’t willing to put in the time to become a Word Basic wizard. For that basic parseability of Microsoft Office documents, I am eternally grateful to Jean Paoli; we owe him a deep debt for the years he spent going from product team to product team, persuading them that XML was worth supporting.

And of course, Jean’s talk here the other day suggests an importantly different approach to validation, an approach that I think we all have to think about hard [Paoli 2019]. In any system that uses deep learning and statistical learning and AI approaches of any kind, we are going to have to be prepared for noise, because machine learning tools tend to be noise-tolerant systems, and to some extent also noise-generating systems. Our attitude toward validity and toward anomalies will have to adjust accordingly. That will be a kind of heavy lift for some of us — for me, at least — but I think it’s a mental shift worth working at. Our attitude towards anomalies — well, Bethan Tovey spoke eloquently about that the other day [Tovey 2019]. It’s important that we not dismiss statistical outliers either as less valuable or as irrelevant, because often the outliers are precisely the interesting things, as David Birnbaum pointed out, gosh, 22 years ago, I think [Birnbaum 1997, Birnbaum/Mundie 1999]. [Addressing DB] Some of us are starting to catch up to where you were 22 years ago, but I fear that we may just be recreating Achilles and the tortoise in slow motion.

Metalinguistic reflection

Of course, our attitude toward validity is likely to vary, not just based on whether we’re thinking top-down or bottom-up, and on whether we’re using AI or not, but also based on what we think validity or schemas are for. The other day Wendell Piez, and then the group as a whole, identified several views of schemas that are intertwined but I think worth distinguishing:

  • schema as a go/no-go test

  • schema as a contract between data producer and data consumer (the schema is what determines whose field engineer gets to go out to the gas station in the middle of the night and fix the software that broke)

  • schema as tool for configuration

  • schema as means of communication and documentation

(This is of course not an exhaustive list of the purposes to which schemas can be put.)

The fertility of that exercise suggests to me that meta-linguistic reflection in general is going to be useful, and so I was grateful to Jeff Beck for his careful, detailed discussion of the problems that arise in one particular context, when one group of people are making rules for another group of people who are making rules for a third group of people who are going to be writing schemas and documents [Beck 2019]. Debbie Lapeyre’s survey of the history of vocabulary-making described some really remarkable shifts in attitude toward the making of vocabularies, toward the style of vocabularies that people should make, and toward the way that they should be constrained and documented [Lapeyre 2019]. Bethan Tovey’s paper on the theory of markup and part-of-speech tagging and the way they relate gave me new perspectives on things, and I hope it did the same for you [Tovey 2019]. And then, of course, there was Allen Renear on the problem of encoding [Renear 2019]. It seems clear to me that Bethan Tovey and Allen Renear will both line up with Frege as analysts of concepts and against the empty game with symbols team — it’s good to have them on our side [audience chuckling].

Planned variation / customization / extension

One way to avoid having any schema turn into an unwanted straitjacket is, of course, to plan to let it be adjusted. So all of the talks at the symposium that Syd Bauman organized on Monday are relevant here [Kimber 2019, Lapeyre 2019, Viglianti 2019, Vitali and Palmirani 2019, Walsh 2019]. Syd’s talk here on the TEI customization for TEI customizations illustrated the problem of supporting variation from a very particular and concrete and practical point of view [Bauman 2019]. Liam Quin’s talk on extension and customization attempted a broader and more systematic view [Quin 2019]. And Wendell Piez’s paper on meta-schemas this morning described a method for defining, not a markup language, but a family of languages for related artifacts [Piez 2019].

Making the system stronger

Now, back to Hilbert for a moment. There is some appeal in his method for restricting oneself rigorously to what can be made explicit and to processes or proofs that can be checked mechanically. But if we follow Hilbert in making proof and mechanizable processes into meaningless games with symbols, we risk eliciting a follow-up question that I have never known how to answer: Why should we be playing this particular game and not some other meaningless game with the same or different symbols? Why these rules of proof and not these other rules of proof? In practice, the answer is that we want proofs to guarantee the truth of the consequences given the truth of the premises. Those are, so to speak, the meta-rules for the game.

But if we started from the position that our work (whether mathematics or markup) is an empty, meaningless game with symbols, where did those meta-rules come from? Why should we play a game that follows a particular set of meta-rules and not one that violates them? What can it mean to say that a particular meaningless game with symbols guarantees the truth of the consequences, given the truth of the premises? If we care about preservation of truth as a meta-rule, then maybe the game is not meaningless after all. I don’t see how it could be. So perhaps meaninglessness is not a property of the game, but only the proper spirit in which to play it? I could understand that, but that wasn’t what Hilbert seemed to be saying at the outset.[8]

For that reason, the empty-game view of mathematics and of markup seems problematic on its face to me. I heard once about a Working Group meeting where the group was arguing about a proposed change to their specification. And at one point someone who supported the change asked one of the opponents to justify the retention of the existing text from version 1.0 of the specification. What, they asked, is this paragraph trying to accomplish? To which the answer was What it is trying to accomplish is being compatible with the 1.0 specification. Now since this was the text of the 1.0 specification they were talking about, I personally would have thought that was a rather low bar for measuring the usefulness of a paragraph. (I think, in fairness to the individual involved, the individual must have meant to answer not the question What is the text trying to do? but Why do you not want to change it?) But I have certainly been in Working Group situations where it seemed to me that the point of any particular piece of prose was to be that particular piece of prose, and that made it hard for me to evaluate it, in the same way that it’s hard for me to explain, if logic is just a meaningless game with symbols, why we should use rules that preserve truth under logical inference, and not other rules.

In theory, it doesn’t matter which programming language we use, because they’re all Turing complete and what we can say in one, we can say in the others. But the subjective experience of using different languages varies a lot, and the same thing applies to markup languages. Given things like the @class attribute in HTML, the @type attribute in TEI (which is almost but not quite universal), the @specific-use attribute of JATS, or the @role attribute of DocBook — given those semantic escape hatches, it seems clear that for a very large body of documents, there is no difference in expressive power among these vocabularies. What you can express in one, you could express in another in a lossless way. You could develop a pair of round-trip transformations to go from one of these vocabularies to another, and back, and get back a document with the same semantics. But the subjective experience of using the vocabularies is quite different, because some of those ways of rendering the meaning will be extremely inconvenient, and others will not. So it makes sense that in addition to teaching ourselves to use our systems as well as we can, we spend a lot of time working on making our systems as capable as possible and extending their boundaries, extending their expressive power, and extending their convenience.
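As an illustration of what I mean by a semantic escape hatch, here is a deliberately tiny round-trip sketch in Python; the element and attribute mappings are invented for the example, and real conversions among TEI, JATS, HTML, and DocBook are of course enormously more involved.

    import xml.etree.ElementTree as ET

    def tei_to_html(elem):
        # Map a TEI-ish element to an HTML span, stashing its identity in
        # @class and its attributes in data-* attributes.
        out = ET.Element("span", {"class": "tei-" + elem.tag})
        for name, value in elem.attrib.items():
            out.set("data-" + name, value)
        out.text = elem.text
        for child in elem:
            converted = tei_to_html(child)
            converted.tail = child.tail          # keep mixed-content text
            out.append(converted)
        return out

    def html_to_tei(elem):
        # Invert the mapping: recover the element type from @class and the
        # original attributes from data-*.
        out = ET.Element(elem.get("class")[len("tei-"):])
        for name, value in elem.attrib.items():
            if name.startswith("data-"):
                out.set(name[len("data-"):], value)
        out.text = elem.text
        for child in elem:
            converted = html_to_tei(child)
            converted.tail = child.tail
            out.append(converted)
        return out

    source = ET.fromstring('<hi rend="italic">Begriffsschrift</hi>')
    assert ET.tostring(html_to_tei(tei_to_html(source))) == ET.tostring(source)

The subjective difference shows up immediately: the information survives the round trip, but nobody would mistake the data-* rendering for a pleasant vocabulary to author in.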

The work by Anne Brüggemann-Klein, Christina Grubmüller, Philipp Ulrich, and their absent collaborator is a good example here [Al-Awadai et al. 2019]. They have found ways to make WebSockets available to the XML stack, and to derive automatic state-checking procedures from a statechart XML document. Those techniques will make a huge improvement in the lives of anyone who uses the XML stack to build the kind of applications that the authors are talking about. And does anyone now build applications that don’t run in the web? For those people it might not be quite so important, but I think for the 99% it is very important. Hugh Cayless’s work on making TEI standoff markup work in the browser [Cayless 2019] marks a similar advance in convenience for users of the XML stack. Hugh was politely diffident about his work and maintained that it still has a way to go, but I think he has gotten quite a distance, and I look forward to seeing him get the rest of the way. Ashley Clark’s report from the Women Writers Project described their tools and a process of gradual improvement over time in their ability to perform the extremely difficult, deceptively simple-sounding task of showing the user all the words in the text [Clark 2019]. For those of us who occasionally struggle when we attempt to advance the state of our tools, it is inspiring to see such consistent, assiduous work even in cases where every inch of progress costs an awful lot of effort. And for the same reason I was extremely grateful to Norm Walsh and Achim Berndzen for their presentation on XProc 3.0 [Walsh and Berndzen 2019]: as someone who falls both into the class of people who have successfully used XProc 1.0, and into the class of people who have crashed and burned spectacularly in efforts to use XProc 1.0, I am extremely grateful for the prospect of an XProc language that is more usable and more powerful.

A different boundary between our explicitly formalized systems and the rest of the world was addressed by Jim Mason, when he traced the history of the idea that at least some of what is implicitly present in a text might be made explicit without overt markup [Mason 2019]. To be honest, my experience with SGML markup minimization was that it was always error-prone when I tried it, so I never got the results I hoped for [audience chuckling]. But the idea is nevertheless a compelling one, as illustrated by its perpetual recurrence in new forms. Equally compelling is the idea behind Sam Wilmott’s talk: that if our languages define the worlds in which we can think, then defining a new language amounts to defining a new world [Wilmott 2019]. I think we should all be grateful to Sam for showing us the prospect of a new world in which to think and work.

Often, I think, we can describe the work we’re trying to do as taking some problem, or some application domain, and moving it from the realm of the implicit and unspecified into the light, into a place where we can name things and make things explicit and, in lucky cases at least, formalize them. But there are cases where we may never make it to complete explicitness. Chandi Perera’s talk on accessibility the other day gave us a good example of a case where validity and success in passing every mechanical test we can devise is perhaps a necessary condition, but is not a sufficient guarantor of accessibility [Perera 2019]. Why? Because accessibility involves the communication of understanding to the reader — making information accessible to the reader — and it is questions of meaning and understanding and semantics that we have the most trouble formalizing.

Conclusion

I began by talking about Frege and Hilbert and their varying approaches to formal logic. Some of you may have been saying, Wait. How can you talk about formal logic at the turn of the century and mention Frege and Hilbert and not mention Kurt Gödel? Well, you’re right; I can’t [audience laughter]. Kurt Gödel proved a number of important theorems in his life; he is best known for his two incompleteness theorems [Gödel 1931]. But the first major result that he proved and announced at a scholarly conference was a completeness theorem. He proved the completeness of first-order predicate calculus, that is, roughly the logic that Russell and Whitehead defined in their Principia Mathematica [Whitehead and Russell 1910-1913] and that Hilbert later formalized in different ways. And that completeness theorem is worth thinking about for a moment. What he proved was that every statement which is true in all possible worlds, that is, under every possible interpretation of a particular set of logical sentences, is provable; and every statement that is provable is true in all possible interpretations. So what he showed was that proof is complete for first-order predicate calculus. And then the next day, in a closing discussion at that conference, he got up and said, Oh, and by the way, I’ve also proved that it’s incomplete. [audience laughter]

Now, I won’t attempt to teach you all about Gödel’s first incompleteness theorem. Some of you will have read popularizations about it, as have I; it amounts roughly to showing first how we can encode — I’m going to use Allen Renear’s term — any logical expression in a formal, logical language, which he [Gödel] defines, as a number, and how we can construct a sentence whose meaning can be informally paraphrased as The sentence with this number is not provable. Now that is similar in some ways to Russell’s set that doesn’t contain itself, in that neither of the two obvious ways to think about the sentence is quite satisfactory. If the sentence is true, then what you have established is that there are true sentences that are not provable within the system: the system is incomplete. And if the sentence is false, what you’ve established is that the sentence is actually provable, although we don’t necessarily have a proof. But that would mean that we could prove false statements. You can see how anybody would be kind of upset [audience chuckling] once they got their head around it. By all historical accounts nobody got their head around it at that meeting, so there was no cry of shock or dismay. John von Neumann apparently understood it on the train ride home and immediately wrote a letter saying, Oh, well, if that is true, it would follow that you cannot prove the consistency of arithmetic. And Gödel replied, Yes, that is quite true. I didn’t mention that result but I did achieve it before I got your letter [audience laughter].[9]
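To give at least a flavor of the encoding step, here is a toy Gödel-numbering sketch in Python. The symbol table is invented for illustration, and Gödel's own coding differs in its details, but the principle is the same: unique prime factorization lets you recover the formula from the number.

    # Assign each symbol of a tiny formal language a small code.
    SYMBOLS = {"0": 1, "s": 2, "=": 3, "(": 4, ")": 5, "+": 6, "x": 7}

    def first_primes(n):
        # The first n primes, by trial division (fine for short formulas).
        found = []
        candidate = 2
        while len(found) < n:
            if all(candidate % p != 0 for p in found):
                found.append(candidate)
            candidate += 1
        return found

    def godel_number(formula):
        # Encode a formula (a string over SYMBOLS) as one natural number:
        # the i-th prime raised to the code of the i-th symbol.
        codes = [SYMBOLS[ch] for ch in formula]
        number = 1
        for p, code in zip(first_primes(len(codes)), codes):
            number *= p ** code
        return number

    print(godel_number("0=0"))   # 2**1 * 3**3 * 5**1 = 270

Once formulas (and sequences of formulas, i.e. proofs) are numbers, statements about provability become statements about numbers, which is what lets the self-referential sentence be constructed at all.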

How can it be, however, if first-order logic is complete, that we can have a Gödel sentence that is true but unprovable? The answer appears to be: the sentence has the meaning assigned only when the axioms in question are interpreted as applying to numbers. There is a whole structure of interpretation — I’ll take a risk and say meaning — associated with the sentence there. Gödel, it is worth pointing out, did not regard the sentence as undecidable; he regarded it as clearly true. He had shown the sentence to be true by a line of argument that he thought was essentially irrefutable and which has certainly never been refuted. And therefore there is a sense in which he had proved that the sentence was not provable. It’s not provable within that one system. It’s provable in a larger system, and a system, be it noted, in which the meaning that we associate with the sentence and with the numbers involved is relevant.

There is at least one philosopher who identifies Gödel’s work as the beginning of model theory [Hintikka 2000]. It is thus more than a little ironic that throughout his life Gödel exhibited not the slightest interest in model theory. He had created it, but only as a side effect. He was really only interested in proof theory. Gödel also believed that the moral of the story is not what some of the cool kids in the comparative literature department used to make of it, which is Oh, all human systems are incomplete, the world is dark, etc., etc., life is hard [audience laughter]. For Gödel, the moral of the story was: human reason cannot be reduced to a mechanical game; it always transcends the purely mechanical. Mathematics will always be interesting because there will always be a need for human reason.

At the beginning of the conference, Tommie Usdin said that she still hoped for a world in which explicit markup is possible and helps people get their work done [Usdin 2019]. I share that wish, as I expect most of you do too. I think we may never get to the point of making everything explicit; there may be some things that we just don’t know well enough to make them explicit — make them formalizable in the way that is necessary. But as Chandi Perera said about accessibility, a little bit is a lot better than none at all. And we can try. We can try to extend the boundaries of our understanding, the boundaries of our power to make things explicit and work with those things. And one of the most effective ways we can do that is to help each other by getting together every now and then and talking about what we have done, both from a practical point of view and from a theoretical perspective, because, after all, there is nothing as practical as a good theory.

References

[Al-Awadai et al. 2019] Al-Awadai, Zahra, Anne Brüggemann-Klein, Christina Grubmüller and Philipp Ulrich. Graphical user interfaces in the X stack. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Bruggemann-Klein01.

[Bauman 2019] Bauman, Syd. Markup Vocabulary Customization: Introduction. Presented at Symposium on Markup Vocabulary Customization, Washington, DC, July 29, 2019. In Proceedings of the Symposium on Markup Vocabulary Customization. Balisage Series on Markup Technologies, vol. 24 (2019). doi:https://doi.org/10.4242/BalisageVol24.Bauman01.

[Bauman 2019] Bauman, Syd. A TEI customization for using TEI customizations. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. (Originally presented at TEI 2017)

[Beck 2019] Beck, Jeffrey. Rules for the Rulemakers: JATS4R’s Self Guidance on Attributes. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Beck01.

[Birnbaum 1997] Birnbaum, David J. In defense of invalid SGML. Presented at ACH/ALLC ’97, Joint annual conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing, Queen’s University, Kingston, Ontario.

[Birnbaum/Mundie 1999] Birnbaum, David J., and David A. Mundie. The problem of anomalous data: A transformational approach. Markup Languages: Theory & Practice 1.4 (1999): 1-19. doi:https://doi.org/10.1162/109966299760283157.

[Birnbaum et al. 2019] Birnbaum, David J., Hugh Cayless, Emmanuelle Morlock, Leif-Jöran Olsson and Joseph Wicentowski. The integration of XML databases and content management systems in digital editions: Understanding eXist-db through Reese’s Peanut Butter Cups. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Birnbaum01.

[Boyer 2019] Boyer, John M. Multitasking algorithms in XForms: Coordination, progress, priority. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Boyer01.

[Cayless 2019] Cayless, Hugh. Implementing TEI Standoff Annotation in the browser. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Cayless01.

[Clark 2019] Clark, Ashley M. "With One Voice:" A Modular Approach to Streamlining Character Data for Tokenization. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Clark01.

[Frege 1879] Frege, Gottlob. Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle a. S.: Louis Nebert, 1879. Frequently reprinted and translated; a convenient English translation is available in the collection From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, ed. J. van Heijenoort. Cambridge, MA: Harvard University Press, 1967.

[Frege 1892] Frege, Gottlob. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100 (1892): 25–50. Frequently reprinted and translated.

[Frege 1903] Frege, Gottlob. Grundgesetze der Arithmetik. Translated and edited as Basic Laws of Arithmetic by P. A. Ebert and M. Rossberg, with C. Wright, with an appendix by R. T. Cook. Oxford: Oxford University Press, 2013. pp.680. ISBN:978-0-19-928174-9.

[Gödel 1931] Gödel, Kurt. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik 38 (1931): 173-198.

[Hilbert 1899] Hilbert, David. Grundlagen der Geometrie. Leipzig: Teubner, 1899. Translated by E. J. Townsend as Foundations of Geometry. Reprint edition. La Salle, Illinois: The Open Court Publishing Company, 1950. Free eBook available at Project Gutenberg.

[Hintikka 2000] Hintikka, Jaakko. On Gödel. Belmont, CA: Wadsworth Thomson Learning, 2000.

[Holstege 2019] Holstege, Mary. Application of Brzozowski Derivatives to JSON Schema Validation. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Holstege01.

[Kimber 2019] Kimber, Eliot. DITA Grammar Customization: Enabling controlled grammar extension for loosely-coupled interchange and interoperation. Presented at Symposium on Markup Vocabulary Customization, Washington, DC, July 29, 2019. In Proceedings of the Symposium on Markup Vocabulary Customization. Balisage Series on Markup Technologies, vol. 24 (2019). doi:https://doi.org/10.4242/BalisageVol24.Kimber02.

[Kimber 2019] Kimber, Eliot. Loose-leaf publishing using Antenna House and CSS. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Kimber01.

[Lapeyre 2019] Lapeyre, Deborah A. Customizing JATS (Journal Article Tag Suite). Presented at Symposium on Markup Vocabulary Customization, Washington, DC, July 29, 2019. In Proceedings of the Symposium on Markup Vocabulary Customization. Balisage Series on Markup Technologies, vol. 24 (2019). doi:https://doi.org/10.4242/BalisageVol24.Lapeyre01.

[Lapeyre 2019] Lapeyre, Deborah A. Bespoke, bewildered, and bebothered. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. (Originally presented at XML London 2017)

[Lubell 2019] Lubell, Joshua. SCAP Composer: A DITA Open Toolkit Plug-in for Packaging Security Content. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Lubell01.

[Mason 2019] Mason, James David. Do we really want to see markup? Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Mason01.

[Nordström 2019] Nordström, Ari. Eating Your Own Dog Food. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Nordstrom01.

[Nordström 2019] Nordström, Ari. Merging the Swedish Code of Statutes (SFS). Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. (Originally presented at XML Prague 2019)

[Paoli 2019] Paoli, Jean. We Created Document Dysfunction: It Is Time to Fix It. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Paoli01.

[Perera 2019] Perera, Chandi. Accessibility: Not Just a Good Idea. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Perera01.

[Piez 2019] Piez, Wendell. The Open Security Controls Assessment Language (OSCAL): Schema and Metaschema. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Piez01.

[Reid 1996 (1970)] Reid, Constance. Hilbert. 1st Printing edition. New York: Copernicus, 1996. pp.272. ISBN:978-0387946740. (Copernicus is an imprint of Springer-Verlag; this is a reprint of the 1970 work.)

[Quin 2019] Quin, Liam. Extending Vocabularies: The Rack and the Weeds: Social Context and Technical Consequence. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Quin01.

[Quin 2019] Quin, Liam. XSLT 3, fn:Transform() and XProc: which, when, why. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. (Originally presented at XML Prague 2019)

[Renear 2019] Renear, Allen H. Encoding. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Renear01.

[Schwarzman 2019] Schwarzman, Alexander B., and Jennifer Mayfield. Using BITS for conference paper conversion. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Schwarzman01. (Originally presented at JATS-Con 2019)

[Sperberg-McQueen 2019] Sperberg-McQueen, C. M. Aparecium: An XQuery / XSLT library for invisible XML. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Sperberg-McQueen01.

[Tovey 2019] Tovey, Bethan Siân. You’re not the POS of me: part-of-speech tagging as a markup problem. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Tovey01.

[Usdin 2019] Usdin, B. Tommie. Explicit markup: a fool’s errand or the next big thing? Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Usdin01.

[Viglianti 2019] Viglianti, Raffaele. One Document Does-it-all (ODD): a language for documentation, schema generation, and customization from the Text Encoding Initiative. Presented at Symposium on Markup Vocabulary Customization, Washington, DC, July 29, 2019. In Proceedings of the Symposium on Markup Vocabulary Customization. Balisage Series on Markup Technologies, vol. 24 (2019). doi:https://doi.org/10.4242/BalisageVol24.Viglianti01.

[Vitali and Palmirani 2019] Vitali, Fabio, and Monica Palmirani. Akoma Ntoso: Flexibility and Customization to Meet Different Legal Traditions. Presented at Symposium on Markup Vocabulary Customization, Washington, DC, July 29, 2019. In Proceedings of the Symposium on Markup Vocabulary Customization. Balisage Series on Markup Technologies, vol. 24 (2019). doi:https://doi.org/10.4242/BalisageVol24.Palmirani01.

[Walsh 2019] Walsh, Norman. Customizing DocBook. Presented at Symposium on Markup Vocabulary Customization, Washington, DC, July 29, 2019. In Proceedings of the Symposium on Markup Vocabulary Customization. Balisage Series on Markup Technologies, vol. 24 (2019). doi:https://doi.org/10.4242/BalisageVol24.Walsh01.

[Walsh and Berndzen 2019] Walsh, Norman, and Achim Berndzen. XProc 3.0. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Walsh02.

[Whitehead and Russell 1910-1913] Whitehead, Alfred North, and Bertrand Russell. Principia mathematica. Cambridge: CUP, 1910, 1912, 1913.

[Weyl 1925] Weyl, Hermann. Die heutige Erkenntnislage in der Mathematik. Symposion 1 (1925): 1-23. Reprinted in Weyl, Gesammelte Abhandlungen vol. 1 (Berlin: Springer Verlag, 1968), pp. 511-542. English translation in From Brouwer to Hilbert: The Debate on the Foundations of Mathematics in the 1920s, ed. Paolo Mancosu (Oxford: Oxford University Press, 1998), pp. 123-142.

[Wilmott 2019] Wilmott, Sam. Beyond OmniMark. Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 – August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). doi:https://doi.org/10.4242/BalisageVol23.Wilmott01.

[Zach 2003] Zach, Richard. Hilbert’s Program. The Stanford Encyclopedia of Philosophy (Fall 2019 Edition), ed. Edward N. Zalta. Available at https://plato.stanford.edu/archives/fall2019/entries/hilbert-program.



[1] Thanks are due to Tonya Gaylord for transcribing the recording of this talk and tracking down concrete citations for works I had referred to only with handwaving. In revising the work for publication, I have tried to salvage the syntax of some sentences and make the text a little more readable, but in all substantive points the argument is for better or worse the same as what was presented at the conference. Descriptions of audience reaction are the transcriber’s.

[2] It is not, as I originally suggested, a little note at the beginning, but thirteen pages at the end of the volume, which outline the problem, exhibit the contradiction both in prose and in Frege’s conceptual notation, and illustrate ways in which proofs can be rescued.

The opening paragraph reads Einem wissenschaftlichen Schriftsteller kann kaum etwas Unerwünschteres begegnen, als dass ihm nach Vollendung einer Arbeit eine der Grundlagen seines Baues erschüttert wird. (Hardly anything less welcome can befall a scholarly author than that, after the completion of a work, one of the foundations of his edifice is shaken; my translation.)

[3] I am not sure I agree with Frege’s politics (by which I mean that I find his political views horrifying), but I blessedly don’t have to worry about his politics because we don’t live in the same political entity. It’s easier to make heroes of people if you don’t have to encounter them in real-life situations.

[4] To anyone who enjoys popular (i.e. non-technical) books about mathematics, I can earnestly recommend Constance Reid’s biography of Hilbert [Reid 1996 (1970)]. It is very difficult to read Reid’s biography without sort of falling in love with Hilbert. Or at least, with the idea of Hilbert.

[5] The Stanford Encyclopedia of Philosophy tells me now that this was Hermann Weyl [Zach 2003, Weyl 1925].

[6] In some unfortunate cases, specifications don’t have conformance clauses, but in those cases we often treat them as if they did. When we do, we typically act on some idea of what the conformance clause would say, if it existed; that doesn’t work out very well all the time, because different people imagine different implicit conformance rules.

[7] Memo to self: Don’t write a complicated C++ program and then write your schema by copying your struct definitions. That may not produce an optimal solution.

[8] At least, not on Weyl’s account. I do not know whether the difficulty I have just described is a difficulty with Hilbert’s program or only with Weyl’s pointed description of the program.

[9] The story is told by Jaakko Hintikka [Hintikka 2000].

C. M. Sperberg-McQueen

Founder and principal

Black Mesa Technologies LLC

C. M. Sperberg-McQueen is the founder and principal of Black Mesa Technologies, a consultancy specializing in helping memory institutions improve the long term preservation of and access to the information for which they are responsible.

He served as editor in chief of the TEI Guidelines from 1988 to 2000, and has also served as co-editor of the World Wide Web Consortium’s XML 1.0 and XML Schema 1.1 specifications.