In a recent conversation with a fellow markup enthusiast, I found myself saying: “Yes, I agree that making the information products they want from their content would
be easier, and the products would be better, if they authored in a way that captured
the structure and some of the meanings as they wrote. But it isn’t going to happen.
Ever. Because the average person writing a document is not thinking about the process
of writing the document, or the structure of the document, or how they might want
to use or reuse the document; they are thinking about the subject matter of the document.
They want to use a process to create the document that is as transparent as possible.
This means that not only will they use the popular-at-the-moment authoring tool, they
will use it as thoughtlessly as possible.”
I do not say this as a criticism of these people; it is simply an observation of the way they do, and probably should, write.
That conversation brought back a conference paper I heard many years ago, from Brian Reid.
Brian Reid
How many of you were at Markup Technologies ’98 in Chicago? At that conference we invited Brian Reid to keynote.
Aside: do you know who Brian Reid is? If you play in the markup space you probably should.
As I recall, it took some persuasion to get him to go to Chicago and talk to a markup
conference. He said that he had spoken at the Conference on Research and Trends in Document Preparation Systems in Lausanne, Switzerland, in 1981 and said what he had to say to the SGML world.
But … he paused … he had been wrong in 1981 and would be happy to tell us why in 1998.
He found the plastic transparencies he had used as the visuals for his talk in 1981,
scanned them, and used these scans as the basis for his 1998 talk. They are on his
website to this day at: 20 years of abstract markup - Any progress?
[Reid, 1998] (We’ll look at a few of them in a moment — and the whole set is the appendix to this paper.)
In 1981 he was singing the praises of what we now call declarative markup. He talked about the goals of Scribe:
These slides would fit perfectly into the “XML Basics for Text Processing” class my group will be teaching in a few weeks. This is EXACTLY the song many of
us sing for a living in 2019. This talk was in 1981. He was talking about a working
tool that:
- Used markup to identify parts of documents by what the information was
- Separated content from format
- Was platform independent
I believe that that conference in 1981 was the conference at which the SGML effort was announced. That is, work on SGML had just begun, and Brian Reid already had a working tool, and the philosophy underlying the tool, that met many of the goals of SGML and now XML. (Do you begin to see why I think we should all know about the work this man has done?)
Note, however, that these are Brian Reid’s 1981 slides.
He revisited the ideas in 1998:
He continued with:
Declarative markup (SGML, XML, Scribe, or some other syntaxes) was a lost cause. People were not going to do it. People did not want to think about structure. People were not capable of thinking about structure.
He proved this by sharing some user interface research that some well-known organization (I don’t remember who and his slides don’t say) had done.
People were shown this image:
Then they were asked what would happen if a user clicked on the X and hit delete.
He told us that most average users expected this:
Aaaargh! PROOF that people are NOT capable of understanding declarative markup. Proof that we are wasting our time talking about, and wishing for, success of generic-markup-based tools.
End of Story. Game Over. Go Home!
Really. End of Story. Game over. Go home! Time to give it up as a bad investment.
Teaching Reading
Perhaps the reason I remember his paper so well is that I don’t want to agree with him. I don’t think declarative markup is impossible, I haven’t given up, and as far as I’m concerned the game is NOT over.
It reminds me of that time, a very long time ago, when I was a student, and I volunteered
with an adult literacy program. The organizers were mostly social workers, and most
of the teaching was done by college students; the participants were mostly men in
their 50s who were making their way in the world without the ability to read — and
generally successfully hiding that fact from their families, friends, and employers.
They were smart, hard-working, and generally affable people. It usually took some
breakdown in their lives to get them to admit that they couldn’t read, to make the
time to go to the literacy workshop, and probably most difficult, to accept help from
college students. Some had been unable to complete the tests at the end of a court-ordered
educational program, some couldn’t apply for drug rehabilitation or financial assistance,
some were shamed by children or grandchildren who asked them to “read me a story.”
They were highly unusual; most of their peers (adults in the United States who
cannot read) lie, cheat, bluff, and hide to avoid admitting that they cannot read,
and (even if detected) will find every possible excuse not to do anything as humiliating
as going to a literacy program.
The program started with the alphabet, followed by what I think of as phonics-lite. That is, a rough guide for how to pronounce many letters and letter combinations, and how to look at a word and figure out what it probably sounds like and say it.
After the first day or two, we were down to about half the participants. Most of them
had either decided that they were too stupid to learn to read, or that it was too
late for them, or that they had more important things to do with their Tuesday evenings,
or that they knew what the word on the “Stop” sign was, and that was reading so they could already read and didn’t need this program.
We had one or two who insisted that it was all a trick and nobody could actually make
sense of those little letters on the page, and it didn’t matter anyway. “If I cook good enough that you pay $2 for a bowl of my chili, why do I need to read?” one asked me.
They fought English phonics like it was a monster. It is not fair, we were told, that some letters were silent, sometimes, and that the same letter combination would make different sounds in different contexts. They wanted reading to be like a game: they wanted clear rules, and they wanted it to be fair. I sympathized. But it isn’t. English may be especially challenging in this respect, but other languages have their own challenges.
For many of our participants, heteronyms (words that are spelled the same, pronounced differently, and have different meanings) were the last straw. (He has a tear in his eye when told to tear a page out of the book.) They couldn’t, they wouldn’t, NOBODY could, do this. It was impossible. End of story. Game over. Go home!
And yet … there were pressures. The grandchildren who wanted stories, the judge who had suspended a sentence on condition of completing the program, the need for independence, the heavy burden of pretending and lying and bluffing to get by. Some of them stuck it out.
The participants were paired with “reading partners” (the college students) and a book. We started, word by word, to sound out the words,
then sentences, in the book. We would figure out each word in a paragraph, then go
back and “read”
the whole paragraph. The first few pages took days. It was excruciating — but by
the third or fourth session the participants were beginning to get comfortable with
the process. They were encouraged!
What were we reading?
Not the first readers their grandchildren were using in grade school! We were reading Mickey Spillane detective novels. (For those of you who have missed these literary classics, Mickey Spillane is the name under which a LOT of detective novels and short stories were published, starting in the 1950s. They are short, trite, usually violent, sexy but not explicit, and sexist even for the time in which they were written. They use short words, short sentences, have a fairly small vocabulary, and are definitely not appropriate for children.) The Big Kill starts:
It was one of those nights when the sky came down and wrapped itself around the world. The rain clawed at the windows of the bar like an angry cat and tried to sneak in every time some drunk lurched in the door. The place reeked of stale beer and soggy men with enough cheap perfume thrown in to make you sick.
Two drunks with a nickel between them were arguing over what to play on the juke box until a tomato in a dress that was too tight a year ago pushed the key that started off something noisy and hot. …
In retrospect, I think selecting “hard boiled detective fiction” as the reading material for these farmers, mechanics, painters, and laborers was
an act of genius.
Each session ended with a pep talk: “Look how far you’ve come, look how much you got through today, you are making real progress.”
And these wonderful patient men would nod, smile, and mutter “I still can’t read and don’t think I ever will.”
One evening, usually about 6 or 7 sessions into the guided reading part of the program, the team would get to a spot in the story where something exciting was just about to happen. The bad guy was going to shoot the good guy, the detective was about to expose the girl as a spy, or the girl was going to climb into bed with the detective. The bomb was about to go off if our hero didn’t disarm it quickly enough, or the car was heading for a cliff. The teacher would excuse himself or herself for a moment, and vanish for a very long time.
When we got back we asked “Did he buy her a drink or shoot her?”
And the no-longer-totally-illiterate participant knew the answer! The would-be victim’s
mother had come to the door, the sheriff stepped out from a shadow, or … something.
The participant had stopped thinking about reading and started thinking about the
story, and suddenly was reading!
The big problem in teaching reading is that as long as you are thinking about reading you are not, you cannot be, reading. Try it. Try reading something while concentrating on the activity of reading. While you are thinking about reading, you are not reading; in order to read, you have to let go of the process and focus on the content you are reading.
Once the participants had made that leap once, it was relatively easy to get them to do it again and again. From there all it took was a few practice sessions reinforcing the lesson, reading a newspaper, a few government forms, and a children’s book or two. (Books with nonsense words were particularly challenging for new readers.) They had been exposed to the written word their whole lives and had probably picked up a lot of reading basics without knowing it. All they needed was help over one (huge) logical hurdle.
Back to Declarative Markup
So, it wasn’t “End of Story. Game Over. Go Home!” It was time to learn a new way of thinking, and to practice it enough to be able to do it without thinking about it. We KNOW people can do this; most (probably all) of the people in this room can read. You can read without thinking about it, and you take that ability for granted.
The people Brian Reid was talking about were the declarative markup equivalent of my non-readers. They were smart, they probably started as willing users of this new computer-aided writing tool, but even if they understood the premise behind it (and I suspect some of them did), they hadn’t internalized the concepts of generic markup.
I have spent a lot of time with XML users, or would-be XML users, who have a similar experience. We spend a lot of time with them, learning what the parts of their documents are, and selecting, customizing, or occasionally writing, a vocabulary appropriate to their documents.
It is not unusual for a group of subject matter experts and professional writers,
when asked to identify the parts of one of their documents, to start talking about
the format of the document. “What is this?” I ask. The answer is sometimes “Times 24 Bold” or “Head 1” or “Bell 24”. “No,” I ask, “not just the beginning of the thing I circled, the whole thing.” “Head 1 followed by several paragraphs” is the usual answer. If I push it, I can often get “Head 1, paragraph, paragraph, Head 2, paragraph, paragraph”. And a room full of people who don’t understand why I am being so dense.
With just a little coaching, or in the process of doing document analysis, they learn to identify structures: sections, titles, lists, list items, and footnotes. They learn to name, define, and identify in documents subject matter that is important to them and their activity: drug name, ferrous alloy, terrestrial location, ammunition caliber, street address, conference start date. I think of this as the equivalent of learning the alphabet.
As with learning the alphabet, it is necessary to learn to see structures and subject matter content in your documents. And just as learning the alphabet is no place close to sufficient to enable reading, learning how to name structures in documents is no place close to sufficient to enable writing in a structure-based tool.
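To make that “alphabet” concrete, here is a minimal sketch in Python of what naming subject matter buys you. Everything in it is an assumption made for illustration: the element names (`ferrous-alloy`, `street-address`), the sample text, and the helper `texts_of` are invented here and do not come from any real vocabulary or project.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary and invented sample text, for illustration only.
SOURCE = """\
<section>
  <title>Field Report</title>
  <para>The sample of <ferrous-alloy>cast iron</ferrous-alloy> was collected at
  <street-address>12 Example Road</street-address>.</para>
  <para>A second sample, <ferrous-alloy>stainless steel</ferrous-alloy>, arrived later.</para>
</section>
"""


def texts_of(root, element_name):
    """Return the text of every element with the given name, in document order."""
    return [el.text for el in root.iter(element_name)]


root = ET.fromstring(SOURCE)
alloys = texts_of(root, "ferrous-alloy")
addresses = texts_of(root, "street-address")
print(alloys)
print(addresses)
```

Once the alloys and addresses are named as what they are, a one-line lookup can find them; nothing in the sketch needs to know how they look on the page.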
Note: this is a sufficient level of knowledge to enable tagging existing documents. But if authors with this level of understanding are expected to produce declaratively marked-up documents, we are expecting them to write without the markup and then go back and add it. We are adding a time-consuming process to the act of writing. One that they don’t see as integral to the process of writing, and that in fact is not integral to the writing as they are doing it.
Worse than that, we are asking them to do a process that is rife with negative feedback.
It is all about errors and warnings. In many structured document editing environments,
the most positive feedback you get is silence. And sometimes it is difficult to tell
the difference between “Victory; you did it,” “Still processing,” and “Ooops, application crashed, possibly because of your bewilderingly bad data.”
Once the leap is made to thinking about what you are writing as the structures it is, this becomes not just habit-forming but addictive. I have a colleague who can no longer write with a simple text editor without screaming at it. She wants, or perhaps needs, to identify sections, headings, lists, and such as they are created. She prefers to write in a model-driven XML editor, but can set up word processor styles to meet the need. She wants to identify a code block, not specify 10 point courier indented 3 m-spaces. She thinks of it as a code block, not as what it might look like. She thinks of list items as items in a list, or in a nested list, not as starting with a solid bullet or a hollow bullet; not about how far they are indented on the page. Amongst the people in this room, I don’t think that is unusual. Amongst the literate people on planet earth, it is very unusual indeed. Even amongst the people who write using computer-based tools (don’t forget that many people still write using ink on paper), this is a very unusual point of view.
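As a rough illustration of why a code block is worth identifying as a code block, the sketch below renders one small marked-up source in two ways. It is only a sketch under stated assumptions: the element names (`section`, `title`, `list`, `item`, `code-block`), the sample content, and the two renderer functions are hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical element names and invented content, for illustration only.
SOURCE = """\
<section>
  <title>Setup</title>
  <list>
    <item>Install the tool</item>
    <item>Run the check</item>
  </list>
  <code-block>tool --check</code-block>
</section>
"""


def to_text(root):
    """Render the section as plain text: upper-case title, starred items, indented code."""
    lines = []
    for el in root:
        if el.tag == "title":
            lines.append(el.text.upper())
        elif el.tag == "list":
            lines.extend("* " + item.text for item in el.findall("item"))
        elif el.tag == "code-block":
            lines.append("    " + el.text)
    return "\n".join(lines)


def to_html(root):
    """Render the same section as HTML; the source is not touched."""
    parts = []
    for el in root:
        if el.tag == "title":
            parts.append(f"<h1>{escape(el.text)}</h1>")
        elif el.tag == "list":
            items = "".join(f"<li>{escape(i.text)}</li>" for i in el.findall("item"))
            parts.append(f"<ul>{items}</ul>")
        elif el.tag == "code-block":
            parts.append(f"<pre><code>{escape(el.text)}</code></pre>")
    return "".join(parts)


root = ET.fromstring(SOURCE)
text_output = to_text(root)
html_output = to_html(root)
print(text_output)
print(html_output)
```

The decision about bullets, indentation, and typeface lives in the two renderers, so the writer names the code block once and never specifies what it looks like.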
I believe that Brian Reid was talking about people who have learned “the alphabet” of structured documents but who have not learned to “read”
them. People who are sounding out one word after another, thinking about reading
instead of thinking about the content of the document, and struggling with the process.
Those people, with that level of comfort with declarative markup, will never adopt
it. They cannot.
But I don’t believe that this is Game Over!
Just as those farm hands, cooks, and mechanics could make the leap from the alphabet
to sounding out words to reading, so can the authors of today make the leap from presentation
to identifying structures in existing texts to composing thoughts in structural terms.
I did it the way most of you probably did. Through exposure, and repetition, and working with systems I was fighting tooth and nail but had to use anyway. It was actually a fairly discouraging process, but it happened so long ago I can barely remember it. I can completely understand someone trying to write a document refusing to use a tool that forces them to stop thinking about their subject matter and think about something unfamiliar in the process of trying to capture their thoughts on some topic.
Declarative Markup is A Good Thing
I believe that we, as a society, would be better off if many, perhaps most, of our documents were encoded with declarative markup. It would make them more discoverable, more accessible, and probably better organized and understandable.
I believe that there is information about many documents that the author knows that would add significant value to the document if it were captured when the author creates the document. Some of this information can be added later, usually at significant cost. Some of it is simply unavailable once the document is separated from the author. (As an example, it is possible for third parties to write descriptions of graphics, but they cannot be sure that they describe the aspect of the graphic that is the most important point the author wanted to make in using that image.)
How Do We Get There?
The success of that adult literacy program I talked about was based on three things:
- Motivation
- Instruction on fundamentals
- Absorbing first materials
Motivation. There are a few very special circumstances in which people have the motivation to learn to write in a declarative way. I have worked with helicopter pilots turned helicopter documentation specialists who learned to write in an SGML editor. There are professional editors who work comfortably in grammar-driven editors, and technical writers, and people who are working with very stable structured documents of many types.
But for most writers of most documents, there is no reason to care. Even if their documents will be published in several media, and even if they would be more discoverable if they were better structured, that is generally hidden from the author.
They will only make the investment in learning something new if we make the reasons clear. No, if we make the reward for overcoming the significant hurdle in the path more than worth the effort.
I suspect that that reward will have to vary from user community to user community. I worked with the support-analysis group of a very large organization. Decision-makers would ask this group to research topics of interest. Sometimes they wanted a one-page summary of options and advantages and disadvantages of each, sometimes they wanted in-depth studies that looked like research reports or even dissertations. Sometimes they specified that they needed the information within the next few months, but more often they wanted it as soon as possible. Using the word-processing based systems, there was usually about a 3-day lag between completion of the analysts’ work and presentation of the document to the requestor. Once we brought in the markup-based application, the analysts had a choice: they could continue to use their familiar word processor, and the publication team would convert the content and format it in the new tools, which would take about 2 days. Less than the old system; this was a win! But if they used the new authoring environment and created marked-up documents, their content would be approved by legal and formatted for delivery within 24 hours. Sometimes faster than that. Lo and behold: most of them came to the classes on how to use the new editor, and most of them learned to use the new tool right away. By several months in there was only one hold-out, and when it became clear that the users preferred the other analysts because their documents were not only delivered more quickly but were also better organized, that one asked for private coaching in the @#(*&^ new system.
Instruction on fundamentals. Well, we have a fair amount of material for this. I don’t think most of it is appropriate to most users. I say this as someone who has written, and taught, classes on SGML and XML for years. We need better instructional material. I’ve seen some that was much worse than mine, but I haven’t seen any that really impressed me. It would be good if we figured out how to teach this stuff.
But I don’t think that is really the problem; the amount and quality of instruction on the alphabet and pronunciation in my literacy class was almost embarrassing. That didn’t matter.
Absorbing first materials. What we REALLY need is the functional equivalent of a Mickey Spillane novel for beginners with declarative markup.
Not reading material, but some instant gratification application that works quickly and gracefully if provided with well marked-up text. I don’t know what that application is. I charge you to start thinking about it. Actually, I hope to see several, dramatically different, applications that provide instant value for the effort of explicitly marking up the structure and content of documents. It is these applications that might, just might, help a significant number of document creators over that big barrier to graceful and comfortable use of declarative markup.
With luck, the presentations and conversation here at Balisage will help one or more of us create just that barrier-breaking tool.
Appendix A. Presentation Slides from Brian Reid’s Markup Technologies ’98 Keynote Address
References
[Reid, 1998] Reid, Brian. “20 years of abstract markup - Any progress?” Keynote Address at Markup Technologies ’98, Chicago, Illinois, November 19 - 20, 1998. http://xml.coverpages.org/mt98-papers.html. Presentation slides available at http://www.reid.org/~brian/markup98.ppt.
[Spillane, 1951] Spillane, Mickey. The Big Kill. 1st ptg. edition. New York: New American Library (Signet Book #915), 1951. Available at https://www.amazon.com/Big-Kill-Mickey-Spillane/dp/0451093836.