Introduction
From Cassiodorus’s sixth-century Institutiones, instructions to his monk-scribes:
Jerome arranged his translation of the entire divine authority … into cola and commata so that those who have difficulty in understanding the punctuation of sacred letters might, thus assisted, pronounce the holy text without error ....
Place in each chapter the punctuation marks that the Greeks call thesis, i.e., small round points … since they make the written text clear and bright when … they are fitted in their place and shine forth. How excellent it is to pass unhindered through holy thought and to enter subtly into the sound nature of its precepts … and to divide the whole composition in parts in such a way that it is beautiful when regarded in its sections!
For if our body must be known through its limbs, why does it seem right to leave reading confused in its arrangement? These positurae, or points, like paths for the mind and lights for the composition, make readers as teachable as if they were instructed by the clearest commentators.[1]
Presentational markup was introduced as a category of markup in 1987 by Coombs et al. to refer to the renditional features of a presented document, such as the arrangement of words on the page and text size or style. It was intended as a complement to descriptive markup (e.g., <title>) and procedural markup (e.g., <center>), two markup categories that had been introduced some years earlier by Charles Goldfarb.
In any general account of textual communication, presentational markup plays a critical role: it accomplishes the recognition of intended content in the very last phase of the communication scenario, not only by making communication more efficient and more reliable, but also by determining the received meaning of the text. This determination is not limited to simple identifications and disambiguations; presentational markup also operates more globally to create distinctive cultural meanings. A broad historical view of textual communication reveals that choices of script, type, image, and physical platform are interpreted differently by different audiences and that books that transmit the same words do not always transmit the same story (Mak 2011).
Every manifestation of a text includes presentational markup. Such markup is critical to its reception and its ability to be received. Yet it is descriptive markup that has almost completely dominated the attention of SGML/XML markup theorists. Over the last forty years, the markup community has explored new subcategories of descriptive markup, syntactical innovations to accommodate non-hierarchical and discontiguous objects, systems for attaching formal semantics, the comparative virtues of different schema languages, the comparative virtues of different query languages, and so on. A comparable theoretical perspective on presentational markup is almost entirely absent from this considerable body of research.
Markup
Prior to the emergence of electronic publishing and text processing, the word “markup” was commonly used to refer to proofing marks and instructions to compositors that were written on a manuscript, typescript, or preliminary proof. In the 1970s, the term “markup” began to be used for the specialized codes or expressions that were included in textual data files and controlled the formatting carried out by word processing or typesetting systems. These codes had the same general purpose as the proofing marks or instructions to compositors, but were intended to be processed by computer software rather than acted upon by human compositors or editors. The origins of the current widespread use of the term “markup” in this sense, namely instructions for software, can be traced to IBM’s Generalized Markup Language (GML) (IBM 1978).
Charles Goldfarb, GML’s co-developer and later editor of the SGML standard, offers a characterization of software-oriented markup in his influential 1981 SIGPLAN paper, which was later included as Annex A of the SGML standard:
Text processing and word processing systems typically require users to intersperse additional information in the natural text of the document being processed. This added information, called “markup,” serves two purposes: 1. it separates the logical elements of the document; and 2. it specifies the processing functions to be performed on those elements (Goldfarb 1981).
The subsequent characterizations of markup that have been offered over the last forty years remain broadly consistent with Goldfarb’s 1981 characterization. With little variation, markup is described as (i) providing information about the text, (ii) being included with the text, and (iii) not being part of the text. We will refer to this as the standard definition of markup.
Presentational Markup
In pursuit of a general theory of markup, and one that would provide a useful context for promoting descriptive markup in the text processing community, “Markup Systems and the Future of Scholarly Text Processing” (Coombs et al. 1987) endorsed Goldfarb’s two markup categories and added four others. Among the added categories was presentational markup, which referred to the renditional (or presentational) features of the formatted document:
In addition to marking up lower-level elements with punctuation, authors mark up the higher-level entities in a variety of ways to make the presentation clearer. Such markup — presentational markup — includes horizontal and vertical spacing, folios, page breaks, enumeration of lists and notes, and a host of ad hoc symbols and devices.
For Coombs et al., presentational markup is a natural part of all textual communication:
“authors have long performed presentational markup in their manuscripts and typescripts” and “whenever an author writes anything, he or she ‘marks it up’”.[2]
As an example of text with no presentational markup, they supply a passage with no interword spaces (scriptio continua). But in this example presentational markup is, paradoxically, present in its apparent absence, because the use of continuous script can be a strategic choice. Whereas word spacing supports the observation of grammar by visually dividing words, scriptio continua facilitates the observation of meter or rhythmic structure, which is critical to the oral performance of verse.[3] The metrical unit of the colon is a clause composed of around eight to seventeen syllables, which means that its terminus may or may not coincide with a word-ending. Scriptio continua is thus not the absence of presentational markup but is itself presentational markup: a way of rendering the text that facilitates the recognition of features of interest.
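The point that continuous script is one rendition among others, rather than an unrendered text, can be illustrated with a minimal Python sketch. The function names and the sample words are assumptions made for illustration; they do not come from Coombs et al.

```python
def word_spaced(words):
    """Render a sequence of words with interword spaces."""
    return " ".join(words)


def scriptio_continua(words):
    """Render the same words as continuous script, with no interword spaces."""
    return "".join(words)


# Hypothetical sample text, chosen only for illustration.
words = ["arma", "virumque", "cano"]

spaced = word_spaced(words)
continuous = scriptio_continua(words)

print(spaced)
print(continuous)

# Both renditions carry exactly the same letters in the same order;
# they differ only in the renditional choice made when presenting them.
assert spaced.replace(" ", "") == continuous
```

Neither function is more basic than the other: each takes the same sequence of words and makes a choice about how to present it.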
Coombs et al. also note that the concept of presentational markup is applicable to other modalities of communication, such as speech:
When we translate writing into speech (i.e., when we read aloud), we do not normally read the markup directly; instead, we interpret the markup and use various paralinguistic gestures to convey the appropriate information. A question mark, for example, might become a raising of the voice or the eyebrows.
Markup in this sense is not unique to digital processing, or even to specific institutions of textual production, such as those with human editors, designers, proofreaders, and compositors who need to communicate with one another during the production process.
One might divide presentational markup into two broad categories. Whenever text is presented there will be, necessarily, renditional features that are part of that presentation. The characters must be some size or other, the lines some length or other, the type some style or other, the script in some hand or other. Although the wider cultural context, as well as local circumstances, can give these features considerable significance, they will often appear to have been chosen simply to improve, in a general way, the efficiency and accuracy of the reading experience: the type is large enough to read, characters of the hand easy to discriminate, the line length optimal, running headings useful, and so on.
On the other hand, when renditional features are used to mark up the higher level entities (such as titles), they are not simply improving general legibility. They are also communicating the existence of particular textual objects: titles, extracts, author names, formulas, proofs, theorems, verses, and so on. In what follows, the focus will be on this latter use of presentational markup.
The Meanings of Presentational Markup
The original sense of presentational markup, renditional features themselves, is no longer the most common sense. This situation is particularly confusing as Markup Systems (Coombs et al. 1987) is cited as the source for various definitions of the term, even when the sense given has no similarity at all to the sense provided in Markup Systems. These variant senses and misattributions have contributed to the undertheorizing of presentational markup in the markup community.
The phrase “presentational markup” is now primarily used for instructions that specify how text is to be processed, such as <center>, which instructs a formatter to center the enclosed textual content. That is, “presentational markup” now refers to what Goldfarb (1981) and others call procedural markup. This redundant variant sense of “presentational markup” was probably inevitable. For one thing, the phrase “presentational markup” itself easily allows this interpretation. In addition, the widespread recognition of the paramount importance of the contrast between markup that identifies text elements and markup that specifies formatting made the interpretation a useful one in the circumstances. Finally, the notion that renditional features might be considered markup is challenging and unexpected, which probably further disadvantaged the original sense. Regardless of how this shift began, it has been sustained by the prevalence of “semantic markup” and “presentational markup” as opposed terms referring to the descriptive and procedural markup of the HTML markup language.
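The three notions at issue can be kept apart with a toy formatter, sketched here in Python. Everything in it is an assumption made for illustration (the line width, the one-entry stylesheet, the function names, and the sample title). The element name passed to `render_descriptive` plays the role of descriptive markup, the instruction name plays the role of procedural markup, and the leading spaces in the returned string are presentational markup in the original sense of Coombs et al.: renditional features of the presented text itself.

```python
LINE_WIDTH = 40  # hypothetical line length, in characters


def apply_procedural(instruction, text):
    """Carry out a formatting instruction such as <center>."""
    if instruction == "center":
        return text.center(LINE_WIDTH)
    raise ValueError(f"unknown instruction: {instruction}")


# A toy stylesheet: which formatting instruction realizes which element.
STYLESHEET = {"title": "center"}


def render_descriptive(element, text):
    """Render text identified by a descriptive element such as <title>."""
    instruction = STYLESHEET.get(element)
    if instruction is None:
        return text  # no rendition defined for this element in the sketch
    return apply_procedural(instruction, text)


rendered = render_descriptive("title", "Moby-Dick")
print(repr(rendered))
```

In the variant sense described above, “presentational markup” would name the `center` instruction; in the original sense it names the white space that surrounds the title in `rendered`.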
A second variant sense derives from a series of definitions given in the Wikipedia article on Markup Language. By August of 2005, the Wikipedia article included definitions of descriptive, procedural, and presentational markup, citing Markup Systems as its source. The account given for presentational markup was a reasonable interpretation of the sense given by Markup Systems:
Presentational markup expresses document structure via the visual appearance of the whole text of a particular fragment. For example, in a word processor file, the title of a document might be preceded by several newlines and spaces, thus accomplishing leading space and centering … (Wikipedia, August 24, 2005)
However, the revisions made to this definition in 2009 created an entirely new variant sense:
Presentational markup is that [sic] used by traditional word-processing systems, binary codes embedded in document text that produced the WYSIWYG effect. Such markup is usually designed to be hidden from human users .... (Wikipedia, July 29, 2009)
This version omits the accurate lead sentence of the 2005 definition, takes the word processing context as characteristic of presentational markup, and identifies “binary codes” as the presentational markup itself.
A third but less common sense of “presentational markup” refers to descriptions of realized renditional features. This sort of markup is most likely to occur in transcriptions of culturally important texts, where it records the occurrence of such things as italics, line breaks, or other features that may be of interest to scholars studying those texts. Like the previous variant sense, this is also a plausible interpretation of the phrase considered in isolation. It also reflects a weakness in the basic descriptive/procedural distinction, which covertly yokes together two different features that a markup category can have, illocutionary force (e.g., descriptive v. imperative) and semantic domain (e.g., logical element v. rendering), without recognizing and accommodating the possibility of independent assortments, let alone other features and other values (Renear 2000).
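The idea of independent assortment can be made concrete by enumerating the combinations of the two features. The Python sketch below assumes only the two values per feature named in the text; the glosses attached to each combination are illustrative wordings supplied for this sketch, not definitions taken from Renear (2000).

```python
from itertools import product

# The two features and their example values, as named in the text.
FORCES = ("descriptive", "imperative")
DOMAINS = ("logical element", "rendering")

# Illustrative glosses only; the wording is an assumption of this sketch.
GLOSSES = {
    ("descriptive", "logical element"): "this text is a title",
    ("descriptive", "rendering"): "this text is in italics",
    ("imperative", "logical element"): "make this text a title",
    ("imperative", "rendering"): "center this text",
}

# Every pairing of a force with a domain.
combinations = list(product(FORCES, DOMAINS))

for force, domain in combinations:
    print(f"{force:12} | {domain:16} | {GLOSSES[(force, domain)]}")
```

The basic descriptive/procedural distinction covers only two of the four cells (descriptive with logical element, imperative with rendering). The third variant sense discussed above falls in the descriptive and rendering cell.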
Is Presentational Markup Really Markup?
The use of renditional features to mark up higher level entities such as titles seems to satisfy the standard definition of markup: renditional features such as centering and italics inform the reader that a certain bit of text is a title; those features are not part of the text (they are not themselves textual in nature); and they are included with the text because they occur combined with the text in the rendered presentation.
Of course, the standard definition of markup is subject to interpretation and revision. So it is equally important that there is a general rationale for treating renditional features as markup, for seeing a concept of markup that includes renditional features as useful for reasoning about textual communication. The inclusive conception of markup does appear to capture phenomena that are fundamentally similar even though superficially different and that play relevantly related coordinate roles in textual communication.
Hesitation about classifying renditional features as markup is typically based on one or both of two reasons:
The first is that descriptive and procedural markup, as well as proofing marks and notes to compositors, all appear to be occurrences of expressions in a language. Descriptive and procedural markup are typically composed of alphanumeric characters with delimiters and associating punctuation, a vocabulary of lexical items of different logical types (often making use of familiar natural kind terms and mathematical expressions), an explicit or implicit generative grammar, referential and characterizing features and some sort of compositional semantics, formal or implied, just as we find in typical natural and artificial languages. By contrast, renditional features do not appear to be occurrences of expressions in a language.
Nevertheless, renditional features perform a communicative role: the layout of textual elements informs the reader that such and such is the title of the article, that so and so is the name of the author, that a citation is the source for a sentence, and so on. Most importantly, renditional features are not just evidence for these things (as smoke is evidence for fire). They are social conventions intended to cause the reader to recognize the existence and identity of textual elements, e.g., to understand a bit of text to be a title. The reader is also intended to recognize this intention and, in addition, to recognize that that recognition was itself intended. This double intention and recognition of intention is, in Gricean semantics, at the heart of what we mean by “meaning” (Grice 1957). So although renditional features may not be part of a language in exactly the same sense that descriptive and procedural markup are part of a language, renditional features are like descriptive and procedural markup in being part of a symbolic system for the intentional communication of information.
Another reason one might hesitate to consider renditional features as markup is that renditional features are typically intended to be directly perceived by a person, whereas procedural and descriptive markup are typically intended to be processed by a software application. Although these distinctions are certainly important, they do not seem to warrant abandoning the more general concept.
Additional classifications can be made, and contexts of use will create certain assumptions about the domain of application, or implied specialization, but, again, those are not reasons to abandon the general concept expressed in the standard definition. It may of course be surprising that the general definition counts renditional features as markup, but then we are often surprised by the extensions of our natural kind terms.
Nevertheless, it is not clear that these considerations are decisive. Perhaps the strongest argument for not classifying renditional features as markup is that renditional features are really part of the text. As noted above, renditional features are similar to the natural language sentences of the text in that they also are informing the canonical reader. Meaning might be understood as emerging from the relationship between, for instance, the centering of a phrase and the phrase that is centered, and perhaps it is that ensemble, and not just the phrase alone, that should be understood as the text.
For now, though, we will continue to refer to renditional features as presentational markup.
So What is Going On?
Presentational markup facilitates the recognition of textual objects like titles and extracts. These objects have been referred to above as logical elements (Goldfarb 1981) and higher level entities. In this section we will refer to them as content objects (Derose et al. 1990). Our question is: how, exactly, does presentational markup facilitate the recognition of content objects?
The Simple Description Account
We begin with the characterization suggested above: presentational markup is a system for describing, or communicating, the existence and identity of content objects. If this is right, then presentational markup collapses into descriptive markup as far as illocutionary force is concerned (they both describe) although remaining distinctive in other ways: presentational markup is intended to support the canonical reader and so has characteristics specifically appropriate to that purpose.
A possible objection to this account is that it is inconsistent with the experience of reading. Imagine someone reading a book about whaling. On the simple description account the book contains both presentational markup that identifies and relates content objects, and natural language sentences that make zoological claims about whales. Is the reader simultaneously reading about content objects and also reading about whales? Or perhaps reading oscillates between the presentational markup identifying content objects and the narrative sentences of the text; the reader then combines these to form a fully realized understanding of the content being communicated. Simultaneous conscious consumption of narrative sentences and presentational markup would be avoided, but it might still be objected that this oscillation is inconsistent with the common experience of reading.
Although beliefs about our immediate experiences are sometimes regarded as being relatively privileged epistemically, we do not assume in what follows that our beliefs about the reading experience are accurate. Rather the line of reasoning presented explores how a supposed inconsistency might be accommodated.
The Simple Performative Account
A performative interpretation of the role of presentational markup may partially address the objection from reading experience. According to this view, presentational markup is not, strictly speaking, describing something in the sense of making a true or false assertion, but it is creating something. The presentational markup for a title creates a title; the presentational markup for a block extract creates a block extract. Or, alternatively, one might say that the presentational markup creates the textual elements title and extract.
In Austin’s familiar example of promising, a person who utters the first-person present tense sentence “I promise …” is not describing anything; they are promising something. That is, they are not making a claim that could be characterized as a true or false assertion about how the world is, as they would be if they had said in the past tense “Yesterday I promised …” or in the present tense but of someone else “She is promising …”. By promising they are creating an obligation, not describing one. On this account, presentational markup is a language for creating textual states of affairs, not describing them. The presentational markup for a title, for instance, accomplishes the titling of a document.
Nevertheless, because performatives still involve some form of propositional communication, they may not seem to directly address the objection from reading experience. Even if no proposition is asserted by the promiser, the audience still engages with the propositional content of the uttered sentence “I promise …”. On a performative interpretation of presentational markup, the situation would seem to be similar. The reader will recognize presentational markup as expressing, even though not asserting, propositional content. In this case the content would be “This is the title: …”, only now that content is intended as a declaration, not a description.
Unconscious Awareness
Perhaps the descriptions expressed in presentational markup are processed at a different cognitive level than the descriptions of whales. This processing need not be mysterious. When we return from a walk we can typically respond correctly to a very large number of questions about what we saw: gravel, asphalt, curbs, steps, grass, oak trees, roses, litter, handrails, automobiles, and so on. Given both the likely number and extreme variety of these easily answerable questions, it is improbable that in every case the corresponding concepts were in our occurrent consciousness at some point during the walk. Moreover, we would probably deny that we had any thoughts at all about most of those objects during the walk. Yet in some sense we were aware of those things; otherwise we would not be able to correctly respond to questions about what we saw. Moreover, we could not have succeeded in navigating our way around branches, over curbs, and up steps, grasping handrails, and so on if we were not aware of these things, and many more besides, even though, again, we had no conscious thoughts about them.
On this account the objection from reading experience is blunted because, while we are consciously aware of the assertions about whales, we are only unconsciously aware of the assertions about textual objects (e.g., “this is a title”). This seems consistent with the fact that someone may report that they learned the title of the article they just read even though “this is the title” was never a proposition in their occurrent consciousness. The reader recognizes that the title is a title, and recognizes that it is a title because they see that it is bold and centered, but none of the concepts title, bold, or centered was necessarily present in the reader’s occurrent consciousness.
Non-Propositional Experiences
A question arises though. When we are unconsciously aware of, e.g., a title, are we really, even unconsciously, seeing that some text is a title? This would seem to be the case if presentational markup is a system for informing us of the existence and identity of content objects, even if it is creating the things it is reporting. But, again, having so many propositional beliefs, even if unconscious, may still seem like too much cognition.
Perhaps our engagement with presentational markup is not only unconscious, but also fundamentally non-propositional in nature. The reader sees some centered text as a title, and they see that text as a title because they see it as centered. But they do not see that the centered text is centered, and then reason from that recognition to the further conclusion that the centered text is a title. On this account presentational markup continues to be causally involved in communication by creating a certain experience (seeing some text as a title), but not by expressing the proposition that some bit of text is a title. An awareness that is both unconscious and non-propositional might provide the sort of background experience we generally associate with presentational markup.
None of this prevents the reader from subsequently reflecting on their unconscious non-propositional experiences and acquiring the occurrent propositional knowledge that some text is a title. And of course, this also does not prevent the reader from retrospectively explaining their original experiencing of some text as a title by referring to the relevant presentational markup. The reader is correctly recognizing that the markup had a causal role in creating the experience, and the subsequent belief, and if they choose to (incorrectly) represent this experience as their inferring that some text is a title on the basis of their seeing that the text is centered and bold, we can forgive them a convenient fiction.[4]
Looking Ahead
Puzzles abound. Here are just a few:
What shall we say when a compositor’s error leads to the author’s name being set as if it were the title? If the performative is effective then that’s the title: the wrong title, but the title nonetheless. Perhaps, though, it is an edition of the work that has the wrong title, and the work retains its intended title. A somewhat different case is when the topic of an essay is used as the title deliberately, and across many editions, as with many classical texts, e.g., In Verrem.
Full stops and capital letters help us see orthographic sentence boundaries — or do they create (orthographic) sentence boundaries?
What should be considered part of the writing system? Presentational markup?
Presentational markup goes beyond facilitating efficient and reliable reading and often makes a determining contribution to the identity of textual objects: an extract is seen as an extract and not part of the preceding text by the indentation, or the last section in a chapter is seen to be a section and not a subsection of the previous section because of how its heading is set. However, cases where the contribution is at the sentence level and determines the proposition expressed by a sentence may stress our sense of categories like language, writing system, or markup. Consider sentences like “She married him?” set with the emphasis on one word vs. “She married him?” set with the emphasis on another. Varying the emphasis varies the question just as varying the verb would.
We have been tugging at just a string or two in a large daunting snarl of intricate problems, elusive concepts, and shifting categories. You are invited to help with the untangling. We hope we have followed Cassiodorus’s advice and left you at least a few positurae, paths for the mind and lights for the composition.
References
[Cassiodorus] Cassiodorus. Institutions of Divine and Secular Learning, and, On the Soul. Translated by James W. Halporn. Liverpool: Liverpool University Press, 2004.
[Coombs et al. 1987] Coombs, James H., Allen H. Renear, Steven J. DeRose. Markup Systems and the Future of Scholarly Text Processing. Communications of the Association for Computing Machinery, 1987, 30 (11), pp. 933-947. doi:https://doi.org/10.1145/32206.32209.
[Derose et al. 1990] DeRose, Steven J., David G. Durand, Elli Mylonas, Allen H. Renear. What is Text Really? Journal of Computing in Higher Education, 1990, 1, pp. 3-26. doi:https://doi.org/10.1007/BF02941632.
[Goldfarb 1981] Goldfarb, Charles. A Generalized Approach to Document Markup. Proceedings of the ACM SIGPLAN SIGOA symposium on Text Manipulation, 1981. doi:https://doi.org/10.1145/800209.806456.
[IBM 1978] IBM. Document Composition Facility: Generalized Markup Language (GML) Users Guide. IBM General Products Division, 1978.
[Genette 1997] Genette, Gérard. Paratexts: Thresholds of Interpretation. Cambridge: Cambridge University Press, 1997.
[Grice 1957] Grice, H.P. Meaning. Philosophical Review, 66 (3), 1957.
[Mak 2011] Mak, Bonnie. How the Page Matters. Toronto: University of Toronto Press, 2011.
[Nagy 2000] Nagy, Gregory. Reading Greek Poetry Aloud: Evidence from the Bacchylides Papyri. Quaderni Urbinati di Cultura Classica, n.s. 64, no. 1, 2000: 7. doi:https://doi.org/10.2307/20546621.
[O'Donnell 1979] O'Donnell, James J. Cassiodorus. Berkeley: University of California Press, 1979.
[Parkes 1993] Parkes, M.B. Pause and Effect: An Introduction to the History of Punctuation in the West. Berkeley, Calif.: University of California Press, 1993.
[Parkes 2008] Parkes, M.B. Their Hands Before Our Eyes: A Closer Look at Scribes. Burlington, Vt.: Ashgate, 2008.
[Renear et al. 2002] Renear, Allen H., David C. Dubin, C. Michael Sperberg-McQueen. Towards a Semantics for XML Markup. Proceedings of the 2002 ACM Symposium on Document Engineering, 2002, pp. 119-126. doi:https://doi.org/10.1145/585058.585081.
[Renear 2000] Renear, Allen H. The Descriptive/Procedural Distinction is Flawed. Markup Languages 2 (4), Fall 2000.
[Saenger 1997] Saenger, Paul. Space Between Words: Origins of Silent Reading. Stanford, Calif.: Stanford University Press, 1997.
[1] The first section, on Jerome, is an excerpt from I.12.4. The second section is an excerpt from I.15.12. See Cassiodorus and O'Donnell 1979. Cola and commata refer to the division of text according to metrical unit. See Parkes 1993.
[2] Although Goldfarb does not have a markup category for either punctuation or renditional features such as text size or style, he observes that formatting programs interpret spaces and punctuation as implicit markup in order to recognize such elements as words and sentences. Coombs et al. argue that, as both punctuation and renditional features have material manifestations, they are no more implicit than descriptive and procedural markup.
[3] Nagy 2000. See also Saenger 1997.
[4] Both unconscious awareness and non-propositional perception are, of course, topics of interest for psychologists and philosophers, and the problems raised here for presentational markup are part of the larger general project of explaining linguistic communication, whether textual or oral.