DeRose, Steven. “What is a diagram, really?” Presented at Balisage: The Markup Conference 2020, Washington, DC, July 27 - 31, 2020. In Proceedings of Balisage: The Markup Conference 2020. Balisage Series on Markup Technologies, vol. 25 (2020). https://doi.org/10.4242/BalisageVol25.DeRose01.
Balisage: The Markup Conference 2020 July 27 - 31, 2020
Models for XML documents often focus on text documents, but XML is used for many
other kinds of data as well: databases, math, music, vector graphic images, and
more. This paper examines how basic document models in the text
world do and do not fit a quite different kind of data: vector graphic images, and
in particular their very common application for many kinds of diagrams.
SOCRATES: Suppose that when a person asked you...
either about figure or colour, you were to reply, Man, I do not understand
what you want....Meno, he might say, what is that simile
in multis which you call figure, and which includes not only
round and straight figures, but all? Could you not answer that question,
Meno? I wish that you would try; the attempt will be good practice with a view to
the answer about virtue....
—
Plato, Meno
XML in the text world
Much XML usage involves text documents, where a large part of the content
consists of readable natural language sentences (though a large share also include
non-text parts[1]). We have a fairly well-developed understanding of how to model text
documents and of the tradeoffs involved. But XML documents are not all primarily
text; some have little or no text at all. This paper examines how common document
models we use in the text world fit one quite different kind of data:
vector graphic images, and in particular their very common application for many
kinds of diagrams. SVG is a common XML schema for such data, and I will use it in
many examples; but other schemas are also available.
Our models for text documents are not only orthographic but also conceptual,
categorizing parts of documents in terms of their rhetorical or broadly semantic
significance. Some such models enumerate essential features of documents: properties
whose equivalence seems necessary and sufficient for two artifacts to be considered
the same document for various purposes, roughly glossed as being the same work.
Renear and Dubin (2003) provide a cogent analysis of such sameness conditions.
Different models serve different needs, from printing to paleography to genre
studies. Some of the essentialist models are associated with the shorthand
OHCO, an acronym based on the claim:
(1) A document is an ordered hierarchy of content-based objects.
Order in (most) text documents seems largely uncontroversial.
Hierarchy merely notes that parts posited in a document can nest.
Content-based means that the objects' categories (ideally)
reflect conceptual significance rather than layout (for example
heading rather than big bold Helvetica). That is, the objects posited in the text are
of types intended to reflect the natural (or natural-language) kinds found in the text
itself, plus tables, illustrations, etc. when more than decorative. In these models,
the focus is on assigning document parts to classes (categories) based on their
rhetorical or ontological significance, not appearance.
Objects is reminiscent of Object-Oriented programming, but the
usage is not identical.[2]
One similarity is that document components fall into classes, which are (a)
defined for each application but (b) commonly mirror real world conceptual
categories. Document objects also typically have data such as XML attributes and
content, as well as self-identity and connections (interaction) with others.
As in OOP, document objects have behaviors, which are largely associated with
classes but commonly refer to properties of instances. Behaviors may be thought of
slightly differently, however: Document systems often isolate them from data, on the
rationale that a given class may need wildly different behaviors in different
contexts. A date object may be displayed in a special font and color when
printing, but spoken in a distinct tone by a speech interface. It may lead to wildly
different displays in a calendar tool; and it may be treated in yet other ways for
a retrieval system, or systems not yet imagined. Still, the fact of being a date,
personal name, heading, abbreviation, quotation, or cross-reference leads to some
behavioral expectations.
For most objects in very generic schemas as discussed earlier, the assignable
behaviors are commonly chosen from a small set largely involving font, color,
margins, and wrapping. A few types require less generic but still conventional
support: footnotes, index entries, auto-numbered lists, links, and so on.
Other much more specialized schemas commonly require specialized rendering and
other behaviors. Forms, MathML, SVG, and instrument telemetry schemas fall in this
class: rendering them with generic features such as font choices or colors is much
less useful than rendering abbreviations, emphasis, foreign words, and defined terms
using only those same choices.
Objects in documents may tend more strongly to be ordered than is typical in
OOP, and make different use of hierarchy. OOP typically organizes object classes
into is-a (ontological) subclasses, which inherit both attributes and methods. Some
OOP classes also organize instances into has-part (mereological) structures, but
many do not. Document models tend to reverse that focus: content objects are nearly
always nested, creating mereological structures, while the ontological relationships
between content object classes may be left to documentation, left implicit, or even
left unconsidered.
That is not to say ontology is absent or unimportant. The field treats of
superclasses all the time: lists, sections, links, form elements, block vs. inline
elements, soup, and so on. Many schemas developed conventions to
express subclasses: In SGML this was often done via parameter entities grouping
various elements or attributes and providing a name, such as "heading" in general,
or "soup". HyTime architectural forms, XML namespaces, HTML class attributes, and
XSD complex types do similar service.
XML and schema languages provide no means of determining whether elements are
content-based, either for text or for diagrams. However, OHCO and
similar models argue that such elements express more essential properties, and
document representations that use them better support preservation, update, and re-use.[3]
Modelling levels
OHCO and similar models of documents are often contrasted with three other levels of
abstraction:
Image representations, where even the
fact that some appearance is an instance of the letter e
requires an interpretative act;
Grapheme representations, where
orthographic symbols are explicit, including spaces and punctuation that
mark a few kinds of components. Symbols form a very small vocabulary, and
are rarely generatively combined or coined. This level represents very
limited objects (although conventional additions commonly creep
in).
Layout representations, where content is
grouped into geometric objects such as pages, columns, and text boxes
(possibly auto-wrapped, possibly absolutely positioned). Some regions may
correspond to content-based objects, but they are not those things per se --
for example, a rhetorical paragraph may be broken across regions (even
pages), with other regions geometrically but not logically between.
What I will call Object representations
or descriptive markup, where content is grouped into larger
meaningful units as described earlier.
Coombs et al. (1987) refer to procedural markup rather than
layout, focusing on the instructions often used to create a
layout, as distinct from the layout itself. Changing margins, font, justification,
etc. can be performed by selecting individual scopes and setting properties in a GUI
(as in many word processors); inserting program-like commands (as in troff); or
choosing a canned layout. The common factor is focus on physical layout for a target
medium. Fundamentally, people choose layouts to convey conceptual distinctions (as
well as for aesthetic reasons), not the reverse: one does not simply walk into text
turning layout into quotations, stage directions, or chapters because of appearance.
The levels of abstract models differ in a variety of specific features. One
feature to note about this taxonomy of models is that the objects used at each level
are drawn from different domains with very
different cardinalities, and whose members typically combine in quite distinct ways.
Respectively, the domains can be glossed as (1) colors; (2) orthographic symbols;
(3) symbols plus regions such as rectangles; and (4) rhetorical and
semantic categories.
A second feature is that these levels all order
their objects, but in different ways: 1 and 3 are dominated by spatial placement,
while 2 and 4 are dominated by nested linear orderings (which only sometimes
correspond to space).
A third key feature is the sort of
meanings the objects bear.
At the Image level, the units of representation are pixels or particles of ink,
which have practically no meaning singly. But all the higher levels involve
signs.[4] Orthographic systems seem to lack features aimed directly at
representing the visual regions of the Layout level, or the amorphous visual atoms
of Image level -- unless perhaps one counts lead type sorts.
A fourth key feature is whether objects are composed of other objects (not counting hierarchy in the grammar of
natural language itself).
Pixels per se form rows and columns, but many image technologies do not use them.
Graphemic punctuation can express a few nestable categories such as word, sentence,
paragraph, and quotation.[5] A Layout representation may be hierarchical or not. TeX can stack boxes
generally, and CSS has a nested box model, recursive table layouts, and absolute
positioning (thus enabling nearly any geometry). The Object level differs by
supporting hierarchical structures in general, rather than particular (linguistic
or geometric) ones.[6]
The Object level can represent objects that are not as easily excluded as part of
language. Authors often mark objects that are distinct in their content but not in
the linguistic system per se. For example, even users of FRESS would create macros
for components such as axioms, lemmas, etc. DocBook similarly
distinguishes many components of technical manuals, such as key-names, option
values, and so on.
Sometimes XML imposes hierarchy unnecessarily or unambiguously. For example, a
phrase that is both a foreign phrase and a definitional use of a term could be
tagged with either on the outside, intending no difference in meaning. Few
document representations other than LMNL and standoff markup directly provide for
truly coterminous elements, but doing so is not especially difficult (e.g.
<foreign+defn>). This edge case arises even in layout-oriented schemas that
differ in whether they provide only bold and italic, or also a combined 'bolditalic'
construct.
Beyond hierarchy
Natural language has numerous non-hierarchical and even unordered phenomena, such
as pronouns, coreference, alternative word orders, interrupted dialog,
cross-reference, and multiple attachment of noun phrases that serve roles in both
a main and a subordinate clause. In addition, there are often multiple perspectives
to annotate, which do not nest neatly with each other. Adequate document models must
give some account of such phenomena as well.
Syntax vs. model
It is well known that many general syntaxes can represent data even beyond their
design spaces. A pixel can be modeled as a field. Anything in computer storage can be
modeled by a long string of bits (and, at bottom, is), as is important for images.
Any arrangement of boxes and text can be represented by a list of a handful of basic
objects. XML can be represented by a linear sequence of SAX-like events, which are in
turn easily represented even in CSV. XML is particularly good for documents not because
of syntax details, but because its native constructs map readily to document models
which have proven useful for serious work with non-ephemeral text documents. Here
we will examine how and to what extent this holds true for diagrams as well.
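As a rough illustration of the point about syntax, the following Python sketch flattens a small XML fragment into SAX-style events and writes them as CSV rows. The three-column row layout (event, name, value), the class name EventRecorder, and the sample fragment are assumptions made for this illustration only; whitespace-only text is dropped for brevity, so the sketch is not fully lossless.

```python
import csv
import io
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Collect a flat list of (event, name, value) rows from SAX callbacks."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def startElement(self, name, attrs):
        self.rows.append(("start", name, ""))
        # Attributes become their own rows so that every row has the same shape.
        for attr_name in attrs.getNames():
            self.rows.append(("attr", attr_name, attrs.getValue(attr_name)))

    def endElement(self, name):
        self.rows.append(("end", name, ""))

    def characters(self, content):
        # Assumption for brevity: whitespace-only text is not recorded.
        if content.strip():
            self.rows.append(("text", "", content))


def xml_to_csv(xml_text):
    """Parse xml_text and return its event stream serialized as CSV."""
    recorder = EventRecorder()
    xml.sax.parseString(xml_text.encode("utf-8"), recorder)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(recorder.rows)
    return out.getvalue()


# Hypothetical sample document, not taken from the paper.
sample = '<p id="p1">Hello <emph>world</emph></p>'
csv_text = xml_to_csv(sample)
print(csv_text)
```

The CSV carries the same nesting information as the XML, but only implicitly, in the pairing of start and end rows; nothing in the CSV syntax itself corresponds to an element.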
The Essential Document
As mentioned earlier, the claim that some model is close to "what text is,
really" is largely a claim about essentials:
What remains across all versions of the same document?
What, if changed, suffices to make something no longer the same
document?
These questions depend on the purpose at hand: For some purposes, only the very same
physical artifact will do; for a slightly broader range of purposes, only a very
accurate facsimile. For many purposes, a document with the same graphemic text and
at least roughly similar layout is fine; for example, layouts that enable the reader
to infer the same categories for the same document portions. A large print edition,
different fonts, or another page size is not "different" for everyday purposes.
Consequent mechanical changes such as ligatures, soft hyphen use, and completely
different page decoration such as headers and page numbers, go a bit further. Even
renumbering figures, lists, or items may be a non-essential change.
The boundary between essential and inessential changes to a document can often be
illuminated through considering the visually impaired reader. A document printed in
Braille or read aloud is for many purposes "the same document", even considering
Braille symbols that signal concepts more like typography than orthography. Similarly,
an audio text must in many cases represent emphasis and text object boundaries
somehow.
Because text can use the full range of natural language abilities to refer to itself
and its properties, it is easy to construct cases where even small changes become
essential. A document may refer to its own formatting (a book about typography may
do this a lot), or an active reader may refer to layout details, as Fermat referenced
regrettably small margins. St. Paul even does this in manuscript: "See with what large
letters I am writing to you with my own hand" (Gal. 6:11). Thus, no property of a
text document is in principle non-essential.
On the other hand, even large changes can be inessential if they preserve the
consistency of cross-references such as "see page 27" or "section VII", and of guides
such as indexes and tables of contents. In short, changes must not impact readers'
ability to perceive intended assertions or distinctions.
For text documents, having the same sequence of text characters seems typically
essential, which is approximately the FRBR level of Expressions. Nevertheless, changes
to coded character sets, ligatures, hyphenation, modernizing away the English long
s (ſ), or changing between Hiragana and Kanji or Latin letters and Braille, could change
every character in the text yet may (arguably) not make an essential difference.
Document structure components pose issues similar to characters: Changing the encoding
or markup, or even changing the schema to have completely different names but the
same abstract meanings, seems as inessential as changing coded character sets. But merging
speeches in a play, turning a quotation into a plain paragraph, or messing with the
nesting relationships of sections, all seem essential.
Essentials
With respect to text documents, a key argument for OHCO-like models is that these
content-based objects are, like the words, essential to the text: they persist across
various expressions of the text, in a way that layout does not. For example:
Changing whether a given span of text is a quotation or not can radically change
the meaning, for example by causing the reader to associate certain beliefs, sentiments,
or actions with the wrong party.
Changing spacing or punctuation (which can be considered forms of markup) can change
meaning, as in "We are now here" vs. "We are nowhere", or the oft-memed "Let's eat,
grandma."
Losing the emphasis tag in "World hunger is not a problem"
completely reverses the meaning.
Cases like tagging <foreign> and <emph> both as <italic> also seem to
compromise the essentials of a document (even if that might not be visible to the
reader), much as substituting same-shape Greek or Cyrillic Unicode letters for Latin
ones seems misleading even though the result may be visually identical, or narrowly
transcribing a speaker who cannot pronounce certain phonemes distinctly.
Beyond such essentials, XML is commonly used to attach information that is not considered
part of the document at all -- for example, part-of-speech tags or other linguistic
annotations; metadata that typical readers never see; red-lining or other representation
of document editing history.
Isomorphism
OHCO-like models speak of making content-based objects explicit. This involves assigning concrete names (or numbers, etc) to abstract notions
such as quotation or section, and concrete scopes to their instances. In a particular representation such as XML,
it further involves representing those named objects with actual syntax. At all these
stages, alternative choices are possible, meaning that multiple concrete expressions
can represent the same abstract structure. For example:
Most syntactic representations define some details as not being information: comments,
variations in how line boundaries are coded, order of certain things, etc. Canonical
XML provides such a definition for XML.
The same content objects can have different names and delimiters: p vs. para vs. parafo.
Information can be shifted between syntactic constructs, such as
attributes vs. elements in XML (TEI P3 sic vs.
corr being one example).
Schemas may differ in how, and how finely, they distinguish related element types.
div may distinguish levels purely by being nested, or the schema may pack level numbers
into the name: div1, div2, etc.; this hardly seems an essential difference.
Containers for large units such as chapters and sections can be reliably inferred
from level-specific headings. Conversely, heading levels can be inferred from container
nesting. Thus, several different arrangements of such elements can express the same
abstract ordered hierarchy.
A document can be translated without loss to an inverted
representation where each vocabulary item occurs once, with a list of the positions
where it occurs.
It may be unintuitive to claim such a document is the same as one represented in a
more conventional order; but the information content is the same.
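Two of the equivalences listed above can be sketched in a few lines of Python: inferring containers from heading levels (and back), and inverting a token sequence into a vocabulary with positions (and back). All function names, the (level, text) input format, and the sample data are assumptions made for this illustration; the heading round trip only holds when no heading level is skipped, and the inversion assumes that whatever counts as essential has been made a token.

```python
def nest_by_headings(items):
    """Build nested sections from a flat list of (level, text) pairs.

    level >= 1 marks a heading at that depth; level 0 marks body content.
    """
    root = {"title": None, "children": []}
    stack = [(0, root)]  # (heading level, open section), outermost first
    for level, text in items:
        if level == 0:
            stack[-1][1]["children"].append(text)
            continue
        # A new heading closes every open section at the same or a deeper level.
        while stack[-1][0] >= level:
            stack.pop()
        section = {"title": text, "children": []}
        stack[-1][1]["children"].append(section)
        stack.append((level, section))
    return root


def flatten(section, depth=1):
    """Inverse direction: infer heading levels from container nesting."""
    items = []
    for child in section["children"]:
        if isinstance(child, dict):
            items.append((depth, child["title"]))
            items.extend(flatten(child, depth + 1))
        else:
            items.append((0, child))
    return items


def invert(tokens):
    """Map each distinct token to the list of positions where it occurs."""
    index = {}
    for position, token in enumerate(tokens):
        index.setdefault(token, []).append(position)
    return index


def restore(index):
    """Rebuild the original token sequence from the inverted form."""
    length = sum(len(positions) for positions in index.values())
    tokens = [None] * length
    for token, positions in index.items():
        for position in positions:
            tokens[position] = token
    return tokens


# Hypothetical sample data.
flat = [(1, "Chapter 1"), (0, "intro"), (2, "Section 1.1"), (0, "para a"),
        (2, "Section 1.2"), (0, "para b"), (1, "Chapter 2"), (0, "para c")]
tree = nest_by_headings(flat)
tokens = "to be or not to be".split()
index = invert(tokens)
print(flatten(tree) == flat, restore(index) == tokens)
```

In both cases the round trip returning the original is what justifies calling the two forms expressions of the same abstract structure.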
XML is used not only for text documents. It is also used for many other
kinds of data, by which I mean data differing much more fundamentally
than the already substantial differences between (say) DocBook and TEI, or HTML and
RSS: for example, music, databases, calendars, and knowledge representation; and, of
course, vector images.
How do the features and characteristics of text document models (and their essential
features) serve for analyzing such data?
Vector images are composed of discrete components of various types, with properties.
Rendered images such as photographs or freehand drawings commonly represent objects
that are discrete in the real world (such as fruit in a still life), but they are
often not discrete in the drawing: they overlap each other almost arbitrarily, and finding
objects is a matter of AI -- unlike most objects in text documents and vector image
representations.
Even before one considers the ontology of the objects depicted, this distinction has
huge effects for authors and recipients:
Vector objects can be scaled without degradation: the scaled image may have an
arbitrarily different number of pixels (or no pixels, for example if displayed
via a vector scope or pen plotter).
A drawn shape that is overlapped by others can still be picked up and moved
later. Each has a persistent existence. In that environment, consistent
appearance requires that objects are kept in some "stacking order": which is
drawn first matters.
Objects have identity, but also properties. A rectangle has a
certain position, dimension, color and thickness of border, and color of
interior (possibly including quasi-colors like transparent or
gradients). But those properties can be changed without changing the object's
identity.
Objects in vector images typically are instances of a category or class. In
many GUI applications the class is merely a shape: rectangle, star, etc., and
chosen from a visual menu. However, many applications include libraries of named
entities used in a certain field of work, each with a conventionally-associated
shape. Some applications (often ones with an accessible underlying program-like
language) allow users to define custom classes and subclasses, and a few allow
changing the class of a drawing object while retaining all applicable properties
(color, border line type, label, position, etc).
Vector objects have persistent identities as well as persistent existence.
This may be an internal code invisible to users, or a human-legible value (SVG
uses XML IDs).
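The properties just listed (persistent identity, changeable properties, stacking order, and reclassification) can be sketched as a toy data model. The class and method names here (Shape, Drawing, bring_to_front, reclassify) are hypothetical and are not drawn from any particular drawing application or from SVG.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: two shapes are the same only if they are the same object
class Shape:
    shape_id: str  # persistent identity, comparable to an XML ID
    kind: str      # class of the object, e.g. "rectangle" or "star"
    props: dict = field(default_factory=dict)


class Drawing:
    def __init__(self):
        self.stack = []  # back-to-front order: later shapes are drawn on top

    def add(self, shape):
        self.stack.append(shape)
        return shape

    def find(self, shape_id):
        return next(s for s in self.stack if s.shape_id == shape_id)

    def bring_to_front(self, shape_id):
        shape = self.find(shape_id)
        self.stack.remove(shape)
        self.stack.append(shape)

    def reclassify(self, shape_id, new_kind):
        # The class changes; identity and all other properties are retained.
        self.find(shape_id).kind = new_kind


drawing = Drawing()
first = drawing.add(Shape("s1", "rectangle", {"fill": "red", "x": 10, "y": 10}))
second = drawing.add(Shape("s2", "star", {"fill": "blue", "x": 20, "y": 15}))

first.props["fill"] = "green"         # a property changes, the identity does not
drawing.bring_to_front("s1")          # an overlapped shape can still be picked up
drawing.reclassify("s2", "ellipse")   # change of class, properties retained
print([s.shape_id for s in drawing.stack])
```

Nothing comparable is available for a raster image, where there is no object to find, restack, or reclassify.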
Vector images can usefully be divided into two subcategories.
Some are maps, in which the layout and often the scale are part of the information
being conveyed. A floor plan for a building is clearly in this class. The fact that
one room connects to another is necessary but not sufficient information; the actual
dimensions of each room, the position of doors and windows, and so on are necessary
as well.
Another way to think of this is that maps are not well represented as
mathematical graph structures. Creating a node for each room, hall,
closet, etc., and arcs connecting them might occasionally be useful,
but not for most floor plan uses. The same is true of a circuit-board layout (though
somewhat less so, since distances are only critical in some cases, such as extremely
high-speed interconnects or limiting EM interference, capacitance, etc.).
Diagrams, in contrast, have a much greater separation between layout and meaning.
An org chart is the same if it has the same people and jobs connected
in the same pattern. It may be sloppily laid out, with randomly differing box sizes
and colors, and lines crossing all over the place; that makes it ugly and annoying
to interpret, but little or no information has been lost. Electronic schematics and
exploded parts diagrams are also of the general kind (see below).[7]
Some connections may have an
order which is part of the essential information. The connections to a logic chip
must go to the right pins, and for some purposes genealogical charts must keep track
of the order of children and marriages. On the other hand, resistors in schematics
can go either way, and not all genealogical charts care about such orders. Even so,
the order of drawing need not always match the logical order.
In computer science, such diagrammatic information structures are called Graphs,
and they come in Directed and Ordered sub-types as needed. Instances (or layouts)
that preserve the same graph structure are isomorphic.
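A minimal sketch of this notion of sameness: reduce each drawing to its labelled graph, discard all geometry, and compare. The Box class, the function names, and the org-chart data are assumptions for illustration. The sketch relies on labels as node identities, so it tests equality of labelled graphs and not general graph isomorphism; it also ignores any ordering among the connections at a node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    label: str
    x: float
    y: float
    width: float = 100.0
    height: float = 40.0


def structure(boxes, connectors, directed=True):
    """Reduce a drawing to (nodes, edges), discarding position and size."""
    nodes = frozenset(box.label for box in boxes)
    if directed:
        edges = frozenset(connectors)
    else:
        # For undirected diagrams, (a, b) and (b, a) are the same edge.
        edges = frozenset(frozenset(pair) for pair in connectors)
    return nodes, edges


def same_diagram(a, b, directed=True):
    """True if two (boxes, connectors) drawings have the same labelled graph."""
    return structure(*a, directed=directed) == structure(*b, directed=directed)


# Hypothetical org chart, drawn tidily and sloppily.
tidy = ([Box("CEO", 200, 0), Box("CTO", 100, 100), Box("CFO", 300, 100)],
        [("CEO", "CTO"), ("CEO", "CFO")])
messy = ([Box("CFO", 5, 400, 30, 90), Box("CEO", 350, 20), Box("CTO", 10, 10, 250, 15)],
         [("CEO", "CFO"), ("CEO", "CTO")])
rewired = (tidy[0], [("CEO", "CTO"), ("CTO", "CFO")])
flipped = (tidy[0], [("CTO", "CEO"), ("CFO", "CEO")])

print(same_diagram(tidy, messy))     # layout differs, structure does not
print(same_diagram(tidy, rewired))   # a connection moved
```

The directed flag reflects the distinction drawn above: reversing the connectors matters for a reporting line, but not for a resistor in a schematic.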
As to order, diagrams are very much like OHCO text documents. Printing each act, scene,
speech and stage direction in Hamlet in a random font and
size, but preserving their ordered hierarchical relations, would maintain the same
Expression. Components can sometimes be moved even more freely, so long as the logical
order(s) of reading are not broken: figures can be floated, footnotes moved, etc.
And usually some re-arrangements are inessential, such as re-line-wrapping, or setting
in one vs. several columns.
What I am calling maps cannot be
abstracted in the same way.
Vector objects can be combined to represent any image: at the limit, any bitmap can
be represented by a large number of tiny shapes, as in CAD and animation models. However,
countless users use them for what I will call maps and
diagrams.
By maps I mean images that represent (often but not always to scale),
some real or putative object. For example, a building floor plan, a blueprint, a CAD
or wireframe model, or an electronic circuit board (as in Figure 1); or even an artistic
drawing. In maps, the layout is often essential: a circuit board is meant to be realized
physically, with copper traces and holes that have to match up with real-world
electronic parts. Perhaps in that respect they are more akin to copyfitting than to
representations of document structure. Thus, maps have a very substantial difference
from the way we tend to conceptualize text documents (at least in the XML world).
I will instead focus here on diagrams, which are much less essentially tied to
physical layout. The circuit diagram (schematic) corresponding to the
board layout of Figure 1 is shown in Figure 2. In this case, dimensions and positions
are nearly irrelevant: the operative information is the components themselves (a term
even more literally applicable here than in text documents), and their logical
connections. These are represented by conventional visualizations (actual resistors
are not jagged, nor diodes triangular).
In designing an electronic circuit, one combines components drawn from a variety of
categories such as chips, resistors, switches, and many more. Subclasses are important
in some cases, such as non-inductive resistors or electrolytic capacitors. Components
have both common properties (say, maximum voltage rating), and type-specific ones
(such as resistance). As with authors writing text, engineers design circuits by combining
and organizing components (each with its own meaning and purpose), into aggregates with
higher-level meaning and purpose. In both cases the components and their organization
matter a great deal, but their physical placement usually matters much less to the
design.[8]
Like most other vector drawing apps, a circuit design tool provides objects of the
necessary kinds (or a user with enough free time can create them). A single drawing
tool may or may not be suited for creating both of the figures above. If it is, it will
have quite different properties and rendering even for the same component in the two cases.
Such tools may or may not know anything about the legitimate combinations or
constraints, just as an XML editor may or may not know about validation.
At the most basic level, then, diagrams consist of objects in various categories
(for which particular images or shapes serve as proxies), connected in a certain way, and
rendered into some space. Both shapes and connections are often equipped with names,
numbers, and/or descriptive text. Particular shapes may be conventional symbols in
a domain, or simply assigned by the author. This is very similar to how content objects
are applied in text documents, and in both cases authors may opt to bend or break
conventions, even so far as laying out every individual object uniquely.
A Canticle for Leibowitz (Miller 1959) is a story set in a post-apocalyptic future, where monks preserve what
is left of technical and scientific knowledge. Brother Francis discovers the distinction
of structure and layout for blueprints:
The knowledge that the color scheme of blueprints was an accidental feature of those
ancient drawings lent impetus to his plan. A glorified copy of the Leibowitz print
could be made without incorporating the accidental feature. With the color scheme
reversed, no one would recognize the drawing at first. Certain other features could
obviously be modified. He dared change nothing that he did not understand, but surely
the parts tables and the block-lettered information could be spread symmetrically
around the diagram on scrolls and shields. Because the meaning of the diagram itself
was obscure, he dared not alter its shape or plan by a hair; but since its color scheme
was unimportant, it might as well be beautiful. He considered gold inlay for the squiggles
and doohickii.... When Brother Horner illuminated a capital M, transmuting it into a wonderful jungle
of leaves, berries, branches, and perhaps a wily serpent, it nevertheless remained
legible. Brother Francis saw no reason for supposing that the same would not apply
to the diagram.
In text documents, connections between document components are usually expressed by
proximity, size, and relative indentation; though numbering, boxing things together,
co-indexing (such as footnote numbers), and other methods also come up. In diagrams,
all of those may still be used, but connections are often represented by lines and arrows
of various styles, as well as proximity and geometric containment (boxes that group other
boxes). Both modalities commonly co-index sets of similar things by co-ordinating
the choice of font, color, and other obvious properties.
Diagrams are used in countless fields. Some familiar uses include slide
presentations, organizational charts, flowcharts (as well as UML diagrams, workflows,
state diagrams, etc.), schematics, some exploded parts diagrams, and even graphs and
charts.
Broadly speaking, vector graphics, like content-based document models, can be viewed
as an object-oriented notion.
SVG is an XML schema for vector graphic interchange that is widely supported. This
idea was so popular even at the start, that there were six competing submissions to
the W3C in the Web vector graphics space the year work on SVG began
(https://www.w3.org/Graphics/SVG/WG/wiki/Secret_Origin_of_SVG).
Figure 3 is an extremely simple diagram: two boxes, of different types, each with
a label, and joined by a connector. I will use this to illustrate many of the issues
involved in mapping between the level of diagrams in the abstract, and diagrams as
expressed in various apps and in SVG.
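For orientation, here is one hypothetical way such a two-box diagram might be written as SVG, generated with Python's standard library. This is not the paper's Figure 3: the choice of a rectangle and an ellipse for the two box types, the ids, the class values "process" and "store", the labels, and all coordinates are invented for the sketch.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_diagram():
    """Return an <svg> element holding two labelled shapes and one connector."""
    ET.register_namespace("", SVG_NS)
    svg = ET.Element(f"{{{SVG_NS}}}svg",
                     {"width": "320", "height": "100", "viewBox": "0 0 320 100"})

    def add(tag, text=None, **attrs):
        # Keyword names map to attribute names: class_ -> class, text_anchor -> text-anchor.
        names = {key.rstrip("_").replace("_", "-"): str(value)
                 for key, value in attrs.items()}
        element = ET.SubElement(svg, f"{{{SVG_NS}}}{tag}", names)
        element.text = text
        return element

    add("rect", id="boxA", class_="process",
        x=10, y=30, width=100, height=40, fill="none", stroke="black")
    add("text", "First", x=60, y=55, text_anchor="middle")
    add("ellipse", id="boxB", class_="store",
        cx=260, cy=50, rx=50, ry=20, fill="none", stroke="black")
    add("text", "Second", x=260, y=55, text_anchor="middle")
    # The connector is only a line whose endpoints happen to touch the two shapes.
    add("line", id="conn1", x1=110, y1=50, x2=210, y2=50, stroke="black")
    return svg


svg_text = ET.tostring(build_diagram(), encoding="unicode")
print(svg_text)
```

In this particular encoding, the labels are siblings of the shapes they label and the connector carries no reference to either box, so the association of label with box and of connector with endpoints rests entirely on coordinates.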
Most of the same issues of sameness and difference arise for diagrams as for text
documents:
Just how much can a diagram change, and in what ways, before it becomes a different
diagram? So the question arises: do such diagrams fit a conceptual model like that of
text documents? Is (2) true?[9]
(2) A diagram is an ordered hierarchy of content-based shapes.
To decide, we must first analyze the structure and implications involved.
Related work on diagrams and documents with XML
There is a substantial literature involving diagrams in relation to documents, but
much of it examines visualization of documents as
images. Eliot Kimber (2013) presented tools to map text documents to slide
presentations. Wendell Piez (2018) has used XSLT to generate fascinating document
views in SVG. Yves Marcoux (2008) created visualizations for GODDAG structures. Borovsky,
Birnbaum, Lancaster, and Danowski (2009) discussed visualizations for XML collections.
A different take by Hugh Cayless (2008) applied SVG to link images and transcriptions.
Liam Quin (2015) discussed diagrams for visualizing XML structure, and noted that
In practice any graphical representation will almost certainly be used for both
exploration and storytelling, and so we see the wisdom in Alberto Cairo’s
observation that there’s a continuous spectrum rather than two distinct sorts of
picture (Cairo 2013).
Tomokazu Fujino et al. (2004) discuss tradeoffs in using XML-based
graphics (SVG and X3D) in statistics:
The point that we would like to emphasize here is its portability. Many features
such as cooperation with other XML-based format and interactivity can be included
into one graphics file so it can be used not only as a part of an application but
also as an independent application itself. Even if it is generated on server side,
there is no traffic between the server and the client when the interactive function
works.... The big difference between FLASH and XML-based vector graphics would be
that one is closed binary file, and the other is XML format as open and standard
specification.
I find vector images (particularly diagrams) interesting as a modelling case because
they share some broad characteristics with text documents, yet have great differences
in the kind of information involved. For example:
Vector image drawing programs provide users with an inventory of familiar
basic objects (shapes), such as various boxes, arrows, or schematic symbols.
This is very like word processors providing paragraphs, footnotes, index
entries, etc., or XML schemas providing an inventory of element types.
Shape instances have properties that can be set, many of which end up
affecting how they appear: colors, line styles.
Shapes typically have associated text such as labels, reference numbers, etc.,
just as most objects in an OHCO text document do. Diagramming apps often support
simple formatting within the text, but rarely any notion of content markup or
even named styles within it.
Many vector drawing programs include "snap to regions" and glue points on
shapes, that provide points of attachment for other objects such as arrows,
labels, or other shapes. These bear some resemblance to explicit anchors in text
documents, but are usually defined on a shape class rather than instances.[10]
Vector images are commonly converted to Image representations, but can be
represented at any of the modelling levels discussed earlier.
Getting higher-level representations back from pixel ones (moving
up) is a very common practical difficulty for vector image
users, as it is for text document users.
Moving up levels is especially hard for computers; it is easier for humans
(perhaps because humans intuitively grasp implicit structures), but extremely
tedious to carry out.
Vector images frequently contain a lot of natural-language text, if only in
small snippets such as box labels, callouts, etc.
There are also some key differences. For example, most apps have a grouping operation,
with which one can select multiple objects and join them into one. This has little
or no
effect on appearance, but after grouping one can select, move, and resize them as
a
unit. Groups, however, are not typically typed objects themselves. One can group some
shapes, apparently making a new symbol for the diagram's visual vocabulary. However,
the group is (in many, perhaps most, applications) only an instance, not a class. It has
no name or associated data beyond that of its parts. In some apps a group can be cloned
by reference, in others copied, but neither operation works quite as in OOP. In most
drawing programs the set of shape classes is fixed, or modifiable only by writing
actual
code.
In SVG the <g> element accomplishes just such grouping. Unlike groups in many apps, an SVG
group can be assigned an ID, by which it can be instantiated elsewhere (and modified by
overriding some properties). However, this is only (so to speak) a second-class class: one cannot
add new elements to the SVG schema.[11] In contrast, with any XML schema language one can define a new type, and
then instantiate it as needed.
This is a crucial difference. In XML, if one defines a new kind of element (or even
a
sub-class using one or another technique), then one can instantiate it and changes
to
the class affect all instances. For example, a new style can attach to it and affect
all
instances. Copies of vector groups, in contrast, often bear no relation to their
original, so changing properties of one does not affect others. SVG can <use> any
object by ID, dynamically copying it elsewhere; but many properties cannot then be
overridden. This puts the user back in the same position as with raster graphics:
the group does not exist as a type, but merely as instances (which may
happen to still look similar, or diverge).
This leads directly to tedious work to modify a shape. As with procedural markup,
one must either find enough common features to characterize just
certain instances (which may be arbitrarily difficult, or even impossible); or check
everything by hand.
Drawing programs rarely enable modifying or sub-classing predefined shapes. For
example, a workflow tool may have steps of certain specific types such as
edit, verify, and approve. At one point
they might all appear as rectangles, but later (or for a different audience or
publisher) as distinct shapes or colors. In most vector drawing apps there is no notion
of a class of shapes apart from how a thing is rendered. That is, the app
may well define rectangle or diamond, but users cannot
subclass any of those with a new name and/or a new appearance.
Commonly, a shape (say, a diamond) cannot be converted to another
(say, a hexagon) with the same general properties. The best one can do is create a
new hexagon, and manually set each
property. One can then at least copy that hexagon;
though changing all the new hexagons to a new color will require selecting
every one -- they have no true name by which they can be controlled.[12]
Vector programs vs. vector APIs
Vector image drawing tools come in two broad forms, GUI and programmatic:
deploying shapes by sketching or choosing from menus (like WYSIWYG), or writing
programs that call functions to make and modify shapes.
So far I have mainly focused on vector diagrams as managed through GUIs: the
equivalent of WYSIWYG word processing. Vector graphics, however, can also be created
programmatically. Instead of dragging a rectangle from a tool bar and using various
mouse actions to modify it, one can write something in a (quasi-)programming
language, such as:
new Diamond(loc="10 10 50 50", label="Analyze", bgColor="#F00", lineColor="#00F",...);
This approach easily supports factoring out repeatedly-used sets of properties.
SVG has a style mechanism fairly similar to CSS, and even re-uses many of the very
same styling properties.
Many drawing systems are integrated into full-fledged programming languages, in
which case the programmer can create new abstractions outside of the vector system
itself. One can address the lack of shape classes by creating functions to
draw shapes and then calling them as needed. For the earlier example
of distinguishing edit, verify, and
approve shapes, one can create a class for each, with render() or other methods as needed. The functions can change over time, much
like an element's style definition for XML.
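As a minimal sketch of that idea, assuming Python as the host language and emitting SVG fragments as strings: the class names Step, Edit, Verify, and Approve, the outline() and render() methods, and the default sizes and colors are all hypothetical, chosen only to mirror the workflow example above.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class Step:
    """Hypothetical base class for one workflow step in a diagram."""
    label: str
    x: float
    y: float
    width: float = 80    # assumed default size
    height: float = 40
    fill: str = "#FFF"

    def outline(self) -> str:
        # Default look for every step type: a plain rectangle.
        return (f'<rect x="{self.x}" y="{self.y}" width="{self.width}" '
                f'height="{self.height}" fill="{self.fill}" stroke="#000"/>')

    def render(self) -> str:
        # The class name travels into the output, so the type stays visible.
        kind = type(self).__name__.lower()
        cx = self.x + self.width / 2
        cy = self.y + self.height / 2
        return (f'<g class="{kind}">{self.outline()}'
                f'<text x="{cx}" y="{cy}" text-anchor="middle">'
                f'{escape(self.label)}</text></g>')


class Edit(Step):
    pass


class Verify(Step):
    pass


class Approve(Step):
    def outline(self) -> str:
        # A later redesign: approvals become diamonds. Only this method
        # changes, and every Approve instance follows.
        x, y, w, h = self.x, self.y, self.width, self.height
        points = (f"{x + w / 2},{y} {x + w},{y + h / 2} "
                  f"{x + w / 2},{y + h} {x},{y + h / 2}")
        return f'<polygon points="{points}" fill="{self.fill}" stroke="#000"/>'
```

Here the three step types start out looking identical, yet remain distinct in the program and in the class attribute written to the output; changing how one type is drawn touches one method rather than every instance.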
SVG is similar to other
vector languages, but it has a few less-typical features, some of which naturally
follow from its XML base:
In SVG a shape instance (or a drawing of any complexity) can have an XML ID.
One can instantiate it at will by referring to the ID. Referring instances can
override some properties if desired. Because the referenced object is itself an SVG
construct, which has independent existence in the SVG drawing, changing it does change all instances. Instantiation can be multi-level.
One big difference, however, is that an SVG drawing tends to have a lot of
top-level objects, and their order matters far less than in most natural languages.
Scrambling the drawing order of vector shapes (or the SVG file order) makes no
difference except where they overlap. Most vector drawing systems address this by
a settable explicit
Z-order for shapes.[13]
SVG of course defines some XML elements and their semantics. But unlike text
documents, the elements have only fixed geometric meanings. Its most relevant
elements include text, circle, rect, line, path, textPath, polygon, g and use (the
grouping constructs mentioned above), and image (much like img in HTML). Thus,
SVG off the shelf operates essentially at the presentation or layout level. It is
not wildly inaccurate to think of it as procedural markup for drawing, with
g and use providing functionality somewhat like
troff macros.
Other elements define styles; deal with fonts, colors, and patterns; animate;
filter; and handle lighting. Attributes specify shape properties, apply affine
transformations, define IDs, and so on. But fundamentally, these are all in the
service of rendering a few basic shapes, which can be compounded into larger ones
(accomplished by the g (group) element).
SVG in typical drawing apps
SVG provides few native shapes, but there are customizable editors. Many vector drawing
apps provide enormous libraries of stock shapes, but also offer
export to SVG. What do these do, and how do they leverage SVG to model
their worlds? To examine the issues, we will use the extremely simple diagram shown
in
Figure 4.
This diagram was drawn with YEd, which provides box shapes as well as
multi-segment connector lines with arrows, and the ability to attach connectors to
boxes
(so the connector resizes if the box is moved). YEd can export SVG (discussed below),
but like many drawing programs, has a separate native format, in this case "GraphML"
(Brandes 2014).
I have shown YEd first because its model for diagrams is extremely focused on the logical
structure of diagrams, with GraphML's fundamental objects being simply node
and edge. Thus, it can express a graph structure entirely apart from
rendering or even labels:
This is about as bare-bones as one can get, but also (if labels are added) very close
to a respected approach to text documents: just structure, no rendering until downstream.[14]
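To illustrate how little a structure-only graph needs, here is a minimal sketch that reads a bare GraphML document with Python's standard library. The sample document is my own illustration (two nodes, one edge), not actual YEd output, and the function name read_graph is hypothetical.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

# Illustrative document only: no labels, no geometry, no formatting.
SOURCE = f"""<graphml xmlns="{GRAPHML_NS}">
  <graph id="G" edgedefault="directed">
    <node id="n0"/>
    <node id="n1"/>
    <edge id="e0" source="n0" target="n1"/>
  </graph>
</graphml>"""


def read_graph(text):
    """Return (node ids, (source, target) pairs) from a GraphML string."""
    root = ET.fromstring(text)
    ns = {"g": GRAPHML_NS}
    nodes = [n.get("id") for n in root.iterfind(".//g:node", ns)]
    edges = [(e.get("source"), e.get("target"))
             for e in root.iterfind(".//g:edge", ns)]
    return nodes, edges


nodes, edges = read_graph(SOURCE)
```

Everything a renderer would need (positions, shapes, colors) is absent, yet the graph itself is fully recoverable.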
Files created in YEd add a lot of information to specify box, label, and connector
formatting, but retain a basic structure like that above. Just as with many word-processors
that
export XML, YEd appears to simply write most (all non-default?) properties on every
shape instance, rather than writing style definitions once and then using them by
reference. When (as is typical) there are only a few distinct styles but each is
used very many times, this costs substantial space, though the excess is easy to remove
with sed or XSLT.[15]
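The text mentions sed or XSLT for this cleanup; as a hedged illustration of the same factoring, the sketch below uses Python instead. It assumes the repeated formatting lives in a small, known set of attributes (STYLE_ATTRS is my assumption, as are the function name factor_styles and the generated class names), and it overwrites any existing class attribute, which a real tool would need to handle more carefully.

```python
import xml.etree.ElementTree as ET

# Assumed set of formatting attributes worth factoring out.
STYLE_ATTRS = ("fill", "stroke", "stroke-width")


def factor_styles(root):
    """Replace repeated formatting attributes with shared class names.

    Returns a dict mapping each generated class name to its CSS
    declarations; a caller could write these into a <style> element.
    """
    classes = {}  # tuple of (attribute, value) pairs -> class name
    for elem in root.iter():
        found = tuple((a, elem.attrib.pop(a))
                      for a in STYLE_ATTRS if a in elem.attrib)
        if not found:
            continue
        name = classes.setdefault(found, f"s{len(classes)}")
        elem.set("class", name)  # simplification: replaces any existing class
    return {name: "; ".join(f"{a}: {v}" for a, v in key)
            for key, name in classes.items()}


root = ET.fromstring(
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<rect x="0" y="0" width="80" height="40" fill="none" stroke="#000" stroke-width="1"/>'
    '<rect x="0" y="60" width="80" height="40" fill="none" stroke="#000" stroke-width="1"/>'
    '<rect x="0" y="120" width="80" height="40" fill="#FF0" stroke="#000" stroke-width="1"/>'
    '</svg>'
)
css = factor_styles(root)
```

With three rectangles and two distinct looks, the formatting is stated twice rather than three times; the saving grows with the number of instances per style.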
What happens, then, when YEd exports this diagram to SVG? One might expect a fairly
direct mapping:
SVG has a native rectangle object and plenty of style features, so the left
box could just be there.
SVG lacks a native document symbol, but YEd could write out an SVG group
element that draws its own stock version, add an ID (say, "yed_shape_document"), and
then
use it as if SVG had it natively.
SVG does not have an obvious way to position anything relative to the position
of a prior object; much less an auto-routing capability. So the connector must
end up as something like YEd's <y:PolyLineEdge> element above.[16]
In fact, the result takes a bit more work (I omit namespace and some formatting
attributes for readability). Note that each box is drawn twice, in slightly different
sizes, perhaps to inset the filled inner region (despite there being no fill), from
the
outer frame. Even the curved bottom edge for the document icon is drawn twice, via
an
explicit path:
Although not shown here, if the document shape were copied and re-used in the diagram,
the whole is repeated, but with the path giving gratuitously different coordinates.
Thus the number of objects, and kinds of objects in the user's mind
are unlike the SVG (after all, I picked the icons from the default menu of shapes):
I
draw two shapes, but the SVG has four (not even grouped).
There is nothing in the SVG to say they are instances
of the same type. If the shape were defined and instanced, the
path could appear once even though its final coordinates (and possibly other properties)
differ. The user's model of what is the same and different
would approximate the file's model. Even failing that, if YEd's name for each shape
type were merely put on an attribute, the members of this equivalence class could
trivially be related and processed together.
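A minimal sketch of that last point, assuming an exporter wrote each shape inside a group carrying its shape-type name. The attribute name data-shape, the sample paths, and the function names are all hypothetical; none of the apps discussed here is claimed to write this.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Illustrative export in which every shape carries its type name.
SOURCE = f"""<svg xmlns="{SVG_NS}">
  <g data-shape="process"><rect x="10" y="10" width="80" height="40" stroke="#000"/></g>
  <g data-shape="document"><path d="M120 10 h80 v35 q-20 -10 -40 0 t-40 0 z" stroke="#000"/></g>
  <g data-shape="document"><path d="M120 80 h80 v35 q-20 -10 -40 0 t-40 0 z" stroke="#000"/></g>
</svg>"""


def shapes_of_type(root, shape_type):
    """Return every group whose (assumed) data-shape attribute matches."""
    return [g for g in root.iter(f"{{{SVG_NS}}}g")
            if g.get("data-shape") == shape_type]


def restroke(root, shape_type, color):
    """Recolor the outline of all shapes of one type; return how many changed."""
    groups = shapes_of_type(root, shape_type)
    for group in groups:
        for child in group:
            child.set("stroke", color)
    return len(groups)


root = ET.fromstring(SOURCE)
changed = restroke(root, "document", "#00F")
```

With the type name present, selecting the equivalence class is a one-line filter; without it, the same selection means recognizing shapes from raw path geometry.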
SVG does not provide a native way to associate a label directly with a shape -- for
example, via an attribute such as label="a big box" or
labelRef="#textObj12". So one can hardly blame YEd for having separate
text objects. However, those could easily be grouped with the corresponding boxes
(thankfully the text object seems always to be written to the SVG immediately
afterward).
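Because the label follows its shape in file order, the grouping could even be recovered after the fact. The following is a rough sketch under that assumption only; the function name group_labels and the list of shape tags are mine, and the doubled-up boxes noted above would defeat it (only the second copy of each box would be paired with the label).

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
SHAPE_TAGS = {f"{{{SVG_NS}}}{name}"
              for name in ("rect", "path", "circle", "polygon")}
TEXT_TAG = f"{{{SVG_NS}}}text"
GROUP_TAG = f"{{{SVG_NS}}}g"


def group_labels(parent):
    """Wrap each shape and the text element right after it in one <g>.

    Returns the number of groups created. Other children keep their order.
    """
    children = list(parent)
    for child in children:
        parent.remove(child)
    grouped = 0
    i = 0
    while i < len(children):
        current = children[i]
        following = children[i + 1] if i + 1 < len(children) else None
        if (current.tag in SHAPE_TAGS and following is not None
                and following.tag == TEXT_TAG):
            group = ET.SubElement(parent, GROUP_TAG)
            group.extend([current, following])
            grouped += 1
            i += 2
        else:
            parent.append(current)
            i += 1
    return grouped
```

This relies purely on adjacency, which is exactly the kind of fragile inference that an explicit group written at export time would make unnecessary.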
To the reader I leave the design of a program to discover what shapes paths
should be mapped back to; though even if that were solved, the user might desire
different concepts that at one time or another happen to be rendered as the same shape,
just as in text documents authors may want multiple content objects all rendered as
italic.
YEd's native format is near the extreme of treating diagrams as mathematical graph
structures: the basic objects are just nodes and edges. Yet merely writing an SVG
group surrounding each node and edge as they are exported to SVG, with a name for the
relevant shape type, would greatly aid search, transformation, conversion to other
formats, and many other commonly-desired operations, making them trivial rather than
very difficult.
Similarly, factoring out highly-reusable things like definitions of how to draw their
shapes and factoring out formatting attributes would greatly decrease file sizes,
ease
readability and reusability, and make the stored components directly correspond to
those
the user created in the first place.
Perhaps, however, despite its very sparse and simple model displayed in GraphML, YEd
is just unusually poor at export? How do other drawing apps stack up on similar metrics?
Inkscape
Inkscape is a native SVG editor. It provides few shapes beyond SVG's native ones,
though it has a wide range of image operations and drawing tools. Like YEd, it has
no
automated connector routing. Labels are not a property of other shapes, but must be
manually created as separate text objects, then positioned and grouped.
Drawing the same diagram is a bit harder due to the lack of a pre-made document shape.
Once drawn, however, one can clone the shape and move or resize the clone. Unfortunately,
many properties of the clone cannot override those of the original (for example, color
or label text). So cloning doesn't work as well as it might.[17]
LibreOffice is an open source office suite. Its drawing component is not a native
SVG
editor, but can import and export formats including SVG, Visio, and others. For the
same
drawing, created natively and exported to SVG, the result is approximately as shown
below (edited for readability).
This is largely similar to YEd's output, but has one distinct advantage: It encloses
each object in a group (g), and assigns it a class which identifies it at
least as a LibreOffice shape. The com.sun.star.drawing.CustomShape uses
Java-style hierarchical naming. If the program merely added one more component, naming
the particular shape, portability would be easy. Names already exist for the user
interface, and probably have localizations in all of LibreOffice's language versions.
But even without that, at least there is one particular element in the SVG file
corresponding to each object the user drew.
Summary
SVG is XML applied to vector images. Some images (such as maps) are fundamentally
about their image, and thus quite unlike nearly all text documents as we commonly
think
of them. However, many vector images are about structures of related units. Examples
include org charts, flowcharts, schematics, and countless others. These are much more
similar to text documents, in being composed of discrete units that are members of
conceptual categories. The categories are often conventional in a given field, and
have
fairly conventional ways of being visualized. To make a diagram, such units are
instantiated, organized (commonly hierarchically), and placed in relation with each
other. All this is quite similar between text documents and diagrams.
Many relations are expressed by proximity and order in both domains. However, many
diagrams make heavy use of lines and arrows for connections, which are relatively
rare for text documents. On the other hand, text documents more often use hyperlinks
for long-distance connections, which are uncommon (but hardly absent) in diagrams.
For diagrams (though not for all vector graphics), an XML representation can easily
be
devised in which the elements correspond intuitively to the objects the user created.
This could be done either by creating a schema for each domain, or a generic schema
such
as <shape class="decision".../>.
However, a variety of current vector image drawing applications do not write SVG
anything like this. One, YEd, writes GraphML, an XML schema that is much like this;
but even it writes very different SVG. Apps writing SVG are reminiscent of word processing
around the time descriptive markup became prominent (hand-crafted SVG can easily be
better):
In practice, every vector drawing program has its own structures, tools, and
representation, and it is hard to move data from one to another.
Translations often fail to retain essentials of users' work, retaining only
the raw geometry.
Because of this, it is very difficult to edit or otherwise process such
SVG.
SVG written out from apps rarely treats even the app's own native objects as
first-class SVG objects (for example, by defining and naming each shape, then
using them).
The difficulty does not seem to center around syntax, verbosity, or readability.
Rather, it seems to arise from a conceptual mismatch between the user and the software:
The user forms beliefs about what things there are, and in what groups (classes?)
they
go. Drawing apps' behavior gives strong justification for those beliefs through
the way it makes one operate, such as picking named kinds of things from a menu or
being unable to change the kind (shape type) of existing objects. Yet the
beliefs are almost always false. When one exports to a format which is widely known
as portable and capable of directly representing such ontological structures, the
result is not readily mappable to the user's model.
This seems very like procedural markup (despite SVG being an XML application): expression
of the user's conceptual categories is implicit or indirect.
The user's notions of which kinds of things are around, and how many, bear only indirect
relationships to the software's notion. Thus, reconstructing something even roughly
isomorphic to the user's model requires AI.
This kind of conceptual mismatch is hardly limited to documents and diagrams. Norman
(1988, 2013) has written extensively about such model mismatches in general. A
well-known example he gives is a refrigerator with these controls:
The manufacturer labelled two knobs, corresponding to a division the user knows and
uses: the freezer vs. the refrigerator compartment. The natural assumption is that
each
knob controls the temperature in the corresponding compartment. As Norman states:
A good conceptual model allows us to predict the effects of our actions. Without a
good model,
we operate by rote, blindly; we do operations as we were told to do them; we can't
fully appreciate why, what effects to expect, or what to do if things go wrong....
There is no need to understand the underlying physics or chemistry of each device
we
own, just the relationship between the controls and the outcomes.
Sadly, the intuitive suggested model can be wrong, as in this case. In fact one
control sets a thermostat in one of the two compartments, while the other apportions
cold airflow between compartments. Without the correct model in mind adjustment is
nigh
impossible; even with it, it is difficult because the relationship between the control
combination and the typically-desired effects is complex and indirect. This is much
like
the relationship between two identical document shapes in most of the SVG shown, or
between various ways of achieving a visual effect (say, for a block quote) in troff
or
another procedural formatter.
For the refrigerator user, the desire is almost always to make one compartment more
or
less cold; almost never to repartition an abstract amount of coldness between
compartments. Yet the common goal is very hard to achieve, while improbable goals
are
easy.
Similarly, when saving or exporting data of any kind, one might wish merely to ensure
that the data exists for re-use by the same application; in that case anything works.
But very often indeed, one wants to give the data to someone: a co-worker who may
or may
not have the same drawing program; themselves at a future time; a client who can only
handle certain formats; or even an archive. It will likely be possible to load SVG
such
as we have seen, but only to edit it after a fashion similar to editing a PDF or
page-scan. Many things will not work as one might wish:
YEd's doubled-up boxes are not a unit of any kind, so picking one up and
moving it doesn't move the other.
Connectors are not attached to objects, so do not follow when objects move.
Auto-routing varies greatly in general, and several of these apps do not
auto-route at all -- but a poorly routed connector that still connected the
right objects would normally be correct, even if unattractive.
When objects are drawn individually and have no overt type, they do not
readily map to corresponding objects in a target application. For example, most
drawing programs provide a document shape similar to that in Figure 1 – but in
no case tested will their saved SVG enable another app to use its corresponding
shape. This has practical consequences: if one imports SVG and then adds a
document symbol at the destination, it will have a completely different look --
as if one added text to an imported text document, but could not get it into the
same font as the imported text.[18]
The wide variety of complex SVG that is written even for simple cases does not mirror
a real complexity in the user's model, and that mismatch leads to problems.
The potential solutions, however, seem slightly different here than with text
documents. Text work commonly uses a two-level model: The level of conceptual objects,
and the level of layout. The conceptual objects are dictated by the demands of a domain
(are there "stanzas" or departure times or code listings or notes?). Layout is
dictated by medium and by aesthetic and design choices, and executed mostly
automatically. OHCO then suggests that the former is the text, while the latter is
epiphenomenal.
Does this approach suffice for diagrams? Diagrams do have conceptual objects, often
in
standardized sets (indeed, many drawing apps have named shape libraries for various
applications). It seems no harder to create a schema for a flowchart than for a poetry
collection. Many shapes used in diagrams have connection requirements: A choice box
in
a flowchart might expect at least one in connection, and exactly two
out connections, while vastly more complex rules can be stated for
schematics.
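Such connection requirements are straightforward to state and check once shapes have types. The sketch below is a minimal illustration; the rule table, the shape-type names other than the choice box, and the function name check_connections are assumptions of mine rather than any standard.

```python
from collections import Counter

# Hypothetical rules: shape type -> (min in, max in, min out, max out).
# None means "no upper bound".
RULES = {
    "choice": (1, None, 2, 2),  # at least one in, exactly two out
    "start": (0, 0, 1, 1),
}


def check_connections(nodes, edges, rules=RULES):
    """nodes: {id: shape type}; edges: [(source id, target id)].

    Returns a list of human-readable problems; empty if all rules hold.
    Shape types without a rule are not checked.
    """
    incoming = Counter(target for _, target in edges)
    outgoing = Counter(source for source, _ in edges)
    problems = []
    for node_id, shape_type in nodes.items():
        if shape_type not in rules:
            continue
        min_in, max_in, min_out, max_out = rules[shape_type]
        checks = (("in", incoming[node_id], min_in, max_in),
                  ("out", outgoing[node_id], min_out, max_out))
        for direction, count, lowest, highest in checks:
            if count < lowest or (highest is not None and count > highest):
                problems.append(
                    f"{node_id} ({shape_type}): {count} {direction} connection(s)")
    return problems
```

A check like this presupposes that connectors are attached to typed shapes, which is just the information the exported SVG examined above does not retain.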
Diagrams also have layout, and as with text documents, this is more a matter of
aesthetics and design. Much diagram layout is manual (though not in tools like
Graphviz), but some is automatic. Connector routing is often automatic even though
it is
far more complex than most text-layout algorithms. Features like snap-to-grid can
be
enabled to impose simple layout rules if desired, and there are manually-invoked but
automatically executed processes such as lining up all selected objects, spreading
them
evenly along a path, and so on.
However, some large differences remain. With text documents, nearly all the
formatting properties of objects are separable, and applied uniformly to all instances
of a type of component. "Type" need not mean element type name, but can depend on
class attributes, context, or other intensions.[19] Diagram software only rarely provides a comparable mechanism, where the user
can define (say) three named types of connector lines, each with its own color, stroke
pattern, arrowhead style, or other properties – and then use them for three distinct
purposes, retaining the ability to change their look en masse. Many drawing programs
do
not even provide a way to define and modify color schemes – a familiar problem when
a
new projector, lighting, and venue affect contrast and require re-coloring an entire
presentation on short notice.
Each drawing tool has one or more palettes of shapes it knows about, usually all with names.
But many do not facilitate user construction of new shapes. And if they have SVG export
at all, they may not create SVG objects for each of their types, but only for each
of
the instances. It takes little effort to just write a shape the first time and put
an ID
on it (ideally, a readable one based on the user-visible shape name); then to just
<use> it by reference.
In fairness, SVG could make this a bit easier for SVG generators with a few features
such as relative positioning and more general property overrides.
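The define-once, use-by-reference pattern described above can be sketched as a tiny exporter. This is an illustration under stated assumptions, not how any of the apps discussed works: the LIBRARY of shape names and path data, the shape_ ID prefix, and the export function are hypothetical. The xlink:href form is used for references, as in SVG 1.x.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)

# Hypothetical shape library: user-visible name -> path data at the origin.
LIBRARY = {
    "process": "M0 0 h80 v40 h-80 z",
    "document": "M0 0 h80 v35 q-20 -10 -40 0 t-40 0 z",
}


def export(instances):
    """instances: iterable of (shape name, x, y). Returns an SVG root element.

    Each shape type is written once into <defs> with a readable ID;
    every instance is a <use> that refers to it.
    """
    svg = ET.Element(f"{{{SVG_NS}}}svg")
    defs = ET.SubElement(svg, f"{{{SVG_NS}}}defs")
    defined = set()
    for name, x, y in instances:
        if name not in defined:
            ET.SubElement(defs, f"{{{SVG_NS}}}path",
                          {"id": f"shape_{name}", "d": LIBRARY[name]})
            defined.add(name)
        ET.SubElement(svg, f"{{{SVG_NS}}}use",
                      {f"{{{XLINK_NS}}}href": f"#shape_{name}",
                       "x": str(x), "y": str(y)})
    return svg


svg = export([("document", 10, 10), ("process", 120, 10), ("document", 10, 90)])
```

Three instances yield two definitions and three references, so the file's inventory of types and instances matches what the user picked from the palette.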
Does an OHCO-like model fit SVG?
1: Order
In text documents, there is typically an overall reading order to all the text
(users of course may read in any order they like), though there are various
exceptions.
In vector graphics, it may seem that shapes are not ordered at all, unless as
communicated by the structure of flow diagrams, electronic schematics, etc.; and by
Z-order.
2: Hierarchy
In text documents, many content objects contain others.
In diagrams, shapes are composable into other shapes. Composition in diagrams
seems also to be of two kinds. First, many drawing programs provide a grouping
operation that makes new composite shapes. These can then be instantiated in
various sizes and places, and even combined into further shapes. Second, end users
frequently place shapes inside other shapes to express relationships, such as
several staff forming a team, or several components forming a sub-assembly. Only the
second of these is comparable to text document authors assembling paragraphs, lists,
sections, speeches and the like (or even tables). The first is more like a schema
designer creating new complex types.
3: Content-based shapes
With text, most of the content-based objects have names that are familiar to many
literate people: chapter, section, paragraph, list, and quotation are widely known
(if imprecise) concepts. A large share of content-based objects are the subject of
sections in style manuals, and taught in primary school.
Specialized domains of all kinds add their own vocabulary to English and other
languages. Specialized document genres such as semiconductor data sheets, invoices,
*nix man pages, and many more, introduce objects based on their
specific content needs, such as pin definitions, extended prices, command and option
names, and so on. This is unremarkable.
Many kinds of drawings are similarly conventional. People draw box-and-arrow
diagrams on napkins all the time. Flowcharts use standardized shapes to represent
processes, choices, documents, and other concepts.[20] Long before that, engineers
assigned conventional shapes to components such as resistors and batteries (as well
as abstract electrical notions such as ground) in order to draw
schematics. And in a reasonable sense, literacy itself is the result of assigning
conventional shapes to symbolize sounds and/or meanings. This differs from artistic
or idealized representations that are more iconic than symbolic.
What could we gain by applying OHCO concepts to drawings?
A common frustration in preparing technical drawings is maintaining consistency. For
example, a flow diagram or business process description, or for that matter a
slide presentation, commonly uses a small number of distinct concepts.
This problem is much the same as the problems for which XML was devised, albeit with
differences in detail. But in drawing, software often represents only the primitives
out of which the symbols are created, rather than the various kinds of symbols
themselves. A more descriptive approach might use different schemas for different
symbolic domains, with a more primitive drawing level filling a role comparable to
CSS or XSL-FO. This can fairly easily be implemented for a given domain, with XSLT
transforming the domain drawing schema to SVG, GraphML, or another schema with
direct rendering support.
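The text proposes XSLT for this transformation; purely for illustration, the sketch below does the same job in Python. The domain document, its diagram and shape element names, the fixed shape size, and the renderer functions are all assumptions, with the generic shape-plus-class form standing in for a real domain schema.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical domain document: concepts only, no geometry beyond position.
SOURCE = """<diagram>
  <shape class="process" label="Analyze" x="10" y="10"/>
  <shape class="decision" label="OK?" x="10" y="80"/>
</diagram>"""

W, H = 80, 40  # assumed fixed shape size


def draw_process(x, y):
    return ET.Element(f"{{{SVG_NS}}}rect",
                      {"x": str(x), "y": str(y),
                       "width": str(W), "height": str(H)})


def draw_decision(x, y):
    points = (f"{x + W / 2},{y} {x + W},{y + H / 2} "
              f"{x + W / 2},{y + H} {x},{y + H / 2}")
    return ET.Element(f"{{{SVG_NS}}}polygon", {"points": points})


# The presentation layer: one renderer per conceptual category.
RENDERERS = {"process": draw_process, "decision": draw_decision}


def to_svg(diagram):
    """Map each domain <shape> to an SVG group holding outline and label."""
    svg = ET.Element(f"{{{SVG_NS}}}svg")
    for shape in diagram.iter("shape"):
        kind = shape.get("class")
        x, y = float(shape.get("x")), float(shape.get("y"))
        group = ET.SubElement(svg, f"{{{SVG_NS}}}g", {"class": kind})
        outline = RENDERERS[kind](x, y)
        outline.set("fill", "none")
        outline.set("stroke", "#000")
        group.append(outline)
        label = ET.SubElement(group, f"{{{SVG_NS}}}text",
                              {"x": str(x + W / 2), "y": str(y + H / 2),
                               "text-anchor": "middle"})
        label.text = shape.get("label")
    return svg


svg = to_svg(ET.fromstring(SOURCE))
```

Swapping the RENDERERS table changes how every decision or process is drawn without touching the domain document, which is the role the text assigns to the more primitive drawing level.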
Off the shelf, the state of the art for drawing appears slightly better than the
reactionary stage lamented by Coombs et al. in 1987. Although the
readily available models may not be ideal, we at least have an open syntax, so it
is
much more feasible to get in and change things to one's liking, than with unlamented
binary file formats. Being XML, SVG makes raw, syntactic portability simple.
Implementations do not need to change their models, or even their user interfaces,
much at all to achieve far greater portability (in both directions!) and ease of
use. They can merely take their existing, fairly object-oriented model of
what drawings are, and map it directly into more usable SVG usage
conventions; in many cases this mapping is probably easier than what they are doing
now, because after a modicum of setup it can be much more direct.
References
Borovsky, Zoe, David J. Birnbaum, Lewis R. Lancaster and James A. Danowski. 2009.
The Graphic Visualization of XML Documents. Presented at Balisage:
The Markup Conference 2009, Montréal, Canada, August 11 - 14, 2009. In
Proceedings of Balisage: The Markup Conference 2009. Balisage
Series on Markup Technologies, vol. 3 (2009). doi:https://doi.org/10.4242/BalisageVol3.Borovsky01.
Brandes, Ulrik; Eiglsperger, Markus; Lerner, Jürgen; Pich, Christian. 2014.
Graph Markup Language (GraphML). In Tamassia, Roberto (ed.). Handbook
of Graph Drawing and Visualization (PDF). CRC Press. pp. 517–541. ISBN-13:
978-1138034242. http://cs.brown.edu/people/rtamassi/gdhandbook/chapters/graphml.pdf
Cairo, Alberto. 2013. The Functional Art: An introduction to
information graphics and visualization. New Riders. Cited in Quin (2015).
Cayless, Hugh A. 2008. Linking Page Images to Transcriptions with
SVG. Presented at Balisage: The Markup Conference 2008, Montréal, Canada,
August 12 - 15, 2008. In Proceedings of Balisage: The Markup
Conference 2008. Balisage Series on Markup Technologies, vol. 1 (2008). doi:https://doi.org/10.4242/BalisageVol1.Cayless01.
James H. Coombs, Allen H. Renear, and Steven J. DeRose. 1987. Markup
Systems and the Future of Scholarly Text Processing. Communications of the Association for Computing Machinery 30 (11),
pp. 933-947. doi:https://doi.org/10.1145/32206.32209.
Renear, Allen and David Dubin. 2003. Towards identity conditions for
digital documents. In DCMI '03: Proceedings of the 2003 international
conference on Dublin Core and metadata applications: supporting communities of discourse
and practice -- metadata research & applications. September 2003. https://www.academia.edu/2796396/Towards_identity_conditions_for_digital_documents.
Durand, David, Elli Mylonas, and Steven J. DeRose. 1996. What Should Markup Really
Be? Applying Theories of Text to the Design of Markup Systems. In Joint International Conference: ALLC/ACH. http://xml.coverpages.org/DurandWhatShouldTextBe-ALLC1996.pdf.
Ferraiolo, Jon. 4 September 2001. Scalable Vector Graphics (SVG) 1.0
Specification. World Wide Web Consortium. https://www.w3.org/TR/SVG10.
Tomokazu Fujino, Yoshikazu Yamamoto and Tomouki Tarumi. 2004. Possibilities
and Problems of the XML-Based Graphics in Statistics. COMPSTAT 2004: Proceedings in Computational Statistics. 16th
Symposium held in Prague, Czech Republic. Jaromir Antoch (ed). Section: Internet based
methods: 1043-1052.
Halliday, M. A. K. 1985. Spoken and Written Language.
Victoria: Deakin University Press.
Kimber, Eliot. 2013. General Architecture for Generation of Slide
Presentations, including PowerPoint, from arbitrary XML Documents. Presented
at Balisage: The Markup Conference 2013, Montréal, Canada, August 6 - 9, 2013. In
Proceedings of Balisage: The Markup Conference 2013. Balisage
Series on Markup Technologies, vol. 10 (2013). doi:https://doi.org/10.4242/BalisageVol10.Kimber01.
Marcoux, Yves. 2008. Graph characterization of overlap-only TexMECS and
other overlapping markup formalisms. Presented at Balisage: The Markup
Conference 2008, Montréal, Canada, August 12 - 15, 2008. In Proceedings of
Balisage: The Markup Conference 2008. Balisage Series on Markup Technologies,
vol. 1 (2008). doi:https://doi.org/10.4242/BalisageVol1.Marcoux01.
Norman, Donald A. 2013. The Design of Everyday Things: Revised and Expanded Edition. New York: Basic Books. ISBN 978-0465050659. 1st edition: 1988.
Peirce, Charles Sanders. 1998. The Essential Peirce. Volume
2. Eds. Peirce Edition Project. Bloomington IN: Indiana University Press.
Cited in Peirce’s Theory of Signs in The Standard
Encyclopedia of Philosophy, Nov 15, 2010. https://plato.stanford.edu/entries/peirce-semiotics/.
Piez, Wendell. 2014. Hierarchies within range space: From LMNL to
OHCO. Presented at Balisage: The Markup Conference 2014, Washington, DC,
August 5 - 8, 2014. In Proceedings of Balisage: The Markup
Conference 2014. Balisage Series on Markup Technologies, vol. 13 (2014). https://doi.org/10.4242/BalisageVol13.Piez01.
Piez, Wendell. 2018. Fractal information is. Presented at
Balisage: The Markup Conference 2018, Washington, DC, July 31 - August 3, 2018. In
Proceedings of Balisage: The Markup Conference 2018. Balisage Series on Markup Technologies, vol. 21 (2018). https://doi.org/10.4242/BalisageVol21.Piez01.
Quin, Liam R. E. 2015. Diagramming XML: Exploring Concepts, Constraints
and Affordances. Presented at Balisage: The Markup Conference 2015,
Washington, DC, August 11 - 14, 2015. In Proceedings of Balisage: The Markup
Conference 2015. Balisage Series on Markup Technologies, vol. 15 (2015).
https://doi.org/10.4242/BalisageVol15.Quin01.
[1] There are many edge cases between text and non-text, such as tables,
form labels, index entries, program code examples, alt text for figures,
etc.
[2] Definitions of OOP vary considerably. Merriam-Webster
(https://www.merriam-webster.com/dictionary/object-oriented%20programming)
describes objects which communicate with each other, which may be
arranged into hierarchies, and which can be combined to form additional
objects. Wikipedia notes that objects have data, in the
form of fields (often known as attributes or properties), and code, in
the form of procedures (often known as methods).... objects have a
notion of 'this' or 'self'..., and that objects interact and
are class-based.
[3] This very basic model has been refined, extended, or replaced in various
ways; for discussion of a few, see Durand (1996).
[4] Some characters do have substantial meaning even in isolation:
mathematical symbols, ? µ ι @ $ ¶, etc. A few alternate with explicit
content objects such as word, sentence, or paragraph.
[5] Depending on what counts as different characters (in digital form, the
encoding), additional distinctions may be managed. For example, Unicode's
"Mathematical" alphabet variations (italic, bold, sans serif, double-struck,
etc) could be used to represent emphasis.
[6] An interesting edge case arises when specialists annotate documents by
making grammatical structures overt. It can be argued whether this adds any
content objects to a document, since (to the extent the
analysis is accurate) they were already implicit in the linguistic content.
The phenomena certainly seem essential, but whether marking them up overtly
should be taken to change the document into another seems unclear.
[7] As mentioned earlier for text documents, it is possible for diagrams to make reference
to their layout, slightly blurring this distinction.
[8] There are significant exceptions, such as components that produce or are
sensitive to electromagnetic interference, or parts that must communicate
rapidly.
[9] A diagram may include text labels that are not in boxes at all. One can take
these to have an implicit shape (their bounding box), or adjust (2) to permit
some non-shapes.
[10] A tighter analogy might be if XML schemas provided affirmative places
within a given element type, where anchors could be placed: for example
a definition list's model might only permit inbound (or outbound)
anchors to (or just before) terms, not definitions. This
could be accomplished with various schema
languages, but is not common.
[11] One could, of course, define an entirely new schema that includes any classes
one likes, and then transform it to SVG as a step in rendering.
[12] A few applications may be exceptions, and in handcrafted SVG this is easier.
[13] SVG 1.1 drew things in document order; SVG 2 adds nestable stacking contexts.
[14] Though I will not discuss it further here, Graphviz (https://graphviz.org) has
a quite similar sparse node/edge model, with a similarly large set of options to
affect rendering, and a wide variety of automatic layout methods.
[15] This seems an odd choice, because it wastes far more space than XML's
purported verbosity ever did, makes consistency hard to discover or maintain,
and violates programming dicta such as "don't repeat yourself."
[16] However, Soueidan (2016) notes that one can simulate relative
positioning by nesting an entire <svg> within another, which
establishes a nested coordinate system that is then placed into the
container coordinate system
(https://www.sarasoueidan.com/blog/mimic-relative-positioning-in-svg).
[17] One can clone just the shape, not group it with a text
object, then add a separate text object that happens to overlay each clone. But
this fails to express the relationship.
[18] Yes, that does happen (or nearly so) with some word processors; but
rarely if ever with markup languages of any kind; and if it does, it is
relatively easy to fix compared to, say, the path-interpretation problem
discussed above.
[19] As noted earlier, there are exceptional element types that cannot be
adequately rendered or interpreted by generic tools, such as math, HTML canvas,
and the interactive aspects of links and forms.
[20] Flowchart symbols reached much of their current form with ISO 5807 (1970), in turn based
on a 1960 ANSI standard.