Quin, Liam R. E. “Diagramming XML: Exploring Concepts, Constraints and Affordances.” Presented at Balisage: The Markup Conference 2015, Washington, DC, August 11 - 14, 2015. In Proceedings of Balisage: The Markup Conference 2015. Balisage Series on Markup Technologies, vol. 15 (2015). https://doi.org/10.4242/BalisageVol15.Quin01.
Balisage: The Markup Conference 2015 August 11 - 14, 2015
Liam is on the staff at the World Wide Web Consortium, working
from home in rural Ontario.
Copyright ®2015 W3C® (MIT, ERCIM, Keio, Beihang)
Abstract
The growth in popularity of “Big Data” has brought an increased
interest in visualization; that is,
in software that draws graphical representations of data in the hope
that a human observer can spot emergent patterns. Visualization takes
data and shows a usually incomplete view.
People working with XML documents may have large amounts of data to
visualize, in which case some of these techniques can be very effective.
People may also wish to explore constraints on the XML documents, such as those typically
expressed in an XML DTD or schema.
This paper will focus primarily not on data visualization of trends
but on explicit visual representation
of relationships: that is diagrams
that illustrate a DTD schema, an XML instance, a universe of possible
instances, and relationships between these things. Techniques used will
focus on producing diagrams intended for use in Web browsers using
libraries such as D3.js.
An underlying motivation for this work is to explore affordances
peculiar to XML documents: to ask, what makes XML special, or, what
would need to be reinvented if one were to abandon XML for other systems.
A visualization in general is a
spatial presentation of data or information: the effectiveness of a
visualization depends on the human ability to process what is seen and to
recognize patterns. The term includes diagrams and drawings of concepts,
illustrations, and, of particular interest here, data visualizations, in
the sense of Owen 1987.
A data visualization (for the purpose
of this paper) is the act of (or the result of) presenting data using an
automated algorithm (possibly with manual intervention or tweaking).
Vis
A diagram (for the purpose of this
paper) is a visualization of information that is explicitly designed by a
human: one might think of the distinction between a visualization in
general and a diagram in particular as similar to the difference between a
graph plotted based on a formula and a drawing.
The object of this paper is
to explore ways to create visualizations and diagrams relating to XML documents and
their schemas,
diagrams that are specific to XMl and the ways that XML is used.
The goal of these graphics is that they be in some measure:
Elucidatory
that is, that they form a part of an explanation of the XML in some way;
Pedagogical
that is, that the viewer may learn from the graphics;
Æsthetic
that is, pleasing to the eye.
Where possible, the graphics should also be accessible; that is, interactive features should be
available for example to someone using assistive technology such as speech
to text or a keyboard with no mouse or pointing device. These graphics are
an alternate representation of schemas, DTDs and XML documents, and it
does not always make sense to try to make them accessible to blind users
where the alternate text description would typically be the DTD or other
schema fragment being illustrated, but they should certainly work for
people who are colour-blind for example.
The intended audience of the diagrams consists primarily of
people creating documents and people trying to understand potentially very large XML
schemas;
some of the diagrams are also visualizations of collections of XML documents and information
within them and may be of more general interest.
Diagramming an XML Document
Before starting to draw pictures we should understand the nature of an
XML document. In a later section we will explore Document Type
Declarations and various sorts of schema languages and constraint systems,
but we should start with something as simple as possible.
The term XML document is defined only
indirectly in the XML Specification: A data object is an
XML document if it is well-formed, as
defined in this specification. In addition, the XML document is valid if
it meets certain further constraints. However, we also have,
The function of the markup in an XML document is to describe
its storage and logical structure and to associate attribute name-value
pairs with its logical structures
XML 1998. In other words we have a pragmatic test for
whether something is an XML Document rather than a clear statement of what
the term XML Document might mean. We read
also that an XML Document has two aspects (and, lacking a third, cannot be
divine): its storage structure, also called physical, and its
logicalstructure. The physical aspect is sometimes referred to as an XML
file, although in fact a single XML document might be formed from any
number of computer files and a computer file much contain data for
multiple documents.
An XML Document, then, may exist at several levels. Some of these are
as follows:
A representation inside computer storage (for example, a
sequence of blocks on a hard drive)[1];
A stream of data that forms a sequence of binary-encoded numbers,
usually each of 8 bits (that is, having a range from 0 to 255),
achieved by interpreting the binary numbers as “bytes;”
A sequence of Unicode characters, achieved by interpreting the
byte stream as being a sequence of encoded characters in which one or
more binary numbers are combined to form each Unicode character[2];
An in-memory representation of an XML document, perhaps accessible
using the methods described in the XDM XDM 3.0 or DOM ; the XML
specification does not mandate any particular storage
mechanism;
Note that an application is free to build any data structure its designer chooses;
this does not have to be complete. A program that extracts colours used by an SVG
diagram
would obviously not need to store the rest of the input, even temporarily.
This representation is achieved by parsing the sequence of Unicode characters using
the grammar rules in the XML specification;
An on-screen rendering of the in-memory representation,
achieved for example using a CSS-based tree-renderer such as a Web browser,
an XML editor’s user interface, an SVG graphics editor, and so on;
A printed representation of the XML document,
perhaps achieved using XSL-FO, XML or HTML with CSS,
proprietary tools such as InDesign, or with some other toolset.
We can start with any of these; which is appropriate depends on our
purpose. For example, someone developing a parser may want to see a
representation of the input stream as a sequence of characters, to debug
the mapping from character encodings to Unicode. Most readers of this
paper are probably using existing XML tools so we will start instead with
a logical representation.
The diagram in Logical Representation is remarkable partly in its ugliness but chiefly in how
little it tells the viewer. It shows a parent-child relationship
between a book element and its contained title and chapter
elements. The intent of such diagrams is usually two-fold: first, to give an indication
of document hierarchy;
second,
to reassure the newcomer to XML that the logical model differs from the so-called
“physical” model of markup characters and text. In fact, such a distinction is not
as clear-cut as the beginner might like.
A problem with this diagram is that the individual lines are not significant; neither
are the boxes.
Figure 2 (arguably) conveys the same information more succinctly.
We draw boxes and lines perhaps because they are familiar from other people who draw
boxes and lines, and perhaps influenced by a variety of boxes-and-lines diagrams in
computing.
But this is the first of a number of examples in this paper where
the “traditional” computer science paradigm is not
a perfect fit for XML documents. Boxes are used to group things, but
we only had one item in each box. Lines are used to indicate explicit parent/child
references but in fact XML documents do not have such things. Instead, the parent/child
relationship is inferred from containment.
A diagram showing a representation closer to how an XML document might
be stored in computer memory is shown in Figure the Third, where the arrows represent direct reachability: an element object
might contain a pointer or reference to a linked list of contained
elements. Such a diagram is useful for computer programmers implementing
XML-based systems but not so useful for authors and schema designers. The
programmers may prefer (or need) more complete information, too, and,
again, we will return to this with a UML-style example. Note that from a
data structure perspective an implementation of a relationship that can be
followed in either direction typically involves a pair of pointers, using
more memory than unidirectional links and possibly requiring more care on
the part of the programmer to maintain both ends of links
correctly.
In considering how XML documents may be stored we have strayed from
logical models of XML documents to physical, where we started this section.
The primary audience for this paper is not programmers writing their own
XML parsers, since there are already a great many XML parsing libraries.
The simple diagrams we have shown so far do not show the textual content of a document,
nor
do they show XML attributes. One reason for this[4] is that a horizontal tree quickly runs out of room, both in print
and on a computer screen. Transposing the tree can make more room. In Figure 4
XML containment has been indicated with nesting. Attributes are still not shown.
One of the goals of this paper is to apply some basic principles of
graphic design to XML diagrams. We’ll gradually introduce ideas from graphic design
in the course of the paper, but let’s start by adding some design flexibility.
One way to get more room for the text in our diagram is simply to rotate it, as shown
in
Figure the Fifth. This does save space but at the expense of readability.
So far we have used only position to show hierarchy.
Other ways to indicate hierarchy use size and colour. One way to use colour is to
invoke
artistic landscape composition and the associated phenomenon of human perception of
distance,
in which the brain perceives parts of an image coloured red and more highly saturated
to be closer to the viewer than objects coloured blue and with less intense colour.
The example in Figure the Sixth misuses this in a naïve manner.
The problem is that, in a transcription of a printed book or manuscript,
element names are generally less important in the hierarchy than the text,
but the diagram shows them as large, warm-coloured foreground labels.
A similar problem is common in highlighting colour schemes for computer programming
languages: the keywords are often shown in bold, with variable and function names
in grey or italics.
It should if anything be the other way round.
In this section we have used some simple diagrams to illustrate an XML document.
Our purpose has primarily been to lay foundations for later sections:
these diagrams are not XML-specific. In the next section we will contrast generic
diagrams with diagrams specific to a specific subject matter or domain.
Schemas and Potential Documents
XML has the distinction of being the only markup system in widespread
use today in which documents are commonly constrained in terms of what they can contain. An XML
vocabulary can define not only a set of names, such as invoice, payee,
total-amount, item, but also a set of rules that limit the content of
elements, their placement, the relationships between them, and their
quantity. One might want a rule that if you add up all of the individual
item prices the total must match. A text-book publisher might insist that
documents to be published in a particular series or imprint contain
information on author, title and date at the start, together with chapters
that each have a title and one or more paragraphs followed by a closing
summary and student questions.
Some people come to an XML schema wanting to understand it for the
purpose of changing it; others want to write software that processes
conforming documents; in most cases (one hopes), the majority of people
approaching the schema want to create conforming documents.
A graphical representation of the constraints surrounding an XML
vocabulary could be a visualization,
intended for exploration, or it could be a diagram, created by a designer for expository purposes. In
practice any graphical representation will almost certainly be used for
both exploration and storytelling, and so we see the wisdom in Alberto
Cairo’s observation that there’s a continuous spectrum rather than two
distinct sorts of picture Cairo 2013.
Let’s begin with considering a particular schema. For the purposes of
this paper we will use the term schema to
mean any set of constraints on documents, whether prescriptive or
descriptive. Rather than draw the entire schema we’ll attempt to draw the
content rules for a single element.
Elements in XML have associated attributes with values, they have
contained elements, and they may also have immediate textual content. It
is unusual for vocabularies to constrain the use of processing
instructions of XML comments and although it sometimes done we will not
consider it in this paper, because when it is done it is usually a special
case of constraining elements. So our first attempt at drawing an element
surrounds it with its attributes and the child elements it might contain;
this is shown in Figure the Seventh.
The vocabulary illustrated is NISO JATS, although that is not of
particular concern in this paper. For our purpose what matters is the idea
of illustrating a schema. This is not a new idea; a review of older
diagrams shows us a wide range, from largely textual depictions of SGML
DTDs shipped with SoftQuad Panorama™ in 1994 through Microstar Near and
Far Designer diagrams all the way to complete UML and entity-relationship
diagrams. A Near and Far Designer style diagram[5] is shown in Figure the Eighth (redrawn by the
author using Inkscape).
The Near and Far Designer diagram shows several things at the same
time: the front element has a tilde after
its name to show that an instance of that element in a document can accept
attributes; a question mark at the start of a name indicates optionality;
an asterisk indicates a group of elements of which any number (including
zero) can appear in any order in a sequence. There is no explicit
indication of sequence (it is implied by vertical juxtaposition) and no
immediate indication of what any of the sub-elements might contain (in the
application this could be obtained by actuating the represenation of an
element, wherupon the application would insert the diagram for the element
concerned, expanding the tree. This diagram format was marketed primarily
for use by people working directly with an SGML or (later) XML document
type definition and doing document analysis rather than for people editing
a document, even though many people working with documents also found it
useful.
The principle of progressive
disclosure suggests that a diagram intended for authors
should not go beyond immediate needs and should not show grand-child
elements. But this would fail to take into account the goal that an author
might have to insert an element that is not shown. In addition, in many
XML vocabularies, most elements can be given attributes such as xml:lang or id
so the indication of attributes is not very useful. Figure the Ninth shows an initial attempt by the author to
produce an alternative design. Here, the hierarchy of potential
containment is shown with text size as well as colour, and the lines are
in colour to exploit a foreground/background visual illusion, in that they
can be seen as primary (with a reddish tint) or as background (thin, not
saturated, actually purplish). The names of potentially-contained elements
have been rotated in a rough spiral around the names of their putative
parents. From a design perspective this makes them secondary: the goal is
to give indication of the elements beyond the current focus. This could be
combined in an interactive application, for example by highlighting the
elements that can directly contain text; Figure the Tenth does this using colour, but now something non-obvious has been
introduced that requires a secondary explanation. In addition, neither
sequence nor the presence of attributes is represented in these village
diagrams; only the fact of possible containment.
The goal of Village diagrams is to give a quick indication of
potential containment. The diagrams make little attempt to show the
ordering of potential child elements, leaving that task to authoring
software: in a content model of the form (a, b, c, (d | e | f)*, g, h, c)
the initial elements are placed in required order but all other potential
children are sorted alphabetically; the diagrams are to be read clockwise
starting at the top of each cluster. These diagrams may appear
“friendlier” or less intimidating than full UML-style diagrams, or even
than the Near and Far Designer pictures, to people less accustomed to
thinking in terms of complex abstract containment, athough this hypothesis
has not been tested in usability studies.
We will return to Village Diagrams when we discuss interactivity, and
we will return to the relationship between a document and its schema when
we go beyond two dimensions. This section has illustrated a distinction
between a tree constructed from a particular XML document instance (an
actual tree) and a (possibly unbounded) set of possible trees implied by a
schema. The difference means that representing a schema as a tree can be
misleading: any given element might be allowed to appear in multiple
points in a hierarchy in an instance document so that the grammar implied
is not a strict tree: an element may have more than one potential parent.
None the less, trees have been used for representing information for
hundreds of years and their familiarity should not be underestimated.
Domain-Based Visualizations
The Extensible Markup Language, taken with the entire “XML Stack” of
specifications and technologies, is really a complex system for defining
domain-specific markup languages. These are sometimes called a vocabulary or, using an older term from SGML
days, a tag set. The term vocabulary is
used in this paper to mean a set of element names, together with attribute
names, used in a particular way for any particular sort of document. A
vocabulary often has constraints, such as requiring that an invoice has a
total amount. Languages for expressing these constraints will be discussed
in a later section under the generic name of schemas.
Consider the situation in which one is given a set of XML documents
and assigned the task of understanding them in some way. You might need to
know which element names were used, or which attributes appear on which
elements, or what is the longest single line of text, or whether any 8-bit
characters occur. But more likely is that your task is to understand some
things about the information represented by the documents.
For sighted people, an efficient way to understand a lot of data is
often to have the computer draw a picture of the data. Pictures and
diagrams that are used to explore possible relationsips in data rather
than being used to explain already-understood relationships to other
people are called data visualizations.
Figure the Eleventh is a chord diagram generated by
extracting cross-references from a dictionary of biography; the arcs
represent entries that mention other entries. This sort of diagram is
excellent for quickly finding “clusters,” or groups of related data—in
this case people referred to often or people who refer to many others. The
entry for Raphael in this diagram, for example, is connected to the entry
for Peruguino and also to that for Duerer. A link from the article for
Peruguino makes a connection that is here easily visible but might
otherwise, in a dictionary with approximately 10,000 entries, be
overlooked. A similar diagram showing places mentioned could be overlaid
to give an idea of the likelihood that people worked together or knew one
another.
A chord diagram can be a useful adjunct to other navigation systems for a large
corpus, especially when generated automatically. It can also be made interactive;
we will explore this in the section on interactivity later in this paper.
Important for our purpose here is that we can exploit a mixture of XML markup,
domain knowledge about the document to find which element represent a person’s biography,
and schema awareness to find ID values.
Representing Paper
Figure the Twelfth shows a visualization made by
overprinting successive pages of a book. The top of the figure is fairly
regular, with shadows caused be ascenders and descenders; towards the foot
of the page one can see the double-column shadows of footnotes. The faint
line at the very bottom is where a double-page spread was set longer,
probably to avoid an awkward heading placement on the next page.
The example of overlaid printed pages shows how a visualization might
be of use to different people. If you show a printed book to an
experienced printer or publisher you’ll see them lift the book up to the
light and peer through the paper. They are testing the quality of “back
up,” the alignment of lines of type on both sides of the paper, because
care taken in this affects the amount of “show-through,” distracting and
unsightly shadows from the past and future appearing between the lines of
the text.
Beyond Two Dimensions
We have seen examples of diagrams in which text is rotated, either as
a partial circle or,
as in the village diagrams, at other angles. In this section we will examine some
visualizations that use three spatial dimensions.
A three-dimensional diagram can be used to provide extra space,
or to show increased detail in the centre of a picture without entirely
losing the information at the edges.
This sort of diagram is a simulated three-dimensional projection.
Figure the Thirteenth shows a village diagram projected onto the surface
of a three-dimensional surface, giving an egg-like effect. This turns out to have
more visual appeal than practical application, but similar “hyper-focal” elliptical
projections
have been used in commercial product, including SoftQuad’s HoTMetaL Pro™, with
some success. What makes these projections useful is that you can move the “lens”,
or rather, move the “paper” beneath a fixed lens, to magnify different parts of
the diagram at pleasure.
Another approach to a third dimension is to construct virtual three-dimensional models
which a user can then manipulate, examine, rotate and explode.
Figure the Fourteenth shows a deconstructed view of a single page of a printed
book. In the figure, the page apparatus, annotations and cross-references, the central
Biblical
text and the commentary are each represented as a separate layer. This is a three-dimensional
model created using the libre software package
Blender. A Web-based object renderer can show this object
and allow a user to rotate it, to zoom in and view it from different angles, and,
like the present
writer, to become lost and confused. The figure uses a background gradient in order
to help
orient the viewer; this is especially helpful if the view becomes accidentally inverted.
Until such time as user interfaces for Web based model viewers are designed
with user experience in mind, three dimensional work seems to be of limited utility.
In addition there is today difficulty in software portability, although Web browser
support
for 3D object rendering is improving and, at the time of writing, becoming portable.
Interactivity and Animation
The Chord diagram in Figure the Eleventh was generated
using XQuery (with the BaseX implementation) to generate a JSON document
read by a JavaScript that uses the D3 library. The layout mechanism used
is a parametrized force directed layout Meirelles 2013
pp. 64ff. Thomas 2015 contains a worked example that may
be instructive.
It is possible to generate a D3 diagram from XML rather than
from JSON, and even to use D3 to animate or lay out an existing SVG document. However,
almost all of the easily-available examples use JSON, at least in part because of
the
simplicity of loading JSON in JavaScript.
Using JavaScript in the Web browser brings an added dimension to the diagrams.
We can animate them and make them respond to the mouse pointer.
Figure the Fifteenth is drawn and laid out using the D3.js JavaScript library;
it shows that an article element can contain
sub-article,
front,
body and
floats-group sub-elements and gives indication
of what each of those elements may in turn contain.
in this version the text labels are all horizontal, but the final layer of potential
containment are again drawn in a spiral. They use an italic typeface because
italic is economical of space and has relatively high readability when displayed
at an angle.
If you drag a mouse pointer over one of the element groups in the diagram,
nothing will happen, because this paper uses static reproductions of the diagrams.
However, in a live version the sub-elements are brought to the “foreground” by
becoming darker, as shown in Figure the Sixteenth
A combination of interactivity and animation can be effective in
encouraging users to explore a diagram. An alternate way to present a tree
is to use nested circles to indicate containment, and Figure the Seventeenth
shows a circular treemap illustrating DTD-based potential containment.
This diagram also includes attributes (prefixed with an @-sign), and marks
optionality and cardinality with *, ? and + as in a DTD. Circular treemaps use
a lot of space, unfortunately, but they may be helpful in teaching people
about element containment.
A more traditional presentation of a DTD uses a tree;
Figure the Eighteenth shows such a tree, drawn using
D3.js and software to parse a DTD and generate JSON; it would be
possible to write that software in JavaScript and have the Web browser
read the DTD directly, or to transform an XML Schema using XSLT.
Figure the Nineteenth shows the same diagram after
a user has interacted with it by pressing some of the boxes to the right of the labels.
Pressing once expands, and pressing again hides the sub-tree.
A space-saving tree layout is used, creating a relatively compact view
despite the apparently wide spacing. The expansion actually occurs over
approximately one-fifth of a second, so that there is a strong sense of
expansion and collapse. The motion turns out to be quite rewarding and
strongly encourages exploration.
The interactive nature of a Web browser with JavaScript can blur the distinction
between document and application. The power this gives must be balanced by the responsibility
to create Web pages and applications that people can actually use.
Interactivity, Foreground and background
The chord diagram from the previous section is already complex enough
that it becomes difficult to use on smaller screens. An interesting
application of chord diagrams might be an interface to allow users to
explore containment in a schema, showing the might-contain and
might-be-within relationships. But in a large schema the circle would be
too large.
One way to mitigate visual complexity is to use the human vision
system’s ability to discriminate between foreground and background. The
use of colour has already been mentioned in the discussion of Village
Diagrams, with warm colours (red, orange) being perceived as closer to the
viewer than cold colours (blue, green). The use of focus can also help the
vision system to “pre-process” an image and draw attention to pats
perceived as foreground: blurring the edge of items as a function of
distance leaves the sharp-edged foreground items prominent.
Complex three-dimensional diagrams benefit greatly from simple
artistic techniques such as systematic colour choice and blurring to
simulate distance. Although the three-dimensional diagrams are too complex
to achieve with D3 and JavaScript in the scope of a short paper, the
techniques also apply to simpler diagrams. Blurring and fading in
particular can be an effective alternative to “progressive disclosure,”
indicating where further information is available but without distracting
the user. A sharpening technique with colour value is used in the
interactive Village Diagrams to bring an element’s potential children to
the foreground when the user indicates interest.
Related Work
People have been writing programs to draw diagrams based on SGML and
XML documents. Wendell Piez, David Birnbaum, David Dubin, Michael
Sperberg-McQueen and others have presented diagrams at Balisage and the
preceding series of conferences, Extreme Markup.
The use of tools such as D3 with XQuery is not new, although using the
D3 library to draw XML-specific relationships appears at best
uncommon.
The primary contributions in this paper (the author hopes) are the use
of the Web browser and JavaScript to make interactive diagrams and
visualizations; systematic use of principles and technique from the fields
of graphic design and representational art; an emphasis on some
XML-specific sorts of relationships such as potential containment coupled
with a minimalist reductionism to try to make the diagrams simple and
clear.
Conclusions
XML Documents exist as part of a rich ecosystem of schemas, constraints,
potential documents and actual documents, transformations and relationships.
Data visualization and diagramming techniques can help people to perceive
and understand relationships in new ways.
The Open Web Platform has matured to the point where Web browsers can
display SVG and can perform sophisticated visualizations that can be
created, modified and even animated with JavaScript; JavaScript libraries
such as jQuery and D3.js simplify the work of doing this considerably.
This paper has shown examples of visualizing: information in XML documents;
XML documents and their structure; XML documents and their relationships with
XML Schema constraints;
and also the universe of potential XML documents that an XML Schema defines.
Some of these visualizations are unique to constrained markup languages such
as XML with a Schema, and perhaps also help to illustrate some of the value of
using a Schema even if that Schema is not prescriptive.
Finally, some discussion of graphic design and artistic composition has
been applied to visualization and diagram techniques; this is a subject
for which there are many examples and few tutorials.
The author hopes this paper will encourage readers to explore visual
representations for themselves, and also that the paper will help readers
to explain some of the benefits of using XML to other people.
References
[XML 1998] Bray, Tim, Paoli, Jean,
Sperberg-McQueen, C. M., Maler, Eve and Yergeau, François, Extensible Markup Language (XML) 1.0 (Fifth
Edition), W3C, 1998; the latest version is always online at
http://www.w3.org/TR/REC-xml/
[Cairo 2013] Cairo, Alberto,
The Functional Art: An introduction to information
graphics and visualization, New Riders, 2013. A useful book
with significant amounts of discussion and examples used to illustrate
points rather than being chosen primarily (or only) for aesthetic
reasons.
[Lima 2013] Lima, Manuel, The Book of Trees: Visualizing Branches of
Knowledge, Princeton Architectural Press, New York, 2013.
Includes both an historical perspective and clear descriptions of a number
of ways of displaying trees based on a simple category
system.
[Meirelles 2013] Meirelles,
Isabel, Design for Information, Rockport,
2013. Many examples and some principles,such as figure/ground, influenced
by graphic design.
[Owen 1987] Owen, Scott, ed.
HyperViz - Teaching Scientific Visualization Using
Hypermedia (A project of the ACM SIGGRAPH Education Committee,
the National Science Foundation (DUE-9752398), (DUE 9816443) and the
Hypermedia and Visualization Laboratory, Georgia State University); see
article Definitions and Rationale for Visualization, last
updated October 1999.
http://www.siggraph.org/education/materials/HyperVis/hypervis.htm.
[Robinson 1997] Robinson, Peter. What text really is not, and why
editors have to learn to swim, in Literary and Linguistic Computing,
Vol 24, No. 1 (2009). doi:https://doi.org/10.1093/llc/fqn030. Originally written in 1997.
[Thomas 2015] Thomas, Stephen A.,
Data Visualization with JavaScript, No Starch Press, 2015.
Although the coverate of trees and tree-like structures is limited, this book makes
few
assumptions about the background of the reader and is particularly helpful for those
people
less confident with the JavaScript language.
[1] We will for the purpose of this paper ignore details of file
system block allocation, mainframe fixed record padding,
cross-system byte size variations, and many other technical
details, and will consider a computer file to be a sequence of
bytes, each of which has a numeric value in the range 0..255
inclusive, while recognizing that this is itself an
abstraction.
[2] Strictly speaking the sequence can use a non-Unicode encoding,
but the resulting XML document, at least conceptually, is made
from a sequence of Unicode characters.
[3] A more efficient approach (in both space and time) is to use a
finger-tree, but that is wandering outside the scope of this
paper.
[4] apart from the fact the author didn’t include them in the pictures!
[5] Version 3 of MicroStar Inc.’s Near and Far Designer included both
SGML and XML DTD support.
Bray, Tim, Paoli, Jean,
Sperberg-McQueen, C. M., Maler, Eve and Yergeau, François, Extensible Markup Language (XML) 1.0 (Fifth
Edition), W3C, 1998; the latest version is always online at
http://www.w3.org/TR/REC-xml/
Cairo, Alberto,
The Functional Art: An introduction to information
graphics and visualization, New Riders, 2013. A useful book
with significant amounts of discussion and examples used to illustrate
points rather than being chosen primarily (or only) for aesthetic
reasons.
Lima, Manuel, The Book of Trees: Visualizing Branches of
Knowledge, Princeton Architectural Press, New York, 2013.
Includes both an historical perspective and clear descriptions of a number
of ways of displaying trees based on a simple category
system.
Owen, Scott, ed.
HyperViz - Teaching Scientific Visualization Using
Hypermedia (A project of the ACM SIGGRAPH Education Committee,
the National Science Foundation (DUE-9752398), (DUE 9816443) and the
Hypermedia and Visualization Laboratory, Georgia State University); see
article Definitions and Rationale for Visualization, last
updated October 1999.
http://www.siggraph.org/education/materials/HyperVis/hypervis.htm.
Robinson, Peter. What text really is not, and why
editors have to learn to swim, in Literary and Linguistic Computing,
Vol 24, No. 1 (2009). doi:https://doi.org/10.1093/llc/fqn030. Originally written in 1997.
Thomas, Stephen A.,
Data Visualization with JavaScript, No Starch Press, 2015.
Although the coverate of trees and tree-like structures is limited, this book makes
few
assumptions about the background of the reader and is particularly helpful for those
people
less confident with the JavaScript language.