|
Balisage 2013 Program
Tuesday, August 6, 2013
|
Tuesday 9:15 am - 9:45 am The semantics of
“semantic”
B. Tommie Usdin,
Mulberry Technologies There was a time when I knew what the word
“semantic” meant. That was a long time ago. Since then many people, on many
occasions, in many contexts, have corrected my misunderstanding of the meaning of
semantic. Perhaps it means nothing, or everything. Or perhaps I’m simply
misinformed.
|
Tuesday 9:45 am - 10:30 am icXML: Accelerating a commercial XML parser
using SIMD and multicore technologies
Nigel Medforth,
International Characters, Inc., and Simon Fraser University; Dan Lin, Simon Fraser University; Kenneth Herdy, Simon
Fraser University; Rob
Cameron, Simon Fraser University and International Characters, Inc.; &
Arrvindh
Shriraman, Simon Fraser University Earlier research prototypes
have shown how SIMD (single-instruction, multiple-data) instructions and multi-core
parallelism can accelerate XML processing. We have tested how well those techniques work
when fully integrated into a commercial XML parser. The Apache Software Foundation’s
Xerces C++ parser was restructured for SIMD and multi-core parallelism, while retaining the
existing application programming interface unchanged. SIMD techniques alone produced a 50%
increase in parsing speed; when pipeline parallelism on dual core processors was added,
improvements of 2x and beyond were realized.
|
Tuesday 11:00 am - 11:45 am Markup to generate markup to generate
markup
Peter Flynn, University College Cork & Silmaril
Consultants LaTeX relies for its extensibility on a library of over four
thousand packages and document classes, which provide additional markup, additional
layouts, and variant behavior. The ltxdoc class
supplies features for maintaining these packages and classes in a literate programming
style: code (with comments) and full end-user documentation derive from a single source.
The syntax of ltxdoc, however, is complex, as
documentation must be shielded from interpretation as code, and vice versa. The system
presented here is an experiment in using XML (specifically DocBook 5) to mark up and
maintain LaTeX classes and packages. XSLT 2 stylesheets generate the .dtx and .ins
distribution files expected by end users. There are numerous benefits in automation and
reusability of code, a number of areas where a customization layer for DocBook would be
useful, and a few unresolved restrictions that package and class authors and maintainers
would need to keep in mind when editing XML.
|
Tuesday 11:45 am - 12:30 pm The FtanML markup language
Michael Kay, Saxonica What if we
could reinvent markup anew? What if XML and JSON were ‘the ones we built to throw
away’ and we could start over to build our markup? Can we imagine what the world
would be like if we didn’t have to design with an eye to the past? Michael Kay and
Stephanie Haupt explored these questions with some advanced students at a Swiss summer
program in 2012. That summer is over, but the exploration continues. FtanML is a rethink of
markup from the ground up, with an associated schema language FtanGram and a
query/transformation language FtanSkrit. What can this more-than-thought experiment teach
us all?
|
Tuesday 2:00 pm - 2:45 pm First Person: The
allure of Gothic markup
Simon St. Laurent, O’Reilly Media John Ruskin, markup theorist as well as art critic? Perhaps not, but we who practice document architecture can learn from his analysis of the Gothic way. The history of markup languages is littered with formalisms and schema languages designed to constrain and validate documents. Consider instead a markup community where documents don’t need to be restricted but are instead adaptable to customization for individual needs. The flexibility of Web tools, combined with implementation advice from architect Christopher Alexander, may serve some communities better than the rigidity of traditional markup approaches. Let’s try building Gothic cathedrals that allow for individual creativity rather than Brutalist apartment blocks of mass-produced documents.
|
Tuesday 2:45 pm - 3:30 pm Programming in XPath 3.0
Dimitre Novatchev, Intentional Software It’s a bird! It’s a plane! XPath sits at the intersection of XQuery and XSLT. It is also the expression language of a number of other XML technologies. It is not, however, a full-fledged programming language. Or is it? Historical limitations in XPath 1.0 and 2.0 may have made them a weak substitute for a full-fledged programming language, but what about XPath 3.0? XPath 3.0 is the ideal language to write function libraries designed to be used both in XSLT and XQuery. Possessing variables, inline functions, closures, a simple mapping operator, strong typing, robust recursion, and the ability to create nodes, XPath 3.0 can truly sport an “S” on its chest.
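To make the claim concrete, here is a minimal sketch (not taken from the paper) of XPath 3.0 behaving like a small programming language, with a let clause, an inline function, recursion obtained by passing the function to itself, and the new simple mapping operator:

    (: factorial, written entirely in XPath 3.0 :)
    let $fact := function($f, $n) {
        if ($n le 1) then 1 else $n * $f($f, $n - 1)
    }
    return $fact($fact, 5)    (: yields 120 :)

    (: the simple mapping operator :)
    (1 to 5) ! (. * .)        (: yields 1 4 9 16 25 :)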
|
Tuesday 4:00 pm - 4:45 pm (LB) Marking up changes to ISO standards: A case study
Tristan Mitchell,
Nigel Whitaker, both of DeltaXML The ISO Standards Tag Set (ISOSTS) is a customization of NISO's Journal Article Tag Suite (JATS) developed for the International Organization for Standardization (ISO) for authoring standards documents. As part of its authoring workflow, ISO needs to produce redline publications of a document in order to show changes between versions of a standard. Alongside Typefi, who provided the functionality for publishing the marked-up XML to PDF with change bars, we provided our XML comparison toolset to detect and mark the changes as required.
The issues we faced while completing this work include representation of changes in the XML, comparison of tables, ignoring text formatting changes, and the use of processing instructions. We discuss the pros and cons of various format design decisions that affect how well comparison works.
|
Tuesday 4:00 pm - 4:45 pm Hypermedia services, loosely coupled
Jonathan Robie,
Rob Cavicchio,
Rémon Sinnema, &
Erik Wilde, all of EMC Your
service has endpoints, my service has endpoints, but how do we communicate? We can document
in great detail all the URIs and parameters involved in our services; this approach treats
them like function signatures and invites the construction of tightly coupled services. But
depending on out-of-band information to drive interaction runs against the grain of truly
RESTful services. Instead, the interaction between services should be expressed by
documenting the processing rules for the media types of the representations exchanged. By
moving to a new description language we can make loosely coupled web services a thing of
the present!
|
Tuesday 4:45 pm - 5:30 pm (LB) What it is vs. how we shall: complementary agendas for data models and architectures
David Dubin, Megan Senseney, & Jacob Jett, all of the University of Illinois
Data models play two kinds of role in
information interchange. Descriptively, they offer domain models. Prescriptively, they propose plans of action. While stipulative definitions fill in a model's representational gaps, elegance and descriptive power reduce the need for arbitrary choices in standards and specifications. Proposals for modeling digital annotation serve to illustrate competing representational and cohortative agendas.
|
Tuesday 4:45 pm - 5:30 pm My document object model can do more than
yours
Alain Couthures, AgenceXML Document
object models, specifically the browser DOM, were designed to represent HTML and XML
documents. Languages such as XPath were designed to access and traverse the DOM of HTML and
XML documents. But suppose we wanted to bring the power and convenience of XML technologies
like XPath to new data types. Could we extend the DOM to support CSV files? JSON? ZIP
files? Yes we can! This paper explores a number of ways in which the DOM
can be made to do more. We can loosen restrictions, describe new sequence types, and even
define new XPath axes to make the DOM better and more useful.
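As an illustration of the kind of mapping involved (the element names here are hypothetical, not the paper's), a two-line CSV file such as

    name,age
    Ada,36

might be exposed through the DOM as

    <csv>
      <row><name>Ada</name><age>36</age></row>
    </csv>

so that an ordinary XPath expression like //row[age > 30]/name works over it.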
|
Wednesday, August 7, 2013
|
Wednesday 9:00 am - 9:45 am Processing XForms in HTML5-enabled
browsers
Tobias Niedl &
Anne Brüggemann-Klein, both of
Technische Universität München Forms technology for the World Wide Web has developed along two lines. The XForms
strain has worked for a cleaner separation of concerns and supports more complex
bindings between user interface and data. The HTML strain has focused on the user
interface, defining new widgets and in HTML5 adding type definitions to form
elements to enable native in-form validation. Some XForms implementations translate
XForms elements into HTML widgets plus executable code. But HTML5 also defines new
JavaScript APIs that browsers should support. The new facilities of HTML5-enabled
browsers can be used to support XForms near-natively. We explain how.
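As a rough illustration of the two strains (not taken from the paper), compare native HTML5 validation, where the constraint is a property of the widget, with an XForms binding, where the constraint lives in the model:

    <!-- HTML5: the widget carries the constraint -->
    <input type="email" required="required"/>

    <!-- XForms: the constraint is bound to the data in the model -->
    <xf:bind nodeset="contact/email" required="true()"
             constraint="contains(., '@')"/>
    <xf:input ref="contact/email">
      <xf:label>Email</xf:label>
    </xf:input>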
|
Wednesday 9:45 am - 10:30 am The case for authoring and producing books
in (X)HTML5
Sanders Kleinfeld, O’Reilly Media Publishers find themselves caught between the steep learning curves and
rigorous validation of production systems like DocBook and the anything-goes approach of
the commercial word processors used by many of their authors. Can publishing requirements,
particularly for electronic output, be met by simple online tools that are able to produce
structured output without punishing authors? The resilience of Web browsers suggests that
an HTML-based solution might be promising. A proposal for an HTMLBook standard applies new
rules and semantics to (X)HTML5 to create a format that is easy to edit while also being
ready to produce output.
|
Wednesday 11:00 am - 11:45 am Semantic profiling using indirection
Ari
Nordström, Condesign Profiling is a publishing technique in which
portions of the content are identified as relevant to various conditions. Publications are
created by selecting the appropriate portions. In XML this is often implemented by
marking nodes using attribute values as filtering conditions. When publishing, the nodes
are only included if the publishing conditions match the publishing context. The profiles
are sometimes also used as the basis for text generation. While useful, these techniques
have a number of problems. For example, if the attribute values need to change, any “live” legacy documentation usually has to be converted to the new values, and the schema, stylesheets, and other resources must be updated; supporting the old and new profiles side by side is not possible. An abstraction or indirection layer solves this. The profile values
are not used directly; instead they represent a specific “semantic profile”.
The abstraction layer can be expressed using URNs that are matched to human-readable values
when required.
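A minimal sketch of the idea (attribute names and URN values invented for illustration, not taken from the paper):

    <!-- Direct profiling: the attribute value is both the filter and its meaning -->
    <para product="model-a">Applies only to Model A.</para>

    <!-- Indirect profiling: the attribute names a semantic profile by URN;
         a separate mapping resolves the URN to human-readable values, so labels
         can change without touching the documents or the schema -->
    <para profile="urn:x-example:profile:product:model-a">Applies only to Model A.</para>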
|
Wednesday 11:45 am - 12:30 pm The XML info space
Hans-Jürgen Rennau, Traveltainment XML-related standards imply an architecture of distributed information which integrates all accessible XML resources into a coherent whole. Attempting to capture the key properties of this architecture, the concept of an info space is defined. The concept is used as a tool for deriving desirable extensions of the standards. The proposed extensions aim at a fuller realization of the potential offered by the architecture. Main aspects are better support for resource discovery and the integration of non-XML resources. If not adopted by standards, the extensions may also be emulated by application-level design patterns and product-specific features. Knowledge of them might therefore be of immediate interest to application developers and product designers.
|
Wednesday 1:15 pm - 2:00 pm Balisage Bluff: An Entertainment
Balisage Attendees
Balisage Bluff: markup-truth may be stranger than fiction! Participants will listen to short stories that
involve markup, Montreal, or have some other connection to the conference. The
audience will be challenged to identify which stories are true (or close to it) and which are mostly fabricated.
Do you have a story to tell? Stories will be limited to 2 minutes, but even so there are a lot of Balisageurs with great tales to tell. Volunteer by sending email to info@balisage.net, or by talking with Lynne Price, gamemaster, on site in Montreal. If there are more than ten volunteers by July 15, ten will be randomly selected. If there is more time in the actual session, volunteers will be recruited from the audience.
|
Wednesday 2:00 pm - 2:45 pm First Person: Where
did all the document kids go?
Matt Patterson, Constituent Parts When I began developing with XML technologies, there were a multitude of toolkits and
implementations of XML parsers, multiple DOM (and DOM-like) implementations outside web
browsers, and XSLT implementations everywhere. My current development environment seems
impoverished in comparison. What happened? The population of web development tools, by
contrast, has grown by leaps and bounds. Why is the one ecosystem contracting and the other
growing? One should never underestimate the power of making things more accessible to the
casual user.
|
Wednesday 2:45 pm - 3:30 pm Invisible XML
Steven Pemberton, CWI, Amsterdam What if you could see everything as XML? XML has many strengths for data exchange,
strengths both inherent in the nature of XML markup and strengths that derive from the
ubiquity of tools that can process XML. For authoring, however, other forms are preferred:
no one writes CSS or JavaScript in XML. It does not follow, however, that there is no value
in representing such information in XML. Invisible XML is a method for treating non-XML
documents as if they were XML, enabling authors to write in a format they prefer while
providing XML for processes that are more effective with XML content. There is really no
reason why XML cannot be more ubiquitous than it is.
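By way of illustration (the element names are hypothetical and would in practice be determined by the grammar supplied), a CSS rule that an author writes as

    body { color: blue }

could be presented to an XML pipeline as

    <css>
      <rule>
        <selector>body</selector>
        <declaration>
          <property>color</property>
          <value>blue</value>
        </declaration>
      </rule>
    </css>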
|
Wednesday 4:00 pm - 4:45 pm Collecting and curating slide sets
Alan Bilansky, University of Illinois at Urbana-Champaign Enormous numbers of presentations are created in
PowerPoint, Open Office, KeyNote, and similar slideware every day. These slide decks are
emailed, posted on the web, shared, and stored on filesystems throughout industry and
academia. And yet, unlike many other cultural artifacts, we have no systematic way to
archive and curate them. What are the semiotics of slides? Should we be archiving and
curating the various aspects of slide decks: graphics, diagrams, photographs, and words,
and the transitions between them? If so, how should this information be captured? The time
has come! Save the decks!
|
Wednesday 4:00 pm - 4:45 pm Using XSD import, include, and redefine in
the MailXML logistics system
Dianne Kennedy, IDEAlliance The
United States Postal Service needs flexible means for exchanging messages with its logistics
clients. MailXML, developed in collaboration with IDEAlliance, has evolved into a flexible
suite of XSD schema modules. These modules have been constructed to exploit XSD's
facilities for redefinition of components. The framework, in which new modules can be
prototyped and added to the system without disrupting current services, has potential uses
beyond its original application.
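A minimal sketch of the XSD mechanism involved (module and type names invented for illustration, not taken from MailXML):

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <!-- xs:redefine pulls in a base module and replaces one of its
           components with an extension of itself -->
      <xs:redefine schemaLocation="mailxml-base.xsd">
        <xs:complexType name="AddressType">
          <xs:complexContent>
            <xs:extension base="AddressType">
              <xs:sequence>
                <xs:element name="CarrierRoute" type="xs:string" minOccurs="0"/>
              </xs:sequence>
            </xs:extension>
          </xs:complexContent>
        </xs:complexType>
      </xs:redefine>
    </xs:schema>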
|
Wednesday 4:45 pm - 5:30 pm Igel: Comparing document grammars using XQuery
C. M. Sperberg-McQueen, Black Mesa Technologies, and
Oliver Schonefeld, Marc Kupietz,
Harald Lüngen, &
Andreas Witt, all of
the Institut für Deutsche Sprache (IDS)
Igel is a small XQuery-based web application for examining a
collection of document grammars; in particular, for comparing related
document grammars to get a better overview of their differences and
similarities. In its initial form, Igel reads only DTDs and provides
only simple lists of constructs in them (elements, attributes,
notations, parameter entities). Our continuing work is aimed at making
Igel provide more sophisticated and useful information about document
grammars and building the application into a useful tool for the
analysis (and the maintenance!) of families of related document
grammars.
|
Wednesday 4:45 pm - 5:30 pm (LB) A data-driven approach using XForms for building a web forms design framework
Stephen Cameron, Collinta &
William Velasquez, visiontecnologica.com
In a project to build a web-based auto-generated forms framework, we needed to decide whether to use XForms for the 'designer' user-interface as well as for the generated forms. In trying to make this decision it became apparent that, at a fundamental level, there are two distinctly different means to develop web-based interfaces. These two means, or approaches, can be described as 'data-driven' or 'behavioural'. We suggest that the Model-View-Controller (MVC) design 'pattern', which is now becoming popular as terminology for describing the basis of several JavaScript web development frameworks, is of limited practical usefulness as it encompasses too many variants. In contrast, the distinction between 'data-driven' and 'behavioural' approaches seems to be more useful. In particular, it provides clarity in distinguishing the respective benefits of using 'XML technologies' (particularly XPath) versus other object-based alternatives for web application development. This distinction is illustrated using working examples from this ongoing project. Some implications, such as the role of schema documents in the data-driven approach, the practicality of writing XML 'as code', and issues encountered with the 'XRX' architecture are also discussed.
|
Thursday, August 8, 2013
|
Thursday 9:00 am - 9:45 am Indexing queries in Lux
Michael Sokolov, Safari Books Online Query optimizers often mystify database users: sometimes queries run quickly
and sometimes they don’t. An intuitive grasp of what will work well in an optimizer
is often gained only after trial, error, inductive logic (i.e. educated guessing), and
sometimes propitiatory sacrifice. This paper tries to lift the veil by describing work on
Lux, a new indexed XQuery search engine built using Saxon and Lucene, which is freely
available under an open-source license. Lux optimizes queries by rewriting them as
equivalent (but usually faster) indexed queries, so its results are easier for a user to
understand than the abstract query plans produced by some optimizers. Lucene-based QName
and path indexes prove useful in speeding up XQuery execution by Saxon.
|
Thursday 9:00 am - 9:45 am An extensible API for documents with multiple annotation layers
Nils Diewald & Maik
Stührenberg, both of Universität Bielefeld XML
namespaces and standoff annotation are promising approaches to tackle overlapping multiple
annotation layers in XML instances. But the creation and processing of standoff instances
can be cumbersome, especially when the underlying text is modified after an annotation has
been added. We present a powerful API capable of dealing with these tasks. It provides an
extension mechanism to allow for the easy creation of modules corresponding to a certain
namespace (and therefore markup language). As a working example, we use XStandoff (which combines standoff notation with a GODDAG-like data model), since it is a standoff
format highly dependent on XML namespaces.
|
Thursday 9:45 am - 10:30 am (LB)
Sociology, history and overview of Rights Metadata Standards
Linda Burman, L. A. Burman Associates
Rights metadata has recently become a very hot topic. While rights management (copyright law) has been discussed and debated for many years, the term “rights management” has many different meanings to people in different roles and is applied to a wide variety of behaviors, workflows and systems. In publishing, rights have typically been the domain of legal staff and documented in paper contracts. Unsurprisingly, ‘uptake’ of rights metadata standards has been slow.
However, now that digital asset and/or content management systems are becoming ubiquitous, users have immediate access to their digital assets and want to know what content they can repurpose and under what restrictions (platform, media type, distribution channel, geography, and so on) without having to phone a permissions manager for every question. Suddenly rights metadata vocabularies are becoming extremely useful. Unfortunately, most companies are still unaware that rights metadata standards such as PRISM Usage Rights (PUR), PLUS (UsePlus), and ODRL already exist. This overview of the existing standards will emphasize each one’s point of view: the type of content and company each is best suited for and being used by, and how to learn more about each standard.
|
Thursday 9:45 am - 10:30 am Modeling overlapping structures: Graphs and
serializability
Yves Marcoux, Université de Montréal;
C.M. Sperberg-McQueen, Black Mesa Technologies; &
Claus Huitfeldt, University of Bergen, Norway Modeling overlapping structures (e.g. verse and line structures in poetry) in
an XML environment, which represents only cleanly embedded structures, is a familiar
problem. Proposals to address this problem include XML solutions (based essentially on a
layer of semantics) and non-XML ones such as TexMecs, a markup language that allows overlap
(and other features). Overlap-only TexMecs documents have been shown to correspond to
completion-acyclic node-ordered directed acyclic graphs. Elaborating on that result, we
cast it in the setting of a strictly larger class of graphs, child-ordered directed graphs
(CODGs), that includes multi-graphs and non-acyclic graphs, and show that — somewhat
surprisingly — it does not hold in general for graphs with multiple roots. Second,
we formulate a stronger condition, full-completion-acyclicity, that guarantees
correspondence with an overlap-only document, even for graphs that have multiple roots.
Full-completion-acyclicity can be checked with polynomial-time algorithms, which can
compute an oo-serialization of fully-completion-acyclic CODGs.
|
Thursday 11:00 am - 11:45 am (LB)
Could authors really write in XML one day?
Peter Flynn, University College Cork & Silmaril
Consultants
The learning curve for non-markup-expert authors to start writing and
editing structured documents in XML is steep, and there are some
specific barriers to the acceptance of editor interfaces. In
exploring the reasons behind these barriers, we identified some changes
that could be made to common interfaces to improve acceptability.
This paper presents the results of usability tests on the modifications,
and suggests how some aspects of structured editing software could be
adapted to extend their use into additional areas and markets.
|
Thursday 11:00 am - 11:45 am Poio API and GraF-XML
Jonathan Blumtritt, University of Cologne; Peter Bouda, Centro Interdisciplinar de
Documentação Linguística e Social; & Felix
Rau, University of Cologne Language documentation projects all over the world have accumulated a large and heterogeneous corpus of linguistic material. Because of its diversity, access to and analysis of the components is difficult, particularly for multimedia instances. The "Graph Annotation Framework" (GrAF), a standoff annotation method, is applied to utterance examples in time-aligned annotations of video samples. An easy-to-use programming interface defined in the Poio API, a project within the CLARIN framework ("Common Language Resources and Technology Infrastructure"), then greatly simplifies access without the need to deal with multiple input formats in the source material. GrAF-XML provides a basis for exchanging results among the various projects that analyze the corpus.
|
Thursday 11:45 am - 12:30 pm Where are all the bugs? Introspection in
XQuery
Mary Holstege, MarkLogic In a large
and complex code base, it is infeasible to develop tests manually for every feature and
every combination of features. The key to quality assurance in this context is automation
and focus. Using XQuery introspection, one can examine a large XQuery code base to find
smart places to focus testing. Based on the
proposition that the set of functions, and the sequence types of parameters used by those
functions, constitute vocabularies following classic Zipf distributions, we show that
TF-IDF scoring over the terms in those vocabularies identifies areas of potential testing
interest.
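The scoring itself is the textbook formula. A small XQuery 3.0 sketch (the element structure is invented for illustration: assume each function in the code base has been reduced to a fun element listing the term values it uses; this is not the author's code):

    declare namespace math = "http://www.w3.org/2005/xpath-functions/math";

    (: term frequency within one function, times inverse document
       frequency across the whole code base :)
    declare function local:tf-idf($term as xs:string,
                                  $fn   as element(fun),
                                  $all  as element(fun)*) as xs:double {
      let $tf := count($fn/term[. = $term]) div count($fn/term)
      let $df := count($all[term = $term])
      return $tf * math:log(count($all) div $df)
    };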
|
Thursday 2:00 pm - 2:45 pm
Andreas Tai, Institut für
Rundfunktechnik, München A clash of the “web culture”
with the “XML culture” has resulted in a divergence in standards development
for timed text and the more general domain of subtitling. Timed Text Markup Language
(TTML), released by the W3C in 2010, has been rejected by the WHATWG in favor of the
text-based format WebVTT. The broadcast community prefers the XML-based TTML, but the
roll-your-own faction among Web developers has pushed for WebVTT. Is this a symptom of a
divergence of the Web world from the world of markup that gave it birth? Stay tuned for
more!
|
Thursday 2:45 pm - 3:30 pm
(LB)
Interactive XSLT in the browser
O'Neil Delpratt & Michael Kay, both of Saxonica Remember the dream of being able to process XML in the browser to write richly interactive applications? It's taken a long time coming, and a lot of people have given up waiting, but it is now a reality. With the open-source Saxon-CE engine, you can now write highly interactive applications in the browser to process XML content, without writing a single line of Javascript. As a bonus, you get all the benefits of XSLT 2.0. During this talk we will demonstrate what can be achieved. And because Balisage audiences are interested in the theory as well as the practice, we'll also touch on some of the underlying concepts: how does one use a purely functional language to manipulate a stateful interactive dialogue with the user?
|
Thursday 4:00 pm - 4:45 pm The New W3C Publishing Activity
Liam Quin, W3C
The W3C is engaging publishers, and the people and organizations who provide tools for them,
in an effort to change the Web so that it is suitable for publishing.
The Open Web Platform is changing the way people work: proprietary
desktop tools are being replaced by Web-based applications, and ebooks are forcing publishers
to come to terms with producing multiple output formats from their assets, so that "XML Early"
and "XML First" are hot buzzwords in the industry. The EPUB3 format, defined by the
IDPF, uses XHTML and CSS, both W3C Web technologies. The Open Web Platform does not yet meet
the needs of publishers, so the W3C is working more closely with the IDPF, with publishers and
designers, and with others to change that.
Technical work on CSS has already begun, and the W3C is looking at internationalization, HTML, metadata, and workflow.
|
Thursday 4:45 pm - 5:30 pm Transcending triples in semantic modeling
Micah Dubinko, MarkLogic
Can't documents and triples all just get along? Proponents of RDF semantic modeling often see the world through triples-shaped glasses. If they can't do it in triples, they don't know what to do: reification in particular has terrifying implications. Broadening a view of inference to transcend triples might overcome the constraints of triple-vision and point the way toward future solutions.
|
Friday, August 9, 2013
|
Friday 9:00 am - 9:45 am (LB)
General Architecture for Generation of Slide Presentations, including PowerPoint, from arbitrary XML Documents
Eliot Kimber, Contrext
PowerPoint slide decks are often required for training content authored in XML.
Until recently, generating these decks automatically from the XML source was difficult for many users.
With the development of the Apache POI library, it is now possible to
reliably generate PowerPoint documents with a minimum of implementation
effort. This paper presents a general architecture for generating slide
presentations of any format from XML of any sort through the use of an
intermediate format that abstracts the general structure of PowerPoint-type
presentations. This general architecture allows the same source to
produce PowerPoint, Slidy, PDF, or any other
presentation-optimized format. The paper focuses on the specific challenge of
producing PowerPoint using this architecture.
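The heart of the approach is the intermediate format. A purely hypothetical sketch of such a presentation abstraction (element names invented for illustration):

    <slideset>
      <title>Course overview</title>
      <slide>
        <title>Why XML?</title>
        <bullets>
          <bullet>Single source, many outputs</bullet>
          <bullet>Validation against a schema</bullet>
        </bullets>
        <notes>Speaker notes travel with the slide.</notes>
      </slide>
    </slideset>

A transform from the source vocabulary targets this abstraction; a second, format-specific step turns it into PowerPoint, Slidy, or PDF.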
|
Friday 9:45 am - 10:30 am What, when, where? Spatial and temporal
annotations with XStandoff
Maik Stührenberg, Universität Bielefeld
When annotating non-textual sources such as maps, images, and motion pictures, common practice employs standoff markup. XStandoff has been successfully used to annotate multiple hierarchies largely based on textual primary data. We have now extended it to support both spatial annotations (for images and maps) and spatial-temporal annotations (for video files). Practical applications might range from simple image feature extraction to something as complex and dynamic as representing three-dimensional eye-tracking data.
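As a rough illustration of the target (the markup is invented for this description, not actual XStandoff syntax): an annotation that points at a region of an image, and one that points at a span of time in a video, rather than at character offsets in a text:

    <annotation target="map.png" shape="rect" coords="120,80 310,195">
      <label>harbour</label>
    </annotation>
    <annotation target="interview.mp4" begin="00:01:12.0" end="00:01:18.5">
      <label>pointing gesture</label>
    </annotation>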
|
Friday 9:45 am - 10:30 am Fat Markup: Trimming the myth one calorie at
a time
David Lee, MarkLogic JSON is lean
and XML is fat — or so say some factions in the online community. Does this hold up
in the real world? Tests of a corpus of several dozen varied documents, using a variety of
browsers on many operating systems show that care in markup design and choice in processing
methods (for example, direct JavaScript vs. jQuery) may have more effect on speed and
throughput than the actual markup language chosen. The myth that XML is fatter than JSON
may belong in the same category as an assertion that the “&lt;” and
“&gt;” characters are larger than the “{” and “}”
characters due to their excessive pointiness.
|
Friday 9:45 am - 10:30 am Some assembly required: XML semantics,
digital preservation, and the construction of knowledge
Jerome McDonough,
University of Illinois at Urbana-Champaign If we think of meaning as emerging primarily from the
interaction between a text, its markup, and a reader, we may be missing other influences on
meaning that operate at a larger scale than a single document or even a collection. For the
past six years, the “Preserving Virtual Worlds” projects have been
investigating the preservation of computer games and interactive fiction. For computer
games, identifying and collecting information is not simply an issue of documenting a
particular file format; it becomes an exercise in knowledge representation and management.
If highly complex, multimedia objects such as game software are going to survive in the
long term, archivists will need to collect, organize, and preserve not just the objects that
comprise the game, but a large body of information necessary to interpret those objects.
Somehow, they will need to preserve people’s ability to understand the game.
|
Friday 11:00 am - 11:45 am Decision making in XSL-FO formatting
Tony Graham, Mentea XSL-FO 1.0 and
1.1 share a very linear processing model that makes it difficult to use the results of one
formatting task to make layout decisions in other formatting tasks. But in real
composition, good layout often requires the ability to postpone decisions for best fit,
particularly when positioning tables and graphics. In the past, multi-pass workarounds have
been necessary to allow decisions to depend on the sizes of objects in the formatted
output. The requirements for XSL-FO 2.0 included many of the necessary abilities, but the
Working Group’s charter expired before XSL-FO 2.0 was completed, leaving the
specification unfinished. The Print and Page Layout Community Group at the W3C is working
on innovative solutions to many of the delayed decision-making problems.
|
Friday 11:00 am - 11:45 am Musical variants: Encoding, analysis and visualization
Johannes Kepper, University of Paderborn, Germany; Perry Roland, University of Virginia Library; &
Daniel Röwenstrunk, University of Paderborn,
Germany Variation in music, like variation in texts, reflects the history
of the cultural artifact. We propose a model for the encoding of variance in music that is
based on traditional models and implemented using the data framework offered by the Music
Encoding Initiative (MEI). Our model can be used to identify a portion of the musical text
that varies among different sources, to identify the relations between sources (with their
directionalities), and to illustrate the relationships between the encoded sources. The
model is aligned with the Functional Requirements for Bibliographic Records (FRBR) and can
be used to provide an overview of multiple variant sources and to inform an editor’s
interpretation of the overall connections. The Freischütz Digital project will
create a digital scholarly edition of Carl Maria von Weber’s opera Der Freischütz based on encodings of all relevant
sources.
|
Friday 11:45 am - 12:30 pm Markup and Canada’s national model
building codes
Brent Nordin
Canada’s Building Codes are large, typographically complex
documents that cover building and occupant safety for areas as
diverse as plumbing, fire, and energy, as well as buildings.
Over the years these documents have moved from typewriters, to
desktop publishing, to SGML and Dynatext, to Arbortext XML with
output on HTML and NXT CDs, and finally to the bright present. We have added a CMS to support life-cycle management of revisions
and integrated the CMS with our XML library. Our current output formats of
PDF and HTML show change tracking for revised material, offer
side-by-side rendering in French and English, and more. In this case
study, I share the solutions, tools, frustrations, and
triumphs.
|
Friday 11:45 am - 12:30 pm (LB)
Transforming schemas: architectural forms for the 21st Century
John Cowan, LexisNexis
XML documents are typically transformed in three steps: validate,
transform, validate. Architectural forms, a feature of the SGML-based
hypermedia standard HyTime, use a combination of enhancements to DTDs
and annotations in source documents to allow a two-step pipeline, whereby
an SGML document could be automatically transformed using a specialized
SGML parser, called an architectural engine, into another SGML document
valid against a more general DTD known as the meta-DTD. This permitted
document creators to conform to a general document architecture without
having to constrain their own documents to every detail of a specific
schema. Unfortunately, DTDs have not seen wide uptake in the XML
world, and the few XML architectural engines that have been built have
conformed more to the letter than to the spirit of architectural forms.
Instead, the emphasis has been on the creation of comprehensive and
complex schemas which attempt simultaneously to serve local needs and
the needs of interchange. This work is an attempt to provide a modern
architectural forms engine for documents described using the Examplotron
schema language.
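The essential mechanism, sketched with invented names: each element in the source document carries an architectural attribute naming the meta-DTD element type it maps to, and the engine performs the renaming:

    <!-- source document: local vocabulary, architectural attribute "arch" -->
    <quarterly-report arch="article">
      <heading arch="title">Q2 results</heading>
      <body arch="section">Revenue was flat.</body>
    </quarterly-report>

    <!-- after the architectural engine runs: valid against the meta-DTD -->
    <article>
      <title>Q2 results</title>
      <section>Revenue was flat.</section>
    </article>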
|
Friday 12:30 pm - 1:15 pm Climbing the hill
C. M. Sperberg-McQueen, Black Mesa Technologies Notes on making things better and on getting from here to where we want to
be.
|
There is nothing so practical as a good theory
|