How to cite this paper
Jordanous, Anna, Alan Stanley and Charlotte Tupman. “Contemporary transformation of ancient documents for recording and retrieving maximum
information: when one form of markup is not enough.” Presented at Balisage: The Markup Conference 2012, Montréal, Canada, August 7 - 10, 2012. In Proceedings of Balisage: The Markup Conference 2012. Balisage Series on Markup Technologies, vol. 8 (2012). https://doi.org/10.4242/BalisageVol8.Jordanous01.
Balisage: The Markup Conference 2012
August 7 - 10, 2012
Balisage Paper: Contemporary transformation of ancient documents for recording and retrieving maximum
information: when one form of markup is not enough
Anna Jordanous
Centre for e-Research, King’s College London, UK
Alan Stanley
University of Prince Edward Island, Canada
Charlotte Tupman
Department of Digital Humanities, King’s College London, UK
Copyright © 2012 by the authors. Used with permission.
Abstract
This paper considers what we can gain from enhancing TEI-encoded texts with RDF. We
consider the use of Open Annotation Collaboration (OAC) annotations as part of our work for
the future. To illustrate our approach, we take as a case study the Sharing Ancient Wisdoms
(SAWS) project, which explores and analyses the tradition of wisdom literatures in ancient
Greek, Arabic and other languages. It aims to publish its texts digitally in a manner that
enables linking and comparisons within and between anthologies, their source texts, and the
texts that draw upon them.
Table of Contents
- Introduction
- The Sharing Ancient Wisdoms (SAWS) use case: sources and materials
- Identifying and extracting the required data for SAWS
- Background: Previous TEI and RDF combinatory approaches
- Automatic extraction of information from TEI documents
- Use case implementation: illustrating the SAWS usage of TEI and RDF
- Examples of transformations from TEI to RDF for the SAWS use case
- Resulting benefits for information exploration and retrieval in the SAWS project
- Evaluation of the SAWS implementation
- Future work
- Acknowledgements
Introduction
In this paper we primarily consider what we can gain from enhancing TEI-encoded texts
with
RDF, though there are other choices of re-representation which could also be profitable
in the
future. We consider the use of OAC annotations as part of our work for the future.
To
illustrate our approach, we take as a case study the Sharing Ancient Wisdoms (SAWS) project, which explores and analyses the tradition of wisdom literatures in
ancient Greek, Arabic and other languages. Our methods for representing semantic links
within
and between specific sections of these texts, and describing the relationships that
exist
between them in a systematic way, are documented and explained. We consider that this
approach
has the potential to be used widely to link and describe related sections of a variety
of
different types of texts. Given the common practice of publishing TEI documents as
part of
Digital Humanities research output, our central contribution is to demonstrate how
the
usefulness of these TEI documents can be developed further in diverse directions,
beyond their
current application for digital edition publication.
The Sharing Ancient Wisdoms (SAWS) use case: sources and materials
SAWS
is a key use case for this work, demonstrating a requirement for a markup approach
that encapsulates various types of information, including structural markup and semantic
annotation. The SAWS project aims to present its texts digitally in a manner that
enables
linking and comparisons within and between anthologies, their source texts, and the
texts that
draw upon them. We are also creating a framework through which other projects can
link their
own materials to these texts via the Semantic Web, thus providing a ‘hub’ for future
scholarship on these texts and in related areas. The project is funded by HERA (Humanities
in
the European Research Area) as part of a programme to investigate cultural dynamics
in Europe,
and is composed of teams at the Department of Digital Humanities and the Centre for
e-Research
at King's College London, The Newman Institute Uppsala in Sweden, and the University
of
Vienna.
Throughout antiquity and the Middle Ages, anthologies of extracts from larger texts
containing wise or useful sayings were created and circulated widely, as a practical
response
to the cost and inaccessibility of full texts in an age when these existed only in
manuscript form. SAWS focuses on gnomologia (also known as florilegia), which are manuscripts that
collected moral or social advice, and philosophical ideas, although the methods and
tools
developed are applicable to other manuscripts of an analogous form (e.g. medieval
scientific
or medical texts).
The key characteristics of these manuscripts are that they are collections of smaller
extracts of earlier works, and that, when new collections were created, they were
rarely
straightforward copies. Rather, sayings were selected from various manuscripts, reorganised
or
reordered, and subtly (or not so subtly) modified or reattributed. The genre also
crossed
linguistic barriers, in particular being translated from Greek into Arabic, and again
these
were rarely a matter of straightforward translations; they tend to be variations.
In later
centuries, these collections were translated into western European languages, and
their
significance is underlined by the fact that Caxton’s first imprint (the first book ever
printed in England) was one such collection. Thus the corpus of material can be regarded as a very complex directed network or
graph of manuscripts and individual sayings that are interrelated in a great variety
of ways,
an analysis of which can reveal a great deal about the dynamics of the cultures that
created
and used these texts.
Identifying and extracting the required data for SAWS
TEI traditionally excels in areas such as text structure definition and document metadata,
and although it possesses the means to identify and define semantic relationships
between
sections of text, none of these methods has, as far as we have been able to determine,
been
adopted widely or used as a standard mechanism for recording the nature of the relationship
between texts. For instance we could use <ref target="..."> to point to another section
of text, but we would need to modify the schema to require that @type should appear, allowing
us to insert a description of the relationship between these two sections. Another possibility
would be to use an <interp> element with an @xml:id attribute that contained the
required relationship: its @inst attribute could then be used to point to another section of
text. However, the insertion of an attribute detailing the source of the asserted relationship
(i.e. the person or bibliographic source responsible for making the assertion) is also vital
to SAWS: we need to be able to trace the scholarly source of that link. The <relation>
element, which is a recent addition to the TEI, provides us with the ability to include all of
the desired information within one element: the ID of the section of text being linked from;
the ID of the section of text being linked to; the nature of the relationship between the two
sections; and the identity of the source responsible for making the assertion. Our use of the
<relation> element is discussed fully below (see ‘Use case implementation: illustrating
the SAWS usage of TEI and RDF’), but it is worth noting in this introductory section the
important point that the use of <relation> allows us to enter RDF directly into the TEI
document (i.e. the triples we are defining about the sections of text and their relationships)
and to combine this with information about scholarly responsibility, all within one element.
This is particularly useful when the data is being entered by scholars who are familiar
with
TEI encoding and are marking up the rest of their documents in TEI, but who do not
have any
training in RDF. Being able to enter the RDF data directly into the TEI document means
that
they do not have to learn a second set of skills, while at the same time we can make
use of
the advantages of RDF (see below, ‘Resulting benefits for information exploration
and
retrieval in the SAWS project’).
These types of semantic relationships within and between texts are particularly important
to the understanding of how themes and ideas were transmitted between cultures, and
across
languages and time. As an example use case, in the SAWS project a key point of interest
to our
manuscript scholars is to represent relationships within and between different collections
of
wise or moral sayings, and to investigate how these collections have been referred
to, amended
and/or passed on from manuscript to manuscript. We want to record and visualise the
links
within and between these collections; from these collections to their source texts
(e.g.
Aristotle’s writings); and from these collections to their recipient texts (e.g. the
11th-century Strategikon of Kekaumenos, as well as later texts). Critically, we want
to do
this in a way which can be repeated by others, so that our collection of texts acts
as an
example and starting point for a larger enterprise taking this approach beyond our
project
alone.
At the moment, scholars of gnomologia and their related
texts tend to work from manuscripts and printed editions, and the links between the
texts they
are working on are recorded within commentaries and footnotes. Sometimes their editions
will
include studies of the relationships between specific manuscripts: for instance, a
discussion
of the transmission of a particular work through a number of different manuscripts.
What SAWS
will provide is the ability for scholars to investigate much more deeply the relationships
between specific sayings within those texts, and to follow those links through a number
of
different variants and languages. This is achieved by enabling identification and
annotation
of relationships by different scholars within a ‘hub’ that will provide visualisations
of
those relationships as well as direct links to the texts concerned (or in the case
of texts
that are not digitised, a URI for that text). Scholars interested in a particular saying or
set of sayings will immediately be able to see both that the saying is related to
sayings within other texts (each of these identifiers will be displayed to them, with a
clickable link to that text) and a description of the nature of the
relationships that have been identified. They will also be able to view who has asserted each
relationship, and can add their own assertions or notes as desired.
As an illustration of why this is important for textual scholars, consider this saying
from Gnomologium Vaticanum (no. 87):
Ὁ αὐτὸς ἐρωτηθεὶς τίνα μᾶλλον ἀγαπᾷ, Φίλιππον ἢ Ἀριστοτέλην, εἶπεν· “ὁμοίως ἀμφοτέρους·
ὁ
μὲν γάρ μοι τὸ ζῆν ἐχαρίσατο, ὁ δὲ τὸ καλῶς ζῆν ἐπαίδευσεν.”
Alexander, asked whom he loved more, Philip or Aristotle, said:
“Both equally, for one gave me the gift of life, the other taught me to live the virtuous
life.”
We can identify that this saying (i.e. section of text) exists in various forms in
earlier
works, and that there are relationships that can be defined between our first example
and
those below (and indeed between the various examples below):
Plutarch, Life of Alexander 8.4.1:
Ἀριστοτέλην δὲ θαυμάζων ἐν ἀρχῇ καὶ ἀγαπῶν οὐχ ἧττον, ὡς αὐτὸς ἔλεγε, τοῦ πατρός,
ὡς δι'
ἐκεῖνον μὲν ζῶν, διὰ τοῦτον δὲ καλῶς ζῶν ...
Alexander admired Aristotle at the start and loved him no less, as
he himself said, than his own father, since he had life through his father but the
virtuous
life through Aristotle …
Diogenes Laertius 5.19, Life of Aristotle:
Tῶν γονέων τοὺς παιδεύσαντας ἐντιμοτέρους εἶναι τῶν μόνον γεννησάντων· τοὺς μὲν γὰρ
τὸ
ζῆν, τοὺς δὲ τὸ καλῶς ζῆν παρασχέσθαι.
Aristotle said that educators are more to be honored than mere
begetters, for the latter offer life but the former offer the good life.
Pythagoras? Selections from the Sayings of the Four
Philosophers: (B) Pythagoras saying 18 (ed. Gutas):
وقال الآباء هم سبب الحياة والحكماء هم سبب صلاح الحياة
He said: Fathers are the cause of life, but philosophers are the
cause of the good life.
We can see clearly that these four sayings are related to one another in various ways,
but
that there are complexities between these texts that need to be described and documented
(and
ideally visualised) if we are going to be able to trace these relationships in a systematic
way.
In the last example above, we can see that the saying has been attributed to a different
author (Pythagoras), rather than being associated with Aristotle or his pupil Alexander:
alternative attributions are a common feature of this type of text, and they add another
layer
of complexity to the types of relationship that need to be defined.
In our TEI document, therefore, we need to be able to:
- insert links between these sections of text (which may or may not already be
  published digitally);
- make scholarly assertions in a systematic way about the nature of the (often
  complex) relationships between these texts.
In order to achieve these aims, we have chosen to enhance our TEI with RDF. RDF provides
an ideal way to store and manipulate our relationship data: each of the sayings can
be linked
to other relevant sections of text by means of a subject-predicate-object relationship
that is
defined as part of an ontology, which acts as an authority list. One of the main advantages
of
the ontology for the SAWS project is that it ensures consistency of description across
texts
that can vary greatly in their nature, but interestingly it has also acted as a means
of
stimulating scholarly discussion about the nature of the relationships and the ways
in which
they should be described. The textual scholars involved in the project have found
that the
necessity to be completely explicit about their decision-making processes and definitions
has
prompted them to identify, and describe concisely, new relationships that exist within
and
between their texts.
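The role of the ontology as an authority list can be illustrated with a minimal Python sketch. This is not the SAWS implementation: the three relationship names and the namespace are those that appear elsewhere in this paper, while the function name, the set used as a stand-in for the ontology, and the sample identifiers' use as plain strings are assumptions made for illustration.

```python
# Namespace as used in the paper's <relation> example.
SAWS_NS = "http://purl.org/saws/ontology#"

# Stand-in for the ontology's list of permitted relationships (assumption:
# the real ontology contains more terms than the three named in the paper).
ALLOWED_RELATIONS = {
    "isCloseRenderingOf",
    "isLooseTranslationOf",
    "isVerbatimOf",
}


def make_triple(subject, relation, obj):
    """Build a (subject, predicate, object) triple, rejecting unknown relations."""
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"{relation!r} is not in the controlled vocabulary")
    return (subject, SAWS_NS + relation, obj)


# Identifiers borrowed from the paper's later example.
triple = make_triple(
    "K._al-Haraka_ci_s5", "isCloseRenderingOf", "Proclus_ET_Prop.17_ci1"
)
print(triple)
```

Rejecting any predicate outside the list is what gives the consistency of description discussed above: a scholar cannot introduce an undocumented relationship term by accident.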
The way in which we are implementing the use of RDF within our TEI documents will
now be
described, and will be followed by specific examples from our SAWS texts to illustrate
how
this is being put into practice.
Background: Previous TEI and RDF combinatory approaches
We would like to be able to use RDF-like syntax to mark up information of semantic
interest, such as relations between texts and links to external entities, supported by a
relevant vocabulary. Whilst RDFa allows RDF to be directly encoded in markup documents, it has
been primarily deployed in XHTML documents to date. It would be desirable to extend the use
of RDF more widely, and particularly for our purposes (and those of others) to TEI XML
documents, without extensive changes being required to the variant of XML being used for the
source document or to the skills and workflow being used in the markup process. This last
point is of particular concern for non-technical users of TEI markup: an established and
growing community, not least given the increasing adoption of TEI by humanities scholars for
Digital Humanities research. Keeping structural, syntactical and semantic information in the
same document where possible also makes the process of markup simpler and less error-prone for
non-technical users who wish to mark up documents with their annotations, though it is
acknowledged that this is not always possible. To date, no method for accommodating TEI and
RDF in the same document has been adopted as standard by the TEI community, though several
approaches have recently been offered.
RDFTEF
is a Java-based tool for converting TEI files to a form which can incorporate and
output RDF/XML markup. Based around the Jena framework for semantic web applications, RDFTEF implements a basic ontology for representing structural and syntactical
elements and allows additional ontologies to be added as required. Though SPARQL queries
can
be fashioned to query the resulting RDF, these need to be relatively complex and standard
XML
tools cannot be deployed within the RDFTEF environment. RDFTEF has been criticised as ‘[o]nly a “toy” experiment’ for these limitations and due to its lack of ongoing maintenance (last source code
update 2007). Also, RDFTEF introduces a new stage of work to the existing editing
workflow and
requires extra software to be deployed for and learned by the users. Given the non-technical
nature of the target audience who will be marking up the documents with this semantic
information, this is a significant concern to the SAWS project and potentially hinders
the
adoption of our approach by our target users.
The issues for non-technical users also problematise other interesting approaches, in which
RDFa has been used to encode RDF in a TEI document.
Although the markup process was relatively straightforward, specialised scripts
had to be deployed to extract the RDF information in a form suitable for adding to a triple
store. Deploying such scripts is non-trivial for non-technical users, both in setting up the
appropriate environment and in executing the scripts. The scripts used in Jewell’s and
Lawrence’s work were also highly specific to the type of information in those documents,
rather than being more domain-general. These issues with over-specific scripts and associated
implementation difficulties were also seen in a similar script-based approach to the automated
creation of RDF triples from TEI documents, in work performed by the SPQR project. In terms of
implementation and re-use, a more user-friendly alternative is transformation through XSLT
stylesheets, the execution of which is incorporated into the user interface of tools like the
Oxygen XML editor. To avoid or at least reduce over-specificity and encourage re-use of our
materials, the adoption of a more generic underlying model for transformations is an
interesting alternative, as is explored in this paper.
Another tool is available to represent document structure(s) with RDF: the EARMARK
OWL ontology.
The inclusion of RDF in TEI documents is a current area of interest in the TEI community.
Members of the TEI-Ontologies Special Interest Group (SIG) are using XSLTs to convert TEI to RDF, by relating TEI markup to vocabulary in the
CIDOC-CRM cultural heritage model (a recognised ISO standard: ISO 21127). The SIG has also
discussed the inclusion of FRBRoo (a bibliographical records model harmonised with CIDOC-CRM) in the base vocabulary; however, work in this area has not progressed and development has been concentrated
around a TEI-CIDOC harmonisation. This co-operation between TEI and CIDOC-CRM has been
formally active since the formation of the SIG in 2004 and has seen regular but reasonably
slow-paced development, probably due to the other commitments and geographical dispersion of the
researchers involved. Some mappings have been drafted (last updated 2007/8), and stylesheets (last updated 2011) and guidelines (last updated 2010) have been published, but several issues exist that are
hampering the SIG’s progress:
- The approach taken by the SIG requires some changes to be made to TEI, with new
  elements to be added and others to be extended. This raises questions as to the
  applicability of the resulting stylesheet to existing and legacy TEI documents.
- The size of the current TEI P5 tagset, containing hundreds of elements, raises
  practical difficulties in providing a comprehensive mapping from TEI to alternative
  representations. The TEI ontologies SIG has identified a subset of TEI elements to map
  to CIDOC-CRM, choosing only elements which represent semantically meaningful elements
  within the text, “elements such as persons, places, dates and events”. This approach is
  practical but disregards many triples of potential interest within the TEI markup such as
  document structure and metadata. It also limits the scope of output triples to only those
  elements encodable using TEI markup, such as names of places and people.
- It is questionable whether CIDOC-CRM is the best choice of vocabulary to be used for
  modelling textual document information, especially as its only direct representation of
  lexical material is through one class (E33 Linguistic Object) and its two subclasses
  (E34 Inscription, E35 Title). This choice of CIDOC as base model is acknowledged to be
  influenced by the research interests of the SIG members in cultural heritage and museum
  documentation. Particularly for metadata information such as that contained in the TEI
  Header, the Dublin Core model seems a more natural choice and is a highly developed and
  widely adopted ontology. A mapping from TEI to DC has been tackled in stylesheets but does
  not appear in their main approach or considerations.
It is desirable (e.g. for SAWS) to be able to mark up triple-like relations directly in
TEI, particularly if those relations are specific to the subject domain of the original text
and/or if the relations indicate semantic information which cannot currently be encoded using
TEI markup. The <relation> element has recently been recommended by the TEI for encoding RDF
relations in a TEI document, representing the Subject-Predicate-Object triple format through
the following attributes of <relation>: @active, @ref and @passive respectively. This has
increased the expressiveness of standard TEI markup without requiring changes within TEI.
Further, RDF can be included directly in TEI markup, allowing researchers to use the workflow
and tools they are already accustomed to rather than introducing a requirement for new tools
to be learnt and used, external to the existing workflow. This is of particular benefit for
users of TEI who do not have a strong technical background.
Automatic extraction of information from TEI documents
Much information can be extracted from the markup already in a TEI document, particularly
metadata and document structure. This ensures that markup work already invested in
texts can
be extracted from the text and represented in alternative forms that are more amenable
to
querying and automated reasoning. For example, in SAWS, there is an interest in how
the
structure and ordering of wise sayings changes as they are copied from one manuscript
to
another.
Acknowledging the size of the TEI tagset and the associated practical difficulties
in
mapping, we take the minimal subset of TEI needed to encode a document in TEI markup,
TEI-Bare. Work done with this schema serves as a basis for further extensions, for
example to
TEI-Lite, identified as “the most widely used TEI customization”. The Dublin Core Metadata Initiative forms the base model for the mappings from TEI.
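To illustrate the kind of extraction described here, the following Python sketch reads a few TEI header elements and emits Dublin Core statements about the document. It is a sketch under stated assumptions, not the mapping used by SAWS: the three header-to-DC correspondences, the sample document and the document URI are all invented for illustration; only the TEI and Dublin Core namespaces are standard.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
DC_NS = "http://purl.org/dc/elements/1.1/"
NS = {"tei": TEI_NS}

# Assumed mapping from paths inside <teiHeader> to Dublin Core properties.
HEADER_TO_DC = {
    "tei:fileDesc/tei:titleStmt/tei:title": DC_NS + "title",
    "tei:fileDesc/tei:titleStmt/tei:author": DC_NS + "creator",
    "tei:fileDesc/tei:publicationStmt/tei:publisher": DC_NS + "publisher",
}

# Invented sample document, kept to the bare minimum.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample gnomologium</title>
        <author>Unknown compiler</author>
      </titleStmt>
      <publicationStmt>
        <publisher>Example Press</publisher>
      </publicationStmt>
      <sourceDesc><p>Invented sample.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>...</p></body></text>
</TEI>"""


def header_triples(tei_xml, doc_uri):
    """Return (subject, predicate, literal) triples for mapped header elements."""
    root = ET.fromstring(tei_xml)
    header = root.find("tei:teiHeader", NS)
    triples = []
    for path, prop in HEADER_TO_DC.items():
        for el in header.findall(path, NS):
            value = "".join(el.itertext()).strip()
            if value:
                triples.append((doc_uri, prop, value))
    return triples


triples = header_triples(SAMPLE, "http://example.org/doc/sample")
for t in triples:
    print(t)
```

Keeping the mapping in a single table makes it straightforward to extend from a small tagset towards a larger one, which is the direction of travel described above.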
The comparison of TEI and RDF is an oddly emotional topic. The strength of RDF lies
in its
apparent simplicity, and in its interoperability. RDF data is discoverable, and reusable.
An
OAC annotation for instance may have any number of targets of differing types. TEI
allows for
extremely granular expression with a context; RDF may often not require context to
be
meaningful.
Deceptively simple SPO assertions can be combined to tell complex stories. The following
annotation is relatively terse, but conveys much information, all of it easily discoverable
using either SOLR or SPARQL. There is considerable metadata surrounding the individual
annotation indicating what standards were employed, how it was encoded, the creation
date etc.
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dms="http://dms.stanford.edu/ns/" xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:ore="http://www.openarchives.org/ore/terms/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:oac="http://www.openannotation.org/ns/">
  <rdf:Description
      rdf:about="http://example.com/Development/fedora/repository/ilives:112490/AnnotationList">
    <rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
    <rdf:type rdf:resource="http://dms.stanford.edu/ns/AnnotationList"/>
    <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#List"/>
    <rdf:type rdf:resource="http://www.openarchives.org/ore/terms/Aggregation"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.com/Development/fedora/repository/ilives:112490/Canvas">
    <rdf:type rdf:resource="http://dms.stanford.edu/ns/Canvas"/>
  </rdf:Description>
  <rdf:Description
      rdf:about="http://example.com/Development/emic/serve/ilives:112490/AnnotationList/AnnotationList.xml">
    <ore:describes
        rdf:resource="http://example.com/Development/fedora/repository/ilives:112490/AnnotationList"/>
    <rdf:type rdf:resource=""/>
    <dc:format>application/rdf+xml</dc:format>
    <dcterms:modified>2012-07-18T15:52:12-03:00</dcterms:modified>
  </rdf:Description>
  <rdf:Description rdf:about="urn:uuid:C5501895-BEA0-0001-DDE4-99D25F82B940">
    <rdf:type rdf:resource="http://www.openannotation.org/ns/Annotation"/>
    <oac:hasBody rdf:resource="urn:uuid:C5501895-BEE0-0001-DE69-3CF318B030D0"/>
    <oac:hasTarget rdf:resource="urn:uuid:C5501895-BEF0-0001-3047-EF003BF91846"/>
    <dcterms:created>2012-07-18 18:52:12 UTC</dcterms:created>
    <dc:title>New Annotation</dc:title>
  </rdf:Description>
  <rdf:Description xmlns:cnt="http://www.w3.org/2008/content#"
      rdf:about="urn:uuid:C5501895-BEE0-0001-DE69-3CF318B030D0">
    <rdf:type rdf:resource="http://www.w3.org/2008/content#ContentAsText"/>
    <cnt:chars>Sample text for demo</cnt:chars>
    <cnt:characterEncoding>utf-8</cnt:characterEncoding>
  </rdf:Description>
  <rdf:Description rdf:about="urn:uuid:C5501895-BEF0-0001-3047-EF003BF91846">
    <rdf:type rdf:resource="http://www.openannotation.org/ns/ConstrainedTarget"/>
    <oac:constrains
        rdf:resource="http://example.com/Development/fedora/repository/ilives:112490/Canvas"/>
    <oac:constrainedBy rdf:resource="urn:uuid:C5501895-BEF0-0001-BD39-1FD48820108C"/>
  </rdf:Description>
  <rdf:Description xmlns:cnt="http://www.w3.org/2008/content#"
      rdf:about="urn:uuid:C5501895-BEF0-0001-BD39-1FD48820108C">
    <rdf:type rdf:resource="http://www.openannotation.org/ns/SvgConstraint"/>
    <rdf:type rdf:resource="http://www.w3.org/2008/content#ContentAsText"/>
    <cnt:chars><svg:rect xmlns:svg='http://www.w3.org/2000/svg' x='283.5' y='615.5'
        width='377' height='108' r='0' rx='0' ry='0' fill='#ffffff' stroke='#000000'
        style='opacity: 0.7; stroke-width: 2;' opacity='0.7' stroke-width='2'
        ></svg:rect></cnt:chars>
    <cnt:characterEncoding>utf-8</cnt:characterEncoding>
  </rdf:Description>
</rdf:RDF>
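How such an annotation can be followed from the annotation resource to its body can be sketched in Python. The sketch works on a trimmed copy of the annotation above and treats it as plain XML using the standard library; it is not an RDF parser, and a real deployment would more plausibly load the data into a triple store and query it with SPARQL. The function and variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "oac": "http://www.openannotation.org/ns/",
    "cnt": "http://www.w3.org/2008/content#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Trimmed copy of the annotation: only the annotation resource and its body.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:oac="http://www.openannotation.org/ns/"
    xmlns:cnt="http://www.w3.org/2008/content#">
  <rdf:Description rdf:about="urn:uuid:C5501895-BEA0-0001-DDE4-99D25F82B940">
    <rdf:type rdf:resource="http://www.openannotation.org/ns/Annotation"/>
    <oac:hasBody rdf:resource="urn:uuid:C5501895-BEE0-0001-DE69-3CF318B030D0"/>
    <oac:hasTarget rdf:resource="urn:uuid:C5501895-BEF0-0001-3047-EF003BF91846"/>
  </rdf:Description>
  <rdf:Description rdf:about="urn:uuid:C5501895-BEE0-0001-DE69-3CF318B030D0">
    <cnt:chars>Sample text for demo</cnt:chars>
  </rdf:Description>
</rdf:RDF>"""


def annotation_bodies(rdf_xml):
    """Map each annotation URI to the text of its body, following oac:hasBody."""
    root = ET.fromstring(rdf_xml)
    by_uri = {d.get(RDF_ABOUT): d for d in root.findall("rdf:Description", NS)}
    result = {}
    for uri, desc in by_uri.items():
        body_ref = desc.find("oac:hasBody", NS)
        if body_ref is None:
            continue  # not an annotation resource
        body = by_uri.get(body_ref.get(RDF_RESOURCE))
        chars = body.find("cnt:chars", NS) if body is not None else None
        result[uri] = chars.text if chars is not None else None
    return result


bodies = annotation_bodies(SAMPLE)
print(bodies)
```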
An advantage of OAC-style encoding is that embedded tags are not necessary for the
designation of a target. A target may be defined either as SVG coordinates, as in the example
above, or as a span of text starting and stopping at two line/character points. These points
may be inside tagsets, allowing us to mimic overlapping tags without breaking XML validation.
In the following example the RDF targets a body of text beginning with the 6th character, and
being 11 characters long, and ties this back to an authority record.
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:w="http://cwrctc.artsrn.ualberta.ca/#">
  <rdf:Description rdf:ID="ent_1">
    <w:id type="offset">ent_1</w:id>
    <w:parent type="offset">struct_02</w:parent>
    <w:offset type="offset">6</w:offset>
    <w:length type="offset">11</w:length>
    <w:type type="props">place</w:type>
    <w:content type="props">Jean Golfin</w:content>
    <w:term type="info">Golfin, Jean</w:term>
    <w:viafid type="info">2498498</w:viafid>
    <w:ptbnp type="info">243840</w:ptbnp>
    <w:bnf type="info">12092537</w:bnf>
    <w:lc type="info">n85202716</w:lc>
    <w:certainty type="info">definite</w:certainty>
  </rdf:Description>
</rdf:RDF>
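Resolving an offset/length target of this kind against the text of its parent structure can be sketched as follows. The sketch assumes that the offset is zero-based and counted in characters, which the record itself does not state; the sample sentence is invented, and only the offset (6), the length (11) and the content "Jean Golfin" are taken from the record above.

```python
def resolve_target(text, offset, length):
    """Return the substring that an offset/length target points at."""
    if offset < 0 or length < 0 or offset + length > len(text):
        raise ValueError("target lies outside the text")
    return text[offset:offset + length]


# Invented stand-in for the text of the parent structure (struct_02).
structure_text = "Sieur Jean Golfin, traducteur."

span = resolve_target(structure_text, 6, 11)
print(span)
```

Because the target is expressed as positions rather than as embedded tags, two targets may overlap freely without affecting the well-formedness of the underlying XML.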
By moving our structural TEI encoding, still very valuable in its native form, to OAC/RDF
equivalents, we expose relationships based on physical textual coordinates, x/y
coordinates, or structural location.
Use case implementation: illustrating the SAWS usage of TEI and RDF
The requirements for the SAWS project have been described above; namely that we need
to
insert links between sections of text within and between documents (some of which
exist in
digital form, and some of which do not), and to make scholarly assertions in a systematic
way
about the nature of these often complex relationships between sections of text.
First of all, therefore, we must define the basic unit of interest (a ‘section’ or
‘segment’ of text), i.e. the saying (or part of the saying). The SAWS TEI schema, designed at
King’s College London for the encoding of gnomologia, uses the <seg> element to mark up
this unit of intellectual interest, such as a saying (statement) together with its surrounding
story (narrative). For example:
Alexander, asked whom he loved more, Philip or Aristotle, said:
“Both equally, for one gave me the gift of life, the other taught me to live the virtuous
life”.
This contains both a statement and a narrative:
<seg type="contentItem">
  <seg type="narrative">
    Alexander, asked whom he loved more, Philip or Aristotle, said:
  </seg>
  <seg type="statement">
    Both equally, for one gave me the gift of life, the other taught
    me to live the virtuous life.
  </seg>
</seg>
Each of these <seg> elements is given an @xml:id to
provide a unique identifier (which is automatically generated using simple XSLT). This
identifier differentiates one <seg> from all other
examples of <seg>, for instance <seg type="statement" xml:id="K.al-Haraka_ci_s1">, where
K.al-Haraka_ci_s1 is the unique identifier. In other words, it allows each intellectually
interesting unit (as identified by our team’s scholars) to be distinguished from each other
unit, thus providing the means of referring to a specific, often very brief, section of the
text.
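The automatic assignment of identifiers can be illustrated with a short Python sketch. The project itself uses XSLT for this step; Python is used here only for illustration, and the identifier scheme (a document prefix followed by a running number) is a hypothetical one, not the scheme that produces identifiers such as K.al-Haraka_ci_s1.

```python
import xml.etree.ElementTree as ET

# Fully qualified name of the xml:id attribute as ElementTree represents it.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SAMPLE = """<div>
  <seg type="contentItem">
    <seg type="narrative">Alexander, asked whom he loved more, Philip or Aristotle, said:</seg>
    <seg type="statement">Both equally, for one gave me the gift of life,
      the other taught me to live the virtuous life.</seg>
  </seg>
</div>"""


def assign_ids(root, prefix):
    """Give every <seg> that lacks an xml:id a sequential identifier."""
    counter = 0
    for seg in root.iter("seg"):
        counter += 1
        if seg.get(XML_ID) is None:
            # Hypothetical scheme: <prefix>_seg<N>, numbered in document order.
            seg.set(XML_ID, f"{prefix}_seg{counter}")
    return root


root = assign_ids(ET.fromstring(SAMPLE), "sample")
ids = [seg.get(XML_ID) for seg in root.iter("seg")]
print(ids)
```

Existing identifiers are left untouched, so the step can be re-run after new segments are added without invalidating links that already point at older segments.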
Secondly, we must have a systematic way of defining the relationship between one section
of text and another. Using a systematic method is important for two reasons: to ensure
consistency in the descriptive terms that we use across the SAWS project, and to develop
a
shared vocabulary between SAWS and other projects to which we want to make links (and
which
want to link their data to ours). We have therefore taken every possible opportunity
to
explore with other manuscript scholars the terms they need to use to describe the
relationships that they can observe within, and between, their texts. Relationships
identified
include terms such as isCloseRenderingOf, isLooseTranslationOf, isVerbatimOf, and
a variety of
other terms that represent in an agreed form the different ways in which sections
of text are
connected to one another.
We are representing these relationships using an ontology that extends the FRBR-oo
model (the harmonisation of the FRBR model of bibliographic records and the CIDOC Conceptual Reference Model (CIDOC-CRM)). The SAWS ontology, developed through collaboration between domain experts and technical
observers, models the classes and links in the SAWS manuscripts. Basing the SAWS ontology
around FRBR-oo provides most vocabulary for both the bibliographic (FRBR) and cultural
heritage (CIDOC) aspects being modelled. Using this underlying ontology as a basis,
relationships between (or within) manuscripts can be added to the TEI documents using
RDF markup.
To include RDF triples in TEI documents, three entities have to be represented for each
triple: the subject being linked from, the object being linked to, and a description of the
link between them. The subject and object entities in the RDF triple are represented by the
@xml:id that has been given to each of the TEI sections of
interest. We use the TEI element <relation/> (recently
added to TEI) to place RDF markup in the SAWS documents, with four attributes as
follows:
- The value of @active is the @xml:id of the subject being linked from;
- The value of @passive is the @xml:id or URI of the object being linked to;
- The value of @ref is the description of the relationship, which is drawn directly from
  the list of relationships in the ontology;
- The value of @resp is the name or identifier of a particular individual or resource
  (such as a bibliographic reference). Many of the links being highlighted are subjectively
  identified and are a matter of expert opinion, so it is important to record the identity
  of the person(s) responsible.
For example:
<seg type="statement" xml:id="K._al-Haraka_ci_s5">
  برهان ثالث كل محرّك لذاته فهو راجع على ذاته
</seg>
<seg type="contentItem" xml:id="Proclus_ET_Prop.17_ci1">
  Πᾶν τὸ ἑαυτὸ κινοῦν πρώτως πρὸς ἑαυτό ἐστιν ἐπιστρεπτικόν.
</seg>
<relation
  active="http://www.ancientwisdoms.ac.uk/mss/K._al-Haraka#ci_s5"
  ref="http://purl.org/saws/ontology#isCloseRenderingOf"
  passive="http://www.ancientwisdoms.ac.uk/mss/Proclus_ET_Prop.17#ci1"
  resp="http://purl.org/saws/people#wakelnig" />
This is equivalent to stating that the Arabic segment with the xml:id “ci_s5” in the
K._al-Haraka document is a close rendering of the Greek segment identified as “ci1” in
Proclus_ET_Prop.17, and that this relationship has been asserted by Elvira Wakelnig. The
definition of ‘isCloseRenderingOf’ has been agreed upon and documented within the ontology,
and the schema has been populated from the ontology so that a drop-down menu appears in the
XML editor, from which the required value of @ref can be selected. The <relation/> element
can be placed anywhere within the TEI document, or indeed in a separate document if
required: for our own purposes we have found it useful to place it immediately after the
closing tag of the <seg> identified as the “active” entity.
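As a minimal sketch of how such a <relation/> element can be read back out as a triple, the following Python fragment uses only the standard library. It is an illustration rather than the project's own XSLT: the function name relations_to_triples and the wrapping <body> element are assumptions made for the example, while the attribute values are those shown above.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# The <relation/> element from the example above, wrapped in a TEI-namespaced
# <body> element (the wrapper is an assumption made for this sketch).
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <relation
    active="http://www.ancientwisdoms.ac.uk/mss/K._al-Haraka#ci_s5"
    ref="http://purl.org/saws/ontology#isCloseRenderingOf"
    passive="http://www.ancientwisdoms.ac.uk/mss/Proclus_ET_Prop.17#ci1"
    resp="http://purl.org/saws/people#wakelnig"/>
</body>"""


def relations_to_triples(tei_xml):
    """Return one (subject, predicate, object, responsible) tuple per <relation>."""
    root = ET.fromstring(tei_xml)
    triples = []
    for rel in root.iter("{%s}relation" % TEI_NS):
        active = rel.get("active")    # subject being linked from
        ref = rel.get("ref")          # relationship term from the ontology
        passive = rel.get("passive")  # object being linked to
        # Skip incomplete relations rather than guessing a missing part.
        if not (active and ref and passive):
            continue
        triples.append((active, ref, passive, rel.get("resp")))
    return triples


triples = relations_to_triples(SAMPLE)
for subject, predicate, obj, resp in triples:
    print(subject, predicate, obj, "asserted by", resp)
```

Keeping @resp alongside the triple preserves the record of who asserted the link, which matters because many of these relationships are matters of expert opinion.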
Some of the content of our texts could also be enhanced by being viewed in context
by
including information external to the XML document. For this purpose, the SAWS project
will
also use Linked Data principles to mark up our texts with semantic links to collections
of
data on the ancient world, such as the Pleiades historical gazetteer of ancient places and the Pelagios collection of ancient data interlinked through Pleiades references, and the Prosopography of the Byzantine World, which aims to document all the individuals mentioned in textual Byzantine sources
from the seventh to thirteenth centuries. We also plan to mark up links to existing
relevant
documents such as those stored in the Perseus Digital Library (which holds editions of some of the texts we identify as source texts for the
gnomologia).
Examples of transformations from TEI to RDF for the SAWS use case
Taking the SAWS use case as an example, the TEI version of the Kitāb al-Ḥaraka (“Book of
Motion”) held at Ankara Üniversitesi contains the following TEI-Bare-compliant information
in its TEI header:
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Hacı Mahmud Efendi 5683</title>
    </titleStmt>
    <publicationStmt>
      <publisher>Sharing Ancient Wisdoms</publisher>
    </publicationStmt>
    <sourceDesc>
      <msDesc>
        <msContents>
          <msItem>
            <author>(Pseudo-)Aristotle</author>
            <title>Kitab al-Haraka</title>
          </msItem>
        </msContents>
      </msDesc>
    </sourceDesc>
  </fileDesc>
</teiHeader>
Applying the XSLT generates the following Dublin Core triples:
<rdf:Description rdf:about="http://www.ancientwisdoms.ac.uk/mss/HacıMahmud5683#">
<dct:title>Hacı Mahmud Efendi 5683</dct:title>
<dct:creator>(Pseudo-)Aristotle</dct:creator>
<dct:type>TEI/XML</dct:type>
<dct:conformsTo>http://www.tei-c.org/ns/1.0</dct:conformsTo>
</rdf:Description>
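The header-to-Dublin-Core step can be sketched in Python as follows. This is not the project's XSLT, only an illustration of the same mapping under stated assumptions: the function name header_to_dc is hypothetical, the choice of which header elements feed which Dublin Core terms is inferred from the output shown above, and triples are represented as plain tuples rather than serialised RDF/XML.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}
DCT = "http://purl.org/dc/terms/"
DOC_URI = "http://www.ancientwisdoms.ac.uk/mss/HacıMahmud5683#"

HEADER = """<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <fileDesc>
    <titleStmt><title>Hacı Mahmud Efendi 5683</title></titleStmt>
    <publicationStmt><publisher>Sharing Ancient Wisdoms</publisher></publicationStmt>
    <sourceDesc><msDesc><msContents><msItem>
      <author>(Pseudo-)Aristotle</author>
      <title>Kitab al-Haraka</title>
    </msItem></msContents></msDesc></sourceDesc>
  </fileDesc>
</teiHeader>"""


def header_to_dc(header_xml, doc_uri):
    """Map a TEI header to (subject, predicate, literal) Dublin Core triples."""
    root = ET.fromstring(header_xml)
    triples = []
    # The document title comes from titleStmt, not from the msItem title.
    title = root.find("tei:fileDesc/tei:titleStmt/tei:title", NS)
    if title is not None and title.text:
        triples.append((doc_uri, DCT + "title", title.text.strip()))
    # Each manuscript item author becomes a dct:creator.
    for author in root.iterfind(".//tei:msItem/tei:author", NS):
        if author.text:
            triples.append((doc_uri, DCT + "creator", author.text.strip()))
    # Fixed statements about the encoding itself.
    triples.append((doc_uri, DCT + "type", "TEI/XML"))
    triples.append((doc_uri, DCT + "conformsTo", NS["tei"]))
    return triples


dc_triples = header_to_dc(HEADER, DOC_URI)
for triple in dc_triples:
    print(triple)
```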
As an example of structural triples, take SAWS’ TEI version of the Corpus Parisinum
manuscript as stored in the Digby collection in the Bodleian library, Oxford, UK, in which a
<div xml:id="Aristippus01"> section is contained by its parent, <div xml:id="Part01">. From
this we can derive the following two triples:
<rdf:Description rdf:about="http://www.ancientwisdoms.ac.uk/mss/Cod_Bodl_Dig_6#Aristippus01">
<dct:isPartOf>http://www.ancientwisdoms.ac.uk/mss/Cod_Bodl_Dig_6#Part01</dct:isPartOf>
</rdf:Description>
<rdf:Description rdf:about="http://www.ancientwisdoms.ac.uk/mss/Cod_Bodl_Dig_6#Part01">
<dct:hasPart>http://www.ancientwisdoms.ac.uk/mss/Cod_Bodl_Dig_6#Aristippus01</dct:hasPart>
</rdf:Description>
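The derivation of these structural triples can be illustrated with a short Python sketch that walks nested <div> elements and emits an isPartOf/hasPart pair for each parent and direct child. The function name structural_triples and the minimal <body> sample are assumptions made for the example; the project itself performs this step in XSLT.

```python
import xml.etree.ElementTree as ET

TEI_DIV = "{http://www.tei-c.org/ns/1.0}div"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
DCT = "http://purl.org/dc/terms/"
BASE = "http://www.ancientwisdoms.ac.uk/mss/Cod_Bodl_Dig_6#"

BODY = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <div xml:id="Part01">
    <div xml:id="Aristippus01"/>
  </div>
</body>"""


def structural_triples(tei_xml, base_uri):
    """Emit isPartOf and hasPart triples for every identified parent/child div pair."""
    root = ET.fromstring(tei_xml)
    triples = []
    for parent in root.iter(TEI_DIV):
        parent_id = parent.get(XML_ID)
        if parent_id is None:
            continue
        # findall with a bare tag name matches direct children only.
        for child in parent.findall(TEI_DIV):
            child_id = child.get(XML_ID)
            if child_id is None:
                continue
            triples.append((base_uri + child_id, DCT + "isPartOf", base_uri + parent_id))
            triples.append((base_uri + parent_id, DCT + "hasPart", base_uri + child_id))
    return triples


part_triples = structural_triples(BODY, BASE)
for triple in part_triples:
    print(triple)
```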
Resulting benefits for information exploration and retrieval in the SAWS project
We now have the capacity to extract many triples from our TEI document. The TEI-Bare
XSLT allows us to extract RDF triples representing information about the document
structure and metadata about the markup, as encoded in the TEI markup. This XSLT can also
now be simply extended to extract more semantics, by transforming the triples encoded
through the <relation> element into RDF/XML syntax.
Once information is available in RDF format, it can be queried and reasoned with.
Critically, queries can be constructed based around the semantics encoded in the triples. The distribution of knowledge across Linked Data means that logical inferences can
be made to derive new knowledge from the facts, and also from the external data sources
that
have been referenced by the RDF triples.
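To make the idea of querying and inference concrete, here is a deliberately small Python sketch over an in-memory set of triples. It is not a triple store or a reasoner: the helper names match and infer_inverse are hypothetical, and the single inference rule (hasPart as the inverse of isPartOf) is chosen only because both properties appear in the structural example above.

```python
DCT = "http://purl.org/dc/terms/"
BASE = "http://www.ancientwisdoms.ac.uk/mss/Cod_Bodl_Dig_6#"

# One asserted fact, taken from the structural example above.
facts = {(BASE + "Aristippus01", DCT + "isPartOf", BASE + "Part01")}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def infer_inverse(triples, prop, inverse):
    """Add (o, inverse, s) for every (s, prop, o) already present."""
    derived = {(o, inverse, s) for s, p, o in triples if p == prop}
    return set(triples) | derived


knowledge = infer_inverse(facts, DCT + "isPartOf", DCT + "hasPart")
# Query for the parts of Part01, which only the derived triple can answer.
parts_of_part01 = match(knowledge, s=BASE + "Part01", p=DCT + "hasPart")
print(parts_of_part01)
```

In practice a dedicated RDF store with a query language would take the place of these helpers; the sketch only shows that a derived fact becomes queryable in the same way as an asserted one.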
The ability to traverse links between sets of data and discover related information
serendipitously is one of the major benefits of adopting linked data for the SAWS
project. For
the scholars working in SAWS, the study of the links between and within documents
is a central
part of the academic research underpinning this project. Extra assistance in finding relevant information can help discover sources of
interest that might otherwise have been missed, as many potential sources are geographically
scattered, occasionally hard and/or time-consuming to access and may also be completely
unknown outside of a handful of scholars. As an example, the Perseus Digital Library
holds a
collection of Classics-related documents which collectively contain over 68 million
words, as
well as an Arabic collection containing over 5 million words, and other collections. Navigating such quantities of potential research material to find content of
interest is one of the challenges faced by Classics researchers. Digitisation and
cataloguing
of the sources through projects like Perseus has been an important step in facilitating
this
research, and is being enhanced further by semantic navigation such as that undertaken
in the
SAWS project.
To illustrate ways in which linked data specifically assists scholars in the use case
of
SAWS, we look at how the scholars can discover information in new ways, draw from
a broader
set of sources and compile evidence for their research. If, say, a researcher is looking
at
how a particular place of interest is described across different manuscripts, information
in
the Pleiades historical gazetteer can be consulted when constructing queries. Researchers
can
ask to see, for example, all texts that refer to that particular geographical location,
even
if the texts use different place names to refer to that geographical location (as
it was often
the case that places were referred to by different names in different historical periods).
For
SAWS, this helps with the added complication of manuscripts in different languages,
with
different character sets (compare for example Ancient Greek, Arabic). This is possible
through
examining the place names mentioned in the SAWS manuscripts in the context of the
information
in the Pleiades ontology, which gives a precise geographical reference for each place.
For example the place “Aphrodisias” (URI http://pleiades.stoa.org/places/638753) was known
by the names:
- Ninoe (in the Classical period),
- Aphrodeisias (Hellenistic-republican, Roman periods),
- Lelegon polis (unspecified period),
- Stauropolis (Late-antique period),
- Aphrodisias (Roman, Late-antique periods).
In Ancient Greek it is referred to as Ἀφροδισιάς (or Νινόη, Ἀφροδεισιάς, Λελέγων πόλις,
Σταυρόπολις, respectively).
Developing this example, we can disambiguate between the Aphrodisias located in modern-day
Turkey and the Aphrodisias located in modern-day Spain (URI
http://pleiades.stoa.org/places/255978/), which the textual information alone
would not allow us to distinguish.
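The following Python sketch illustrates why linking mentions to gazetteer URIs helps. The two URIs and the name variants are those given above; the MENTIONS list, its segment identifiers and the function names are hypothetical, invented purely to show the lookup, and do not describe actual SAWS data.

```python
TURKEY = "http://pleiades.stoa.org/places/638753"
SPAIN = "http://pleiades.stoa.org/places/255978/"

# Name variants per place, as listed in the text above.
PLACE_NAMES = {
    TURKEY: {
        "Ninoe", "Aphrodeisias", "Lelegon polis", "Stauropolis", "Aphrodisias",
        "Νινόη", "Ἀφροδεισιάς", "Λελέγων πόλις", "Σταυρόπολις", "Ἀφροδισιάς",
    },
    SPAIN: {"Aphrodisias"},
}

# Hypothetical marked-up mentions: (segment id, name as written, place URI).
MENTIONS = [
    ("segment-1", "Ninoe", TURKEY),
    ("segment-2", "Σταυρόπολις", TURKEY),
    ("segment-3", "Aphrodisias", SPAIN),
]


def candidate_places(name):
    """All places a bare name could refer to; more than one means ambiguity."""
    return sorted(uri for uri, names in PLACE_NAMES.items() if name in names)


def segments_for_place(uri):
    """Segments that refer to a place, whatever name or script they use."""
    return [segment for segment, _name, place in MENTIONS if place == uri]


print(candidate_places("Aphrodisias"))  # the name alone matches both places
print(segments_for_place(TURKEY))       # the URI retrieves every variant
```

The name “Aphrodisias” on its own yields two candidate places, whereas a query by URI returns the segments for one place regardless of the period-specific name or the script used.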
Returning to the issues of the SAWS manuscripts being written in various languages
(Ancient Greek and Arabic being the two main languages, and some related documents
in Spanish,
Latin, and English, to date): Although the TEI documents contain transcriptions of
manuscripts
in the original language, the use of RDF and linking allows the manuscript information
to
transcend language boundaries to some extent, as parts of the text can be linked to
resources
which are more language-neutral (e.g. the person “Aristotle” can be represented by
the URI
http://dbpedia.org/resource/Aristotle independently of whether they are
referred to as Aristotle, Ἀριστοτέλης, أرسطو , Aristoteles, Aristóteles or other alternative
forms in the original document). This is particularly helpful in studying the transmission
of
information in the manuscripts across languages, especially if the researcher does
not have
sufficient language skills to navigate between the different languages.
Evaluation of the SAWS implementation
To evaluate the usefulness of this work, researchers on the SAWS project are currently
encoding RDF information into existing TEI versions of manuscripts they are interested in.
After the researchers had discussed which research questions they would like to explore,
the TEI publications and the enhancements possible with the RDF information were
demonstrated at a workshop in June 2012. This highlighted several benefits, in particular
the increased motivation that came from actually seeing how the manuscripts could be
navigated in this format, both through exploring the TEI digital edition and through seeing
the tangible benefits of a semantically enhanced approach.
The demo also prompted useful constructive feedback, leading to further relation types
being identified for the SAWS model. It prompted some interesting scholarly debates as
well, following the identification of different interpretations of the notion of translation
(which would not necessarily have been noticed and acted upon, had the scholars not been
required to collaboratively formalise their tacit knowledge). Following this demo, ongoing
further consultation with manuscript scholars has provided, and will continue to provide,
formative evaluative feedback for further developments.
Future work
A basic TEI to RDF mapping, implemented with an easily extensible transformation
mechanism such as XSLT, provides a firm basis for future development of mappings by both
ourselves and others, to include more of the TEI tagset. More generally, the choice
of TEI
tags being included will be dictated by individual needs (for example, SAWS uses a
specific
customisation of the TEI schema, as mentioned above, so is concentrating on tags used
in that
schema). In particular, we are discussing with collaborators how FRBR-oo can be used
to
enhance the base ontological model for the TEI to RDF mapping, for a richer vocabulary
which
includes more detailed semantics than Dublin Core (given that Dublin Core concentrates
on
modelling metadata and basic structures). We hope to discuss this work with members
of the
Special Interest Group on TEI and ontologies and make contributions to this group’s
work.
Upon determining our mappings, obtaining the data becomes a matter of simple extraction.
The RDF in our example makes direct connections - A is a child of B. Having information
available in RDF is useful not only for what can be done directly with RDF, but for
the
possible transformations from RDF to other data representations. One of this paper’s
authors is working with the image-based manuscript annotation environment Shared Canvas,
which makes use of Open Annotation Collaboration (OAC) syntax for annotations. An OAC
annotation maps neatly to an RDF triple, where an active/subject item has an annotation
with a body of x (e.g. isCloseTranslationOf) and a target of y (e.g. xml:id=GV132874897).
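The correspondence between a triple and an annotation can be sketched in Python as below. The dictionary keys (annotated_item, body, target, creator) and the function names are illustrative assumptions, not terms from the OAC vocabulary, and the sample values reuse the <relation/> example from earlier in the paper.

```python
def triple_to_annotation(subject, predicate, obj, creator=None):
    """Wrap a triple as an annotation record with a body and a target."""
    annotation = {
        "annotated_item": subject,  # the active/subject item
        "body": predicate,          # the relationship term, x in the text
        "target": obj,              # the item linked to, y in the text
    }
    if creator is not None:
        # Further statements, such as who made the annotation, attach here.
        annotation["creator"] = creator
    return annotation


def annotation_to_triple(annotation):
    """Recover the original triple from an annotation record."""
    return (annotation["annotated_item"], annotation["body"], annotation["target"])


triple = (
    "http://www.ancientwisdoms.ac.uk/mss/K._al-Haraka#ci_s5",
    "http://purl.org/saws/ontology#isCloseRenderingOf",
    "http://www.ancientwisdoms.ac.uk/mss/Proclus_ET_Prop.17#ci1",
)
annotation = triple_to_annotation(*triple, creator="http://purl.org/saws/people#wakelnig")
print(annotation)
```

Because the annotation is itself a record, it can carry additional statements without altering the underlying triple, which is what allows relationships to build on relationships.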
OAC-RDF mappings are more complex, but more meaningful. Once our basic mappings are
in
place, we can spin off (or at least establish the framework for) more complex expressions.
Relationships can build on relationships, attaching creators (with foaf tags) to annotations,
which tie bodies of text (further identified by their character encoding) to the target
being
described. There is no real depth limit. The data is all there to be explored, and
the
framework exists to add many layers of metadata.
Islandora is an open source project to allow users to manage a Fedora Repository
through PHP using a Drupal front end. Fedora Repositories are particularly adept at
maintaining and versioning the metadata that accompanies scholarly objects. The Digital
Humanities project is sponsored by EMiC to develop a suite of applications for the
management and critical analysis of Canadian modernism. One of the authors of this paper is
the lead programmer in both these projects, so will be able to incorporate these
transformations into the workflow to expose the data publicly. Of particular interest to our
team is the ability to extract data from the TEI stream to build and maintain authority
lists.
We therefore have several possible avenues of work to explore in this area. Future
development will both require, and foster, collaboration amongst those who are pursuing
the
question of what can be gained from the enhancement of TEI-encoded documents. It is
envisaged
that the outcomes of this research will be applicable across a wide variety of texts,
and it
is hoped that this paper will stimulate interest in new areas of future research into
combining different types of markup.
Acknowledgements
This work partly results from collaborative development between two of the paper authors
initiated at the Interedition 9th bootcamp, Leuven, Belgium, 2012, funded through
COST action
IS0704. The SAWS project is funded by HERA as project 09-HERA-JRP-CD-FP-152 and we
acknowledge
the benefits of this fruitful collaboration with our project partners. In preparing
the final
version of this paper we were assisted by the feedback from several anonymous
reviewers.
References
W. Caxton, The Dictes and Wise Sayings of the Philosophers (originally published London, 1477), reprinted 1877 (Elliot Stock, London)
A. Dekhtyar and I. E. Iacob. A framework for management of concurrent XML markup.
Data & Knowledge Engineering 52(2):185-208, 2005.
M. Doerr, “The CIDOC CRM - an Ontological Approach to Semantic Interoperability of
Metadata”, AI Magazine, Vol. 24, No. 3 (2003)
M. Doerr, and P. LeBoeuf, “Modelling Intellectual Processes: The FRBR – CRM
Harmonization” Digital Libraries: Research and Development, Vol. 4877, pp. 114-123. Springer
(2007). doi:https://doi.org/10.1007/978-3-540-77088-6_11.
Ø. Eide, A. Felicetti, C. Ore, A. D'Andrea, and J. Holmen. Encoding Cultural
Heritage Information for the Semantic Web. In EPOCH Conference on Open Digital Cultural
Heritage Systems, Rome, Italy, 2008.
Hedges, Mark; Jordanous, Anna; Dunn, Stuart; Roueche, Charlotte; Kuster, Marc W.;
Selig, Thomas; Bittorf, Michael; Artes, Waldemar. "New models for collaborative textual
scholarship". In Proceedings of the 6th IEEE International Conference
on Digital Ecosystems Technologies (DEST), Campione d’Italia, Italy.
2012.
H. V. Jagadish, L. V. S. Lakshmanan, M. Scannapieco, D. Srivastava, and N.
Wiwatwattana. Colorful XML: One Hierarchy Isn't Enough. In Proceedings of ACM SIGMOD International Conference on
Management of Data, volume 1, pages 251-262. ACM Press, 2004. doi:https://doi.org/10.1145/1007568.1007598.
M. O. Jewell. Semantic Screenplays: Preparing TEI for Linked Data. In Proceedings
of Digital Humanities, London, UK, 2010.
A. Jordanous, K. F. Lawrence, M. Hedges, and C. Tupman. Exploring
manuscripts: sharing ancient wisdoms across the semantic web. In Proceedings of the 2nd International Conference on Web Intelligence, Mining and
Semantics (WIMS '12), Craiova, Romania. 2012.
K. F. Lawrence. Wherefore Art Thou? - Crowdsourcing Linked Data from Shakespeare to
Dr Who. In Proceedings of Web Science, Koblenz, Germany, 2011.
Christian-Emil Ore and Øyvind Eide. TEI and cultural heritage ontologies: Exchange
of information? Literary and Linguistic Computing 24(2): 161-172, 2009. doi:https://doi.org/10.1093/llc/fqp010.
S. Peroni and F. Vitali. Annotations with EARMARK for arbitrary, overlapping and
out-of order markup. In Proceedings of the 9th ACM symposium on Document engineering,
pages
171-180, Munich, Germany, 2009. doi:https://doi.org/10.1145/1600193.1600232.
E. Pierazzo. A rationale of digital documentary editions. Literary and Linguistic
Computing, 26(4):463-477, 2011. doi:https://doi.org/10.1093/llc/fqr033.
P. Portier, N. Chatti, S. Calabretto, E. Egyed-Zsigmond, and J. Pinon. Modeling,
encoding and querying multi-structured documents. Information Processing & Management.
Forthcoming.
M. Richard, “Florilèges grecs”, Dictionnaire de
Spiritualité V (1962), cols. 475-512
F. Rodríguez Adrados, Greek wisdom literature and the Middle
Ages: the lost Greek models and their Arabic and Castilian Translations (2001),
English translation by Joyce Greer (2009), pp. 91-97 on Greek models; D. Gutas, “Classical
Arabic Wisdom Literature: Nature and Scope”, Journal of the American
Oriental Society, Vol. 101, No. 1, Oriental Wisdom (Jan. -Mar., 1981), pp.
49-86
Solomon, J. (ed)., Accessing antiquity: The computerization of classical studies. Tucson: University of Arizona Press. 1993.
Sanderson, R. Albritton, B. Schwemmer, R. Van de Sompel, H. "SharedCanvas: A
Collaborative Model for Medieval Manuscript Layout Dissemination". Proceedings of
the 11th ACM/IEEE
Joint Conference on Digital Libraries, Ottawa, Canada, June 2011.
B. Tillett, “What is FRBR? A Conceptual Model for the Bibliographic Universe”,
Library of Congress Cataloging Distribution Service, Library of Congress, Vol. 25,
pp.1-8
(2004)
G. Tummarello, C. Morbidoni, and E. Pierazzo. Toward textual encoding based on RDF.
In Proceedings of the 9th International Conference on Electronic Publishing (ELPUB 2005), Kath.
Univ. Leuven, June, pages 57-63. 2005.
Tupman, Charlotte; Hedges, Mark; Jordanous, Anna; Lawrence, Faith; Roueche, Charlotte;
Wakelnig, Elvira; Dunn, Stuart. Sharing Ancient Wisdoms: developing structures for
tracking cultural dynamics by linking moral and philosophical anthologies with their
source and recipient texts. In Proceedings of Digital
Humanities (DH2012), Hamburg, Germany. 2012.