Introduction to the DITA Vocabulary
The DITA standard (Darwin Information Typing Architecture) was originally developed
within
IBM in the late 1990s as an XML application to support the authoring and production
of modular
documentation, especially documentation for IBM software and hardware products intended
primarily for web delivery. DITA builds on architectural ideas developed for the
IBM ID Doc
document type, which had been developed in the early 1990s as an SGML replacement
for IBM's
GML-based BookMaster product. IBM donated DITA to OASIS Open in 2003 and DITA 1.0
was
published as an OASIS standard in 2005. The current version is DITA 1.3, published
in 2015.
The DITA Technical Committee is currently working on DITA 2.0.
DITA is used widely in a number of industries, including software, hardware, publishing,
and government.
DITA's driving requirements are:
-
Modularity: The ability to author atomic units of content that stand alone and that
can be reused in different contexts. DITA calls these modules "topics".
-
Reuse: The ability to reuse content at either the module (topic) level or at the
element level. Reuse can be within a single publication or across multiple
publications.
-
Interoperability: The ability for documents with different local document types and
different element types to be used together within the same publication and to be
processed with the same set of tools with a minimum of document-type-specific
code.
-
Hyperlinking: The ability to create rich hyperlinks within and across
modules.
Around 1989 a meeting was held among the major software vendors of the time, including
IBM, Digital Equipment, HP, Groupe Bull, and others, hosted by Fred Dalrymple and chaired
by
Eve Maler, with the goal of defining a common markup vocabulary to enable the interchange
and
interoperation of documentation among the various vendors. Eliot Kimber and Wayne
Wohler from
IBM attended. IBM's takeaway from the meeting, based on the initial analysis prepared
by Maler
and Jeanne El Andaloussi, was that there was a core set of elements common to all
documents:
titled divisions, paragraphs, lists, figures, tables, etc., but that every group used
different names for these elements.
Out of this meeting many of the attendees went on to develop DocBook through the Davenport
group. Wohler and Kimber formed the core members of the team at IBM that developed
IBM ID Doc,
along with Don Day, the founding Chair of the DITA Technical Committee, and Simcha
Gralla.
IBM was and continues to be a federation of many different divisions, acquired companies,
product groups, and so on, all of which have both common requirements and local requirements.
Our experience with BookMaster, which was a centrally defined and managed monolithic
application intended to be used across all groups within IBM, was that the monolithic
approach
did not work at scale and was unnecessarily restrictive. By the time we started developing
IBM
ID Doc, BookMaster reflected more than 600 element types across a number of distinct
information and publication types and was still growing. We needed something that
would
satisfy the common requirements and ensure consistency and interoperability of content
and
supporting tooling without limiting the ability of individual groups to quickly satisfy
their
local requirements.
The recognition of a universal base set of semantics coupled with HyTime's architectural
forms facility gave us the answer: as long as all elements ultimately map back to
one of the
base types and conform to the minimal content model and attribute requirements of
the base
types, interoperability and interchange would be assured, while still allowing different
groups to optimize the markup they use with minimal constraints. This idea then became
the
basis for IBM ID Doc.
BookMaster also had fairly sophisticated reuse and hyperlinking mechanisms, at least
for
the time, and those requirements were also supported in IBM ID Doc, updated to take
advantage
of SGML technology and HyTime's features for enabling linking, addressing, and re-use
in an
SGML context.
IBM ID Doc used the architectural forms mechanism from the ISO/IEC HyTime standard
to
define a layered architecture by which a core set of base element types could be formally
extended to define new element types that were processable in terms of their base
types. As an
SGML standard, HyTime required the use of SGML-specific features, such as SGML declarations,
features that were not retained in XML.
DITA replaced IBM ID Doc's HyTime-based architectural forms with a simpler mechanism
that
uses attributes to declare each element's type and relationship to its base types.
This is
DITA's @class attribute, which simply specifies the ancestry of a given element as
an ordered
sequence of tokens, one for each ancestor and one for the element type itself. The
syntax of
the DITA @class attribute was designed specifically to work with CSS attribute selectors,
in
particular, the "~=" (token) selector.
This formal declaration of ancestry means that every element can be understood and
processed in terms of any ancestor (or itself) by simple inspection of the @class
attribute
value, avoiding the need for more complex declaration mechanisms as used in HyTime
and IBM ID
Doc, at the cost of requiring either that every element carry the @class attribute or that
documents
be parsed with grammars that can supply default values for attributes.
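As a hedged illustration of the @class syntax, the fragment below shows a small task with the class values written out, as they would appear once attribute defaults have been applied. The task class tokens reflect the standard task module as I understand it; the <para> element and the module short name "dbPara-d" are hypothetical and anticipate the dbParaDomain example used later in this paper.

```xml
<!-- Sketch only: class values are normally supplied as grammar defaults,
     not typed by authors. -->
<task id="install" class="- topic/topic task/task ">
  <title class="- topic/title ">Installing the widget</title>
  <taskbody class="- topic/body task/taskbody ">
    <steps class="- topic/ol task/steps ">
      <step class="- topic/li task/step ">
        <cmd class="- topic/ph task/cmd ">Unpack the archive.</cmd>
        <info class="- topic/itemgroup task/info ">
          <!-- Hypothetical domain element: the leading "+" marks a domain module. -->
          <para class="+ topic/p dbPara-d/para ">Keep the packing list.</para>
        </info>
      </step>
    </steps>
  </taskbody>
</task>
```

A processor that knows only the base vocabulary can treat <para> as a paragraph by matching the token "topic/p", and a CSS rule such as [class~="topic/p"] would select both <p> and <para>.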
One interesting side effect of this attribute-based declaration mechanism is that
DITA
documents do not require grammars to be processed, or even necessarily to be validated,
as all
the information needed to understand any conforming DITA document in terms of its
DITA-defined
semantics is explicit in the document itself. Any conforming DITA document can be
transformed
into a document where all the element types are base types defined in the DITA standard,
which
can then be validated against the standard DITA grammars, enabling validation of conformance
to at least the minimum requirements defined by the DITA standard.
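The transformation to base types (generalization) can be sketched as follows, assuming a small task document as input. Each element is renamed to the first type named in its @class value; this sketch assumes that the original @class values are kept so that the specialized meaning remains recoverable. The content is invented for illustration.

```xml
<!-- Generalized form of a small task: every element now uses its base type name.
     The class values still record the original specialized types. -->
<topic id="install" class="- topic/topic task/task ">
  <title class="- topic/title ">Installing the widget</title>
  <body class="- topic/body task/taskbody ">
    <ol class="- topic/ol task/steps ">
      <li class="- topic/li task/step ">
        <ph class="- topic/ph task/cmd ">Unpack the archive.</ph>
      </li>
    </ol>
  </body>
</topic>
```

The result uses only element types from the base topic vocabulary, so it can be checked against the standard base grammar without access to the specialized grammar.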
There are several intended audiences for DITA customization:
-
People configuring their local DITA environment to reflect local requirements by
doing "configuration", for example, omitting DITA modules that are not needed. This
audience is not necessarily a dedicated DITA practitioner or document type
designer.
-
People configuring their local DITA environment to add additional constraints on top
of existing DITA document types, essentially a continuation of the configuration
activity in the first item ("constraints").
This requires more facility with grammars but can be supported through interactive tools,
as the activity is fundamentally a matter of removing things you don't want,
changing repeating OR groups to sequences, or making optional elements or attributes
mandatory.
-
Specialists defining new structural types (maps and topics) or new mix-in modules
(domains), that provide new attributes or element types that are specializations of
existing types ("specialization"). This requires more traditional document type analysis
and implementation skills, although some simple types of specialization are quite
easy
and do not require any specialized skills beyond the ability to create simple grammar
modules.
The use of customization varies widely within the DITA user community: some organizations
refuse to do any customization, using the grammars as provided by the DITA Technical
Committee, while others specialize almost every element type they need. Simple configuration
is fairly common, in part because in practice it is a prerequisite for any kind of
constraint or specialization. Specialization is less common, although
many DITA
users will do simple specializations such as defining new conditional attributes.
Modern
DITA-aware content management systems require some configuration and specialization
in order
to use CMS-specific attributes and elements, such as attributes to capture CMS-specific
object
IDs or identifiers.
The DITA standard defines two base types of document: maps and topics.
Topics are the atomic unit of content authoring and delivery. A topic must have a
title
and may have a body that contains content elements and may have nested topics, creating
a
titled hierarchy within a single topic document. Topics may also have descriptive
metadata.
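A minimal sketch of a topic document, using only base element types, may make the structure concrete. The IDs and text are invented, and the document type declaration is omitted for brevity.

```xml
<topic id="widget-overview">
  <title>Widget overview</title>
  <!-- Optional descriptive metadata -->
  <prolog>
    <metadata>
      <keywords><keyword>widget</keyword></keywords>
    </metadata>
  </prolog>
  <!-- Optional body with content elements -->
  <body>
    <p>The widget converts input to output.</p>
  </body>
  <!-- Optional nested topic, creating a titled hierarchy -->
  <topic id="widget-limits">
    <title>Known limits</title>
    <body>
      <p>The widget handles one input at a time.</p>
    </body>
  </topic>
</topic>
```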
Maps are collections of hyperlinks that serve to create some kind of publication
structure, such as a traditional book structure, a web site, or some other structure
for
whatever purpose. The links within a map may be to DITA topics or to any non-DITA
resource.
Maps can also define links among the resources linked to by the map (external links
in XLink
terminology but using a different syntactic approach).
Topics can be published in isolation but are usually combined with other topics in
the
context of maps.
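A map that organizes topics into a simple hierarchy, links to a non-DITA resource, and defines a link between two topics might look like the following sketch. The file names and URL are hypothetical.

```xml
<map>
  <title>Widget User Guide</title>
  <!-- Nesting of topicrefs defines the publication hierarchy -->
  <topicref href="widget-overview.dita">
    <topicref href="widget-install.dita"/>
    <!-- A link to a non-DITA resource -->
    <topicref href="https://example.com/widget-faq.html" format="html" scope="external"/>
  </topicref>
  <!-- A relationship table defines links among the referenced resources -->
  <reltable>
    <relrow>
      <relcell><topicref href="widget-install.dita"/></relcell>
      <relcell><topicref href="widget-overview.dita"/></relcell>
    </relrow>
  </reltable>
</map>
```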
With DITA version 1.3, RELAX NG is the grammar language used for the master DITA grammars,
with DTD and XSD versions generated automatically. However, most DITA users use DTD-based
grammars, both for historical and practical reasons. XSD grammars are less used but
are needed
by tools that only understand XSD, such as some XML editors.
Because DITA relies heavily on attributes with default values, use of RELAX NG for
DITA
requires implementation of the RELAX NG DTD compatibility specification, which until
recently
was not generally available. George Bina has implemented support for DTD compatibility
in
Java, making it generally available to Java-based tools, which is the vast majority
of DITA
processing implementations. Since then, the community has started to increase the
direct use
of RELAX NG for DITA documents, although it is still a tiny fraction of DTD-based
users.
The DITA Technical Committee implemented an RNG-to-DTD-and-XSD conversion tool for
generating DTD and XSD versions of all the TC-defined modules and document type shells.
This
tool is available through GitHub [RNG2DTD].
Modularity and Customization
DITA defines a modular architecture for grammars, independent of the grammar technology
used. The RELAX NG schemas defined by the DITA standard are normative, with DTD and
XSD
versions generated from the RELAX NG grammars. All three grammar languages reflect
the same
modular architecture, although XSD 1.0 limitations on extension and override make
the XSD
implementation pattern slightly different from the RNG and DTD patterns, which are
as similar
as it is possible for them to be.
The DITA specification defines the following module types:
-
Structural modules define top-level types, either maps or topics. Map types
represent top-level document types because maps cannot be literally nested within
a
single document instance. Topic types represent either top-level document types or
subelements because topics can be literally nested within a single document
instance.
-
Domain modules define sets of element types that can be "mixed in" to structural
types to add new element types or attributes. The element and attribute types defined
in
domain modules are always specializations so they serve to extend the base grammar
such
that the domain-provided types are allowed anywhere their base types are allowed.
-
Constraint modules define constraints on the structural and domain types included
within a given DITA document type. Constraint modules can impose any constraint as
long
as the result is no less constrained than the base. For example, a constraint can
change
an OR group into a sequence or disallow optional elements but cannot allow elements
where they would not otherwise be allowed or make mandatory elements optional.
An essential aspect of the DITA architecture is that DITA grammar modules are invariant
for a given version in time, meaning that every copy of a given module should be identical.
That is, one should never directly modify any DITA grammar
module. All customization is thus done indirectly through the customization facilities
defined
by the DITA specification. The invariance of modules is essential to making DITA interchange
and interoperation work. It also means that, in theory, documents need only name the
modules
they use: processors could dynamically construct the actual grammars needed for validation,
although no such tools have been developed to date.
The DITA standard also defines a set of grammar coding patterns that, while not normative,
are reflected in the grammar modules developed by the DITA technical committee and
by most
DITA practitioners. This tends to make the implementation details of DITA grammars
remarkably
consistent across the DITA community. It also enables automated tools that can work
with DITA
grammars reliably.
DITA modules are "integrated" in the context of document type "shells" that serve
to
combine a set of either map or topic modules with zero or more domain modules and
zero or more
constraint modules. Map and topic types may not be combined within the same document
type as
map documents may not literally contain topics.
The DITA standard defines the concept of a "DITA document type", which is simply a
unique
set of modules.
Two documents that use the same set of modules by definition have the same DITA document
type, irrespective of the actual grammar files, if any, used to validate documents.
DITA document elements use an attribute, @domains, to declare the modules used (or
allowed
or expected to be used) with the document. Thus any two DITA documents can be compared
to
determine if they do or do not reflect the same DITA document type. This makes them
completely
independent of the use of any particular grammar file.
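As a rough sketch of what such a declaration looks like in DITA 1.3 syntax as I understand it, the root element below names a structural module, a constraint module, two element domains, and one attribute domain. The value is normally supplied as a grammar default rather than typed by authors; "dbPara-d" is the hypothetical domain used as an example later in this paper.

```xml
<!-- Each parenthesized group names one module by its ancestry;
     the a(...) form names an attribute specialization. -->
<task id="install" xml:lang="en"
      domains="(topic task) (topic task strictTaskbody-c) (topic hi-d) (topic dbPara-d) a(props deliveryTarget)">
  <title>Installing the widget</title>
</task>
```

Two documents whose @domains values name the same set of modules have the same DITA document type, whatever grammar files were used to validate them.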
DITA customization involves the following basic types of modification to the base declarations:
-
For any element type or the attributes @base and @props, allowing specializations
of
that type to occur wherever the base type is allowed (domain extension)
-
For any element type, allowing constraint of its content model
Within content models, every element type is represented by an extensible or over-ridable
component: named pattern (RNG), parameter entity (DTD), named group (XSD). Individual
attributes are not extensible so there is no need to represent them using extensible
components.
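For attribute specialization, a minimal RNG sketch of an attribute domain might look like the following. The attribute name "region" and the pattern name "regionAtt-d-attribute" are hypothetical; "props-attribute-extensions" is assumed to be the extension pattern for @props specializations, following the TC-defined attribute domain modules as I recall them.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- Hypothetical attribute domain: @region as a specialization of @props -->
  <define name="regionAtt-d-attribute">
    <optional>
      <attribute name="region"/>
    </optional>
  </define>
  <!-- Extend the (assumed) @props extension pattern with the new attribute -->
  <define name="props-attribute-extensions" combine="interleave">
    <ref name="regionAtt-d-attribute"/>
  </define>
</grammar>
```

Because the extension is applied to a single shared pattern, the new attribute becomes available wherever @props is allowed, without touching any element declaration.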
Content models and attribute lists are defined using over-ridable components, making
it
easy to override them in order to impose constraints (or as easy as it can be for
XSD 1.0,
which is not always very easy due to limitations in the XSD redefine feature).
In addition to general content model configuration, each topic type provides an
over-ridable component for defining the set of topic types that may be literally nested
within
the topic, if any. Each topic type module defines a default value for this component
(typically just allowing the topic type to nest itself, if nesting is allowed at all)
and then
document type shells may override this configuration as needed.
Domain Integration
A key aspect of DITA customization is "integrating" domains.
Domain modules provide new element types that are specializations of base types (and
that are not themselves map or topic or any specialization of map or topic).
Domain elements are "mixed in" such that anywhere a given domain-provided element's
base
is allowed the domain-provided element is allowed. This makes integration easy but
means
that domain elements can occur anywhere that the base is allowed, which may not always
be
what is desired. In this case it is possible to use constraints to limit where
domain-provided elements can occur.
For example, consider a domain "dbParaDomain" that defines a specialization <para>
of
the base element type <p> (paragraph). When the domain is integrated into a DITA document
type shell, the element type <para> will be available wherever <p> is allowed.
In DTD syntax this is done by overriding the declaration of the parameter for the
<p>
element to also include <para> in the document type
shell:
<!-- Document type shell -->
...
<!-- Inclusion of base element type parameter entity declarations -->
<!ENTITY ... SYSTEM ...>
%...;
<!-- Inclusion of dbParaDomain parameter entity declarations -->
<!ENTITY ... SYSTEM ...>
%...;
...
<!ENTITY % p "p | %dbPara-d-p;">
...
<!-- Inclusion of base element type element type declarations -->
<!ENTITY ... SYSTEM ...>
%...;
<!-- Inclusion of dbParaDomain element type declarations -->
<!ENTITY ... SYSTEM ...>
%...;
<!-- End of document type shell -->
Where the parameter entity %dbPara-d-p is declared
as:
<!ENTITY % dbPara-d-p
"para"
>
(The name "dbPara-d-p" is read as "specializations of <p> provided by the dbPara
domain".)
Within a content model, any reference to "%p;" now expands to "p |
para":
<!ENTITY % body.content
"(%p; |
%fig; |
%table; |
%section;)*
"
>
If the desire on the part of the document type shell author is to allow <para> but
not <p>, that can be done in the shell by simply omitting "p |" from the declaration
of
the %p parameter
entity:
<!-- Only allow <para>: -->
<!ENTITY % p "%dbPara-d-p;">
Now references to %p will expand to "para", not "p | para".
This omission of <p> in the shell is technically a constraint but the DITA standard
does not require a separate module file for it.
RELAX NG Configuration
RELAX NG makes combining DITA grammar modules about as easy as it can be. Unfortunately,
because DITA also uses DTDs and it must be possible to generate those DTDs from the
RELAX NG
grammars, DITA RNG grammars defined by the DITA Technical Committee cannot use RNG
features
that are not available in DTDs, such as <notAllowed> patterns or context-specific
patterns.
However, DITA RNG grammars can take advantage of an important RELAX NG feature, the
ability for one pattern to unilaterally extend another pattern. This allows DITA domain
modules to be "self integrating". It is this feature of RELAX NG that motivated the
Technical Committee to make RNG the master grammar language for DITA from which DTD
and XSD
versions are generated. Self-integrating domains make setting up new DITA document
type
shells about as easy as it can be for an otherwise unaided human.
Each element type has a corresponding pattern name for the element that includes the
element type
itself:
<define name="p">
  <ref name="p.element"/>
</define>
Domain modules define patterns that include all the element types in the domain that
are
specializations of a given base
element:
<define name="dbPara-d-p">
  <choice>
    <ref name="para.element"/>
  </choice>
</define>
The domain can then extend the element type pattern using the domain-defined element
choice
pattern:
<define name="p" combine="choice">
  <ref name="dbPara-d-p"/>
</define>
Which has the effect of making the effective value of the "p"
pattern:
<define name="p">
  <choice>
    <ref name="p.element"/>
    <ref name="para.element"/>
  </choice>
</define>
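Putting the pieces together, a complete self-integrating domain module for the hypothetical dbParaDomain might be sketched as below. The pattern naming follows the conventions described above; "para.cnt" and "univ-atts" are assumed to be the base content and attribute patterns supplied by the TC-defined modules, and the @class default uses the hypothetical module short name "dbPara-d".

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
  <!-- All specializations of <p> provided by this domain -->
  <define name="dbPara-d-p">
    <choice>
      <ref name="para.element"/>
    </choice>
  </define>
  <!-- Self-integration: extend the base "p" pattern from within the module -->
  <define name="p" combine="choice">
    <ref name="dbPara-d-p"/>
  </define>
  <!-- Overridable content model (assumes the base pattern para.cnt) -->
  <define name="para.content">
    <zeroOrMore>
      <ref name="para.cnt"/>
    </zeroOrMore>
  </define>
  <define name="para.element">
    <element name="para">
      <ref name="para.attlist"/>
      <ref name="para.content"/>
    </element>
  </define>
  <!-- Attribute list, including the defaulted @class ancestry declaration -->
  <define name="para.attlist" combine="interleave">
    <ref name="univ-atts"/>
    <optional>
      <attribute name="class" a:defaultValue="+ topic/p dbPara-d/para "/>
    </optional>
  </define>
</grammar>
```

A shell that includes this module alongside the base topic module needs no further integration declarations for <para> to be allowed wherever <p> is allowed.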
If the desire is to omit the base element but allow specializations, then the base
type's pattern must be redefined in the document type
shell:
<grammar ...>
  ...
  <div>
    <include href="topicMod.rng">
      <define name="topic-info-types">
        <ref name="topic.element"/>
      </define>
      <define name="p">
        <!-- No p allowed, only specializations -->
      </define>
    </include>
    ...
  </div>
  ...
</grammar>
RELAX NG document type shells are just sets of references to modules plus any
constraints that can or should be defined in the shell, rather than in separate modules.
RNG
shells must also provide special declarations for attributes of type ID due to a quirk
in
the RELAX NG design.
Because domain modules are self integrating, there is no need for separate domain
integration patterns as there is for DTDs.
In addition, RELAX NG only requires a single file for each module, while DTDs require
two files for each structural and element domain module, one for parameter entities
and one
for the element type and attribute declarations. Attribute domains only require a
single
file in DTD syntax and in RNG.
Map type grammars only involve the inclusion of domain modules and constraints because
maps cannot nest the way topics can.
Topic modules also allow configuration of the allowed topic nesting for each topic
type
integrated into the document type
shell:
<grammar ...>
  ...
  <div>
    <include href="topicMod.rng">
      <define name="topic-info-types">
        <ref name="topic.element"/>
      </define>
      <define name="p">
        <!-- No p allowed, only specializations -->
      </define>
    </include>
    ...
  </div>
  ...
</grammar>
Here the shell simply allows the topic type "topic" to nest itself. If the shell
included other topic types it could allow those to be nested as well.
Each topic type provides its own topic-type-specific topic nesting pattern, allowing
different topic types within the same shell to have different nesting rules.
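A shell that integrates more than one topic type could give each type its own nesting rule, as in the following sketch. It assumes that conceptMod.rng and taskMod.rng expose "concept-info-types" and "task-info-types" patterns analogous to "topic-info-types"; the file paths are hypothetical, and domain includes and the ID-related declarations are omitted.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <ref name="concept.element"/>
  </start>
  <include href="topicMod.rng">
    <!-- Generic topics may nest only generic topics -->
    <define name="topic-info-types">
      <ref name="topic.element"/>
    </define>
  </include>
  <include href="conceptMod.rng">
    <!-- Concepts may nest concepts or tasks -->
    <define name="concept-info-types">
      <choice>
        <ref name="concept.element"/>
        <ref name="task.element"/>
      </choice>
    </define>
  </include>
  <include href="taskMod.rng">
    <!-- Tasks may nest only tasks -->
    <define name="task-info-types">
      <ref name="task.element"/>
    </define>
  </include>
</grammar>
```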
This is the one place in DITA where a document type shell can make the document type
less constrained rather than more constrained. However, it makes sense because maps,
via
hyperlinks, can create arbitrary hierarchies of topics of any type, so allowing topics
to
literally nest within a single XML document is really more of a convenience for authoring
or
storage and any constraint on topic nesting imposed by a shell is not (directly) enforceable
for topics combined using maps.
Constraints that are not done directly in the document type shell are done by replacing
a reference to a module with a reference to the constraint module that then redefines
patterns in the original module in its reference to the original
module:
<grammar ...>
  <!-- Shell for the constrained task topic type -->
  ...
  <div>
    <a:documentation>CONTENT CONSTRAINT INTEGRATION</a:documentation>
    <include href="strictTaskbodyConstraintMod.rng">
      <define name="task-info-types">
        <ref name="task.element"/>
      </define>
    </include>
  </div>
  ...
</grammar>
Where strictTaskbodyConstraintMod.rng
is:
<grammar ...>
  <div>
    <a:documentation>CONTENT MODEL OVERRIDES</a:documentation>
    <include href="taskMod.rng">
      <define name="taskbody.content">
        <optional>
          <ref name="prereq"/>
        </optional>
        <optional>
          <ref name="context"/>
        </optional>
        <!-- section omitted -->
        <optional>
          <choice>
            <ref name="steps"/>
            <ref name="steps-unordered"/>
            <!-- steps-informal omitted -->
          </choice>
        </optional>
        <optional>
          <ref name="result"/>
        </optional>
        <optional>
          <ref name="tasktroubleshooting"/>
        </optional>
        <optional>
          <ref name="example"/>
        </optional>
        <optional>
          <ref name="postreq"/>
        </optional>
      </define>
    </include>
  </div>
</grammar>
The constraint module includes the base module being constrained, in this case the
TC-defined taskMod.rng, and redefines any patterns defined within the referenced module
(or
any modules it references). This is an example of constraining an element's content
model by
overriding the element's content model pattern.
The base declaration for taskbody.content
is:
<define name="taskbody.content">
  <zeroOrMore>
    <choice>
      <ref name="prereq"/>
      <ref name="context"/>
      <ref name="section"/>
    </choice>
  </zeroOrMore>
  <optional>
    <choice>
      <ref name="steps"/>
      <ref name="steps-unordered"/>
      <ref name="steps-informal"/>
    </choice>
  </optional>
  <optional>
    <ref name="result"/>
  </optional>
  <optional dita:since="1.3">
    <ref name="tasktroubleshooting"/>
  </optional>
  <zeroOrMore>
    <ref name="example"/>
  </zeroOrMore>
  <zeroOrMore>
    <ref name="postreq"/>
  </zeroOrMore>
</define>
Comparing the two versions of the taskbody.content pattern, you can see that the
constrained version omits <section> and <steps-informal> and replaces the initial
repeating OR group with a sequence.
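To make the effect concrete, the following invented task instance is a sketch of content that satisfies both the base and the constrained taskbody.content patterns shown above.

```xml
<task id="replace-filter">
  <title>Replacing the filter</title>
  <taskbody>
    <!-- Constrained model: at most one prereq, then at most one context, in that order -->
    <prereq>Switch off the power.</prereq>
    <context>The filter sits behind the front panel.</context>
    <steps>
      <step><cmd>Remove the front panel.</cmd></step>
      <step><cmd>Swap the filter.</cmd></step>
    </steps>
    <result>The warning light goes out.</result>
    <postreq>Recycle the old filter.</postreq>
  </taskbody>
</task>
```

Adding a <section>, a second <prereq>, or placing <context> before <prereq> would still be accepted by the base pattern but rejected by the constrained one, which is the intended tightening.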
DTD Syntax Customization
DTD customization is similar to RNG customization structurally but has to account
for
the limitation in DTDs that parameter entities must be declared before they can be
referenced and the first declaration of a given parameter entity name wins.
This means that, except for attribute domains, all modules require two files, one
for
parameter entities and one for element type and attribute list declarations.
In addition, domain element integration must be done in document type shells, as shown
above.
Constraint modules have the additional challenge that they must declare every parameter
entity referenced by the parameter entities the constraint module overrides, which
can make
for a lot of cutting and pasting (the RNG-to-DTD conversion tool automates this cutting
and
pasting for a number of constraint patterns).
Otherwise, the customization pattern is conceptually the same as for RNG:
-
Document type shells include the structural and domain modules that make up the
document type, as well as any constraint modules.
-
Every element type has a corresponding parameter entity used for domain
integration.
-
Every element type has corresponding %*.content and %*.attlist parameter entities
that can be overridden to constrain the content model or attribute list of that
element type.
XSD Syntax Customization
XSD customization is complicated by the need to use the XSD 1.0 redefine facility,
which
allows redefinition of groups in a way that is conceptually similar to RNG pattern
redefinition.
However, the XSD 1.0 redefine feature presents a couple of challenges:
-
The feature is defined ambiguously such that different processors can implement it
in incompatible ways, only one of which works for DITA, which happens to be the way
that the Apache Xerces parser implements it.
-
The requirement for "particle preservation" in redefined models.
The particle preservation requirement is defined as follows: The definitions within
the
<redefine> element itself are restricted to be redefinitions of components from the
redefined schema document, in terms of themselves. That is,
-
Type definitions must use themselves as their base type definition;
-
Attribute group definitions and model group definitions must be supersets or
subsets of their original definitions, either by including exactly one reference to
themselves or by containing only (possibly restricted) components which appear in
a
corresponding way in their redefined selves.
This means that any redefinition of a model must reflect each of the
particles in the original model. For choice groups this is not a problem: any choice
group
is a valid restriction, including an empty group. But for sequence groups it is a
serious
problem, in that you cannot simply omit items from the sequence as part of the
redefinition.
This requires a workaround where you refactor the original sequence into a sequence
of
named groups that then allow redefinition.
XSD 1.1 includes a new feature, override, that allows for direct specification of
the
kinds of constraints DITA needs. Unfortunately, the XSD 1.1 specification is not widely
implemented so the DITA standard cannot use it for TC-defined XSD grammars.
For DITA 2.0 the TC has decided to not provide modular XSD versions of the TC-defined
modules, although it may provide non-modular versions as a convenience. Non-modular
XSDs are
XSD schemas that do not use redefine, including any constraints in place of the original
base declarations, avoiding the need for redefines or overrides. It should be relatively
straightforward to generate a single-file XSD version of any RNG document type shell.
See [Kimber1] for details.
Interchange and Interoperability
For DITA, interchange and interoperability apply to the following areas:
-
Interchange and interoperability of documents
-
Interchange and interoperability of working grammars
-
Interchange and interoperability of processing
-
Interchange and interoperability of knowledge
Interchange and Interoperability of Documents
DITA maximizes interchange and interoperability of documents by ensuring that any
conforming DITA document can be processed in at least a minimal but correct way by
any
general-purpose DITA processor. DITA's hyperlink-based approach for combining individual
topics into complete publications allows any DITA document to be used with any other
DITA
documents.
DITA provides two primary forms of re-use:
-
Use of topics by reference from maps
-
Use of individual elements within topics or maps by reference.
Reuse of topics from maps is not inherently constrained, meaning any DITA topic can
be
used from any map. Maps can be designed to impose constraints on the kinds of topics
allowed
by a particular kind of reference and, through specialization of the hyperlinking
elements
in a map, specific structural rules can be imposed, but the vocabulary details of
topics do
not impose any constraints on how topics may be used from maps.
Reuse of individual elements is constrained such that a given element can only re-use
an
element of the same type or a more specialized type. This rule ensures that the effective
document resulting from the reuse is still valid with respect to the document type
of the
using document. Compare with XInclude, which allows any element of any type to be
used in
any context where the grammar allows xi:include to occur.
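The element-level rule can be illustrated with DITA's content reference attribute, @conref, which addresses a target as topic ID plus element ID. The sketch below keeps the source and the using content in one invented topic so that it is self-contained; in practice the target is usually in another file.

```xml
<topic id="reuse-demo">
  <title>Reuse by reference</title>
  <body>
    <section id="source">
      <title>Source content</title>
      <p id="caution">Disconnect power before opening the case.</p>
      <ul id="tools"><li>Screwdriver</li></ul>
    </section>
    <section id="use">
      <title>Using content</title>
      <!-- Allowed: a <p> reuses a <p> (same type) -->
      <p conref="#reuse-demo/caution"/>
      <!-- Not allowed: a <p> may not reuse a <ul> (unrelated type) -->
      <!-- <p conref="#reuse-demo/tools"/> -->
    </section>
  </body>
</topic>
```

By the same rule, a <p> could reuse an element specialized from <p>, such as the hypothetical <para> from the dbParaDomain example, but a <para> could not reuse a plain <p>.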
The DITA standard as originally defined imposed more strict constraints on element-level
reuse, requiring that the DITA document types of the two documents involved be "compatible"
such that the document type of the element being reused was not less constrained than
the
document type of the document making the reuse reference. The intent was to ensure
that
constraints imposed on the using document were not circumventable by the reuse.
In practice, this constraint has been rarely enforced by tools or desired by user
communities. It leads to annoying limitations, for example, being unable to reuse
elements
in more-general topic types from more-specialized types where the reuse would otherwise
be
fine in the context of the local content rules.
In DITA 1.3 the constraint requirement was relaxed so that unconstrained reuse is
now
the default behavior.
Interchange and Interoperability of Grammars
DITA's modular approach to grammar organization allows grammar modules to be
interchanged reliably because the defining modules are never modified (every copy
of a given
version in time of a module should be identical). The coding patterns and extension
mechanisms used in the DITA grammar files allow DITA modules to be used together with
a
minimum of effort.
In the context of a DITA-aware tool like OxygenXML, using new DITA grammars is as
simple
as deploying the grammar-providing plugins to the DITA Open Toolkit used by Oxygen.
Those
document types can then immediately be used to create new documents, edit documents
that use
those document types with full DITA functionality automatically available (because
Oxygen's
configuration is specialization aware and thus can be applied to any DITA document
without
further configuration effort), and apply DITA processing to those documents.
Interchange and Interoperability of Processing
Because specialization-aware processors can handle any DITA document in at least a
minimal way, processing is inherently interchangeable at that level. The DITA standard
also
defines requirements for invariant processing where processing must be consistent
to ensure
interoperability and consistency of results, for example address resolution and
use-by-reference resolution. It also provides processing suggestions for elements
that most
users would expect to be processed or rendered a certain way.
Beyond that, the modular nature of DITA grammars maps naturally to modular software
approaches, such as plugin-based frameworks. Where such software exists, such as DITA
Open
Toolkit and OxygenXML, processing for new specializations can usually be added by
providing
software modules that simply extend the base processing to handle the specializations
as
needed.
In addition, because all DITA elements can be processed in terms of their base,
specializations that do not require any special processing do not require configuration
or
processing support simply to account for a new element type or attribute.
For example, once a new specialization module has been packaged as an Open
Toolkit plugin and deployed to the Open Toolkit used by OxygenXML, OxygenXML
immediately enables visual editing of the new specialization by
applying fallback processing to all the specialized elements. If the new specialization
requires
some special configuration, such as unique styling, that can be added by defining
a new
OxygenXML document type framework that is an extension of the built-in DITA framework,
re-using all the existing style sheets and only requiring new styles for the new
specializations where the base styling is not what you want.
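The fallback behavior described above can be sketched in XSLT. This is a minimal illustration, not the Open Toolkit's actual stylesheet: it assumes the parser has applied the grammar's default @class values, and it uses the "dbPara-d/para" specialization from the "DocBook paragraph" domain example later in this section. The HTML output elements are chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Base rule: matches <p> and every specialization of it, because each
       specialization's @class value contains the token " topic/p ". -->
  <xsl:template match="*[contains(@class, ' topic/p ')]">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

  <!-- Optional extension rule for one specialization. The explicit priority
       makes it win over the base rule, which matches the same elements. -->
  <xsl:template match="*[contains(@class, ' dbPara-d/para ')]" priority="10">
    <p class="para">
      <xsl:apply-templates/>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

If the second template is removed, <para> elements are still rendered by the base rule, which is the sense in which a specialization that needs no special processing needs no processing support at all.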
Interchange and Interoperability of Knowledge
The coding patterns for DITA grammar modules and document type shells defined in the
DITA standard mean that the knowledge of how to use, configure, and
customize DITA grammars is reusable and interoperable. That is, any person who understands
the DITA coding patterns should be able to immediately understand and use the document
type
shells, specialization modules, and constraint modules developed by any other DITA-aware
person. These coding patterns also enable automatic and interactive tools that make
it
easier to work with or generate DITA grammars. For example, Jang Graat has implemented
an
interactive tool for defining new constraint modules that generates the RNG for the
constraint, from which the DTD or (with limits) the XSD version can then be generated.
Finally, the organizational patterns for DITA grammars end up providing a general
pattern for how DITA grammars are packaged with entity resolution catalogs for use
with
tools, as implemented by the open-source DITA Open Toolkit. DITA document type shells
and
grammars can be packaged into Open Toolkit plugins which Open Toolkit can then automatically
combine with other document-type-providing plugins in the context of a single master
entity
resolution catalog. Because Open Toolkit is both cross-platform and open-source, anyone
or
any tool can use it, effectively providing a de-facto standard for packaging and use
of DITA
grammars.
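A plugin's catalog might look like the following sketch. The identifiers are the ones declared by the "DocBook paragraph" domain module shown later in this section; the directory layout and file names on the right-hand side are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- catalog.xml: maps module identifiers to files inside the plugin -->
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
  <!-- RELAX NG module, referenced by URI from include/@href -->
  <uri name="urn:pubid:rng:dbParaDomain" uri="rng/dbParaDomainMod.rng"/>
  <!-- DTD element and entity declaration files, referenced by public ID -->
  <public publicId="urn:pubid:dtd:elements:dbParaDomain" uri="dtd/dbParaDomain.mod"/>
  <public publicId="urn:pubid:dtd:entities:dbParaDomain" uri="dtd/dbParaDomain.ent"/>
</catalog>
```

Because every plugin contributes its catalog in the same way, a document type shell in one plugin can refer to a module in another plugin purely by identifier, without knowing where that plugin is installed.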
DITA Customization How To
This section demonstrates how to:
-
Remove an element
-
Add a new inline element
-
Add a new block element
-
Constrain an attribute value or the data type of an element
-
Constrain the content model of a block element
-
Define a new top-level document type
Remove An Element
Removing an element in DITA means disallowing the element from being used in any
context. In DITA terms this is a constraint. The details of how the constraint is
implemented depend on whether the disallowed element is a base element and, if it is,
whether it has associated domain-provided specializations. For specialized elements the details
of the
constraint depend on whether or not the element is defined in a domain or in a topic
or map
type.
For domain-provided specializations, disallowing the element means omitting it from
the
domain-defined domain integration pattern or parameter entity.
For example, the DITA "highlight" domain provides the <b> and <i> elements. You
want to disallow these two elements (but allow other elements from the domain, such
as
<u> and <line-through>).
For RELAX NG you override the domain-defined pattern that adds the domain-provided
elements to the element-type-name pattern for the base type (<ph> in this
case):
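<grammar xmlns="http://relaxng.org/ns/structure/1.0"
xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
xmlns:dita="http://dita.oasis-open.org/architecture/2005/">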
<div>
<a:documentation>INCLUDE MODULES</a:documentation>
<include href="urn:oasis:names:tc:dita:rng:topicMod.rng">
<define name="topic-info-types">
<ref name="topic.element"/>
</define>
</include>
...
<include href="urn:oasis:names:tc:dita:rng:highlightDomain.rng">
<define name="hi-d-ph">
<choice>
<!-- Omit b and i:
<ref name="b.element"/>
<ref name="i.element"/>
-->
<ref name="line-through.element" dita:since="1.3"/>
<ref name="overline.element" dita:since="1.3"/>
<ref name="sup.element"/>
<ref name="sub.element"/>
<ref name="tt.element"/>
<ref name="u.element"/>
</choice>
</define>
</include>
...
</div>
</grammar>
Within the highlightDomain module this declaration adds the domain-contributed
specializations of <ph> to the base <ph> element-type-name
pattern:
<define name="ph" combine="choice">
<ref name="hi-d-ph"/>
</define>
Redefining the "hi-d-ph" pattern removes <b> and <i> from every content model that
would otherwise have allowed them through its reference to the "ph" pattern.
For DTDs, the same constraint is implemented by simply replacing the reference to
%hi-d-ph; in the domain-integration parameter entity with the list of element types
from the
highlight domain to be
included:
<!-- ============================================================= -->
<!-- DOMAIN EXTENSIONS -->
<!-- ============================================================= -->
<!-- Omit b and i: -->
<!ENTITY % ph "ph |
line-through |
overline |
sup |
sub |
tt |
u
">
To disallow a base element type for which there are domain-provided specializations,
you simply remove the element type from the element-type-name pattern
(RNG) or domain integration parameter entity (DTD).
For RNG you can override the element-type-name pattern to use <notAllowed>:
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
...
<div>
<a:documentation>INCLUDE MODULES</a:documentation>
<include href="../../base/rng/topicMod.rng">
<define name="p">
<notAllowed/>
</define>
<define name="topic-info-types">
<ref name="topic.element"/>
</define>
</include>
<include href="dbParaDomainMod.rng"/>
...
</div>
</grammar>
For DTD you simply omit the element type from the domain integration parameter
entity:
<!-- ============================================================= -->
<!-- DOMAIN EXTENSIONS -->
<!-- ============================================================= -->
<!-- Omit p: -->
<!ENTITY % p "%dbPara-d-p;"
>
If the base element to be disallowed does not have any domain-provided specializations
then for DTDs you cannot simply set the domain integration parameter entity to ""
because
that will result in invalid content models anywhere the parameter entity is
referenced.
Thus, for DTDs you must override the declaration of any parameter entity that references
the element's domain integration parameter entity to omit the reference to it and
the
connectors associated with it. Fortunately, this can be done automatically when generating
the DTD modules from the RELAX NG modules.
To disallow elements defined in map or topic modules, you simply override the content
model patterns or parameter entities that include the element to be disallowed.
For example, to disallow the base element <section> from generic topics, you would
define a constraint module like
so:
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
<a:documentation>Constraint on generic topic to disallow use of sections within body</a:documentation>
<include href="urn:oasis:names:tc:dita:rng:topicMod.rng">
<define name="body.content">
<zeroOrMore>
<choice>
<ref name="body.cnt"/>
<ref name="bodydiv"/>
<ref name="example"/>
<!-- Disallow section
<ref name="section"/>
-->
</choice>
</zeroOrMore>
</define>
</include>
</grammar>
In a document type shell the constraint module is referenced in place of the reference
to the constrained
module:
<grammar xmlns="http://relaxng.org/ns/structure/1.0" ...>
<div>
<a:documentation>INCLUDE MODULES</a:documentation>
<include href="topicBodyNoSectionConstraintMod.rng">
<define name="topic-info-types">
<ref name="topic.element"/>
</define>
</include>
...
</div>
...
</grammar>
The DTD equivalent uses a constraint module that overrides the declaration of
%body.content; to omit
section:
<!-- Constraint to disallow section within body -->
<!ENTITY % body.content
"(%body.cnt; |
%bodydiv; |
%example;)*"
>
This constraint module is then included in the document type shell before the reference
to the base topic.mod
file:
...
<!-- ============================================================= -->
<!-- CONTENT CONSTRAINT INTEGRATION -->
<!-- ============================================================= -->
<!ENTITY % topicBodyNoSection SYSTEM "topicBodyNoSectionConstraint.mod"
>%topicBodyNoSection;
<!-- ============================================================= -->
<!-- TOPIC ELEMENT INTEGRATION -->
<!-- ============================================================= -->
<!ENTITY % topic-type
PUBLIC "-//OASIS//ELEMENTS DITA 1.3 Topic//EN"
"../../base/dtd/topic.mod"
>%topic-type;
...
Add a New Inline or Block Element
In DITA adding a new element that is not itself a new topic or map type and is not
specific to a new topic or map type means defining a new domain module that provides
the
element type. The domain is then integrated into document type shells to make the
element
available wherever its base element is allowed. Constraints can be used to allow the
element
only in specific contexts or to disallow it from specific contexts.
Using the "DocBook paragraph" domain as an example, the RELAX NG domain module would
be:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
xmlns:dita="http://dita.oasis-open.org/architecture/2005/"
xmlns="http://relaxng.org/ns/structure/1.0">
<moduleDesc xmlns="http://dita.oasis-open.org/architecture/2005/">
<moduleTitle>DocBook para Domain</moduleTitle>
<headerComment>
Provides a specialization of &lt;p&gt;, &lt;para&gt;, mirroring
the DocBook element type for paragraphs.
</headerComment>
<moduleMetadata>
<moduleType>elementdomain</moduleType>
<moduleShortName>dbPara-d</moduleShortName>
<modulePublicIds>
<dtdMod>urn:pubid:dtd:elements:dbParaDomain</dtdMod>
<dtdEnt>urn:pubid:dtd:entities:dbParaDomain</dtdEnt>
<xsdMod>urn:pubid:xsd:dbParaDomain</xsdMod>
<rncMod>urn:pubid:rnc:dbParaDomain</rncMod>
<rngMod>urn:pubid:rng:dbParaDomain</rngMod>
</modulePublicIds>
<domainsContribution>(topic dbPara-d)</domainsContribution>
</moduleMetadata>
</moduleDesc>
<div>
<a:documentation>DOMAIN EXTENSION PATTERNS</a:documentation>
<define name="dbPara-d-p">
<choice>
<ref name="para.element"/>
</choice>
</define>
<define name="p" combine="choice">
<ref name="dbPara-d-p"/>
</define>
</div>
<div>
<a:documentation>ELEMENT TYPE NAME PATTERNS</a:documentation>
<define name="para">
<ref name="para.element"/>
</define>
</div>
<div>
<a:documentation>ELEMENT TYPE DECLARATIONS</a:documentation>
<div>
<a:documentation>LONG NAME: Para</a:documentation>
<define name="para.content">
<zeroOrMore>
<ref name="para.cnt"/>
</zeroOrMore>
</define>
<define name="para.attributes">
<ref name="univ-atts"/>
<optional>
<attribute name="outputclass"/>
</optional>
</define>
<define name="para.element">
<element name="para" dita:longName="Paragraph">
<a:documentation>DocBook-style paragraph</a:documentation>
<ref name="para.attlist"/>
<ref name="para.content"/>
</element>
</define>
<define name="para.attlist" combine="interleave">
<ref name="para.attributes"/>
</define>
</div>
</div>
<div>
<a:documentation>SPECIALIZATION ATTRIBUTE DECLARATIONS</a:documentation>
<define name="para.attlist" combine="interleave">
<ref name="global-atts"/>
<optional>
<attribute name="class" a:defaultValue="+ topic/p dbPara-d/para "/>
</optional>
</define>
</div>
</grammar>
The domain module is then simply included into any document type shell that wants
to
allow
it:
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
xmlns:dita="http://dita.oasis-open.org/architecture/2005/"
xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
xmlns:svg="http://www.w3.org/2000/svg">
...
<div>
<a:documentation>INCLUDE MODULES</a:documentation>
<include href="urn:oasis:names:tc:dita:rng:topicMod.rng">
<define name="topic-info-types">
<ref name="topic.element"/>
</define>
</include>
<include href="dbParaDomainMod.rng"/>
...
</div>
</grammar>
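With the domain integrated, a topic governed by that shell can use <para> anywhere <p> is allowed. The following instance is a small sketch; the topic ID and text are invented, and the explicit @class on the second <para> is shown only to make visible the value that the grammar normally supplies as a default.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<topic id="para-example">
  <title>Using the para specialization</title>
  <body>
    <p>A base paragraph.</p>
    <!-- Specialized paragraph: valid wherever p is valid -->
    <para>A DocBook-style paragraph.</para>
    <!-- The same element as a specialization-aware processor sees it,
         with the defaulted class attribute written out explicitly -->
    <para class="+ topic/p dbPara-d/para ">Another DocBook-style paragraph.</para>
  </body>
</topic>
```

A processor that knows nothing about the dbPara-d domain can still handle both <para> elements as <p>, because " topic/p " appears in the class value.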
Constrain an Attribute Value or Element Data Type
Because DITA is limited to DTD features in the TC-defined grammars, the DITA standard
does not define any element data types. If you are using RNG or XSD as your working
grammar
syntax you could of course add element data type constraints by adding a constraint
module
that uses RNG lexical patterns or XSD data type constraints.
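As a sketch of what such an RNG-only constraint could look like, the module below limits the content of the bookmap <year> element to a valid XSD gYear value. It assumes that the bookmap module is resolvable under the URN shown and that it defines a "year.content" pattern following the standard coding conventions; both should be checked against the grammar files actually in use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
    xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <a:documentation>
    Constraint: content of year must be a valid xsd:gYear
  </a:documentation>
  <!-- Assumed URN for the bookmap vocabulary module -->
  <include href="urn:oasis:names:tc:dita:rng:bookmapMod.rng">
    <!-- Replace the mixed-content model with a data type -->
    <define name="year.content">
      <data type="gYear"/>
    </define>
  </include>
</grammar>
```

A constraint like this has no DTD equivalent, so documents validated against a DTD generated from the same modules would not be checked for it.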
For attributes, constraining a value is a matter of overriding the declaration of
the
attribute in a constraint module.
The DITA grammar coding conventions do not provide general parameterization of
individual attribute declarations, so constraining an individual attribute requires
overriding the pattern or parameter entity that provides the attribute declaration.
If the attribute is a common attribute used by multiple element types with the same
base
definition it will normally be in a pattern with related attributes, for example,
the
"display-atts"
pattern:
<div>
<a:documentation>COMMON ATTRIBUTE SETS</a:documentation>
<define name="display-atts">
<optional>
<attribute name="scale">
<choice>
<value>50</value>
<value>60</value>
<value>70</value>
<value>80</value>
<value>90</value>
<value>100</value>
<value>110</value>
<value>120</value>
<value>140</value>
<value>160</value>
<value>180</value>
<value>200</value>
<value>-dita-use-conref-target</value>
</choice>
</attribute>
</optional>
<optional>
<attribute name="frame">
<choice>
<value>all</value>
<value>bottom</value>
<value>none</value>
<value>sides</value>
<value>top</value>
<value>topbot</value>
<value>-dita-use-conref-target</value>
</choice>
</attribute>
</optional>
<optional>
<attribute name="expanse">
<choice>
<value>column</value>
<value>page</value>
<value>spread</value>
<value>textline</value>
<value>-dita-use-conref-target</value>
</choice>
</attribute>
</optional>
</define>
To constrain the @expanse attribute to just the values "column" and "page" you would
define a constraint module that has a copy of the display-atts pattern with the modified
definition of
@expanse:
<grammar xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
xmlns:dita="http://dita.oasis-open.org/architecture/2005/"
xmlns="http://relaxng.org/ns/structure/1.0">
<a:documentation>
Limits @expanse attribute to page and column
</a:documentation>
<include href="urn:oasis:names:tc:dita:rng:topicMod.rng">
<define name="display-atts">
<optional>
<attribute name="scale">
<choice>
<value>50</value>
<value>60</value>
<value>70</value>
<value>80</value>
<value>90</value>
<value>100</value>
<value>110</value>
<value>120</value>
<value>140</value>
<value>160</value>
<value>180</value>
<value>200</value>
<value>-dita-use-conref-target</value>
</choice>
</attribute>
</optional>
<optional>
<attribute name="frame">
<choice>
<value>all</value>
<value>bottom</value>
<value>none</value>
<value>sides</value>
<value>top</value>
<value>topbot</value>
<value>-dita-use-conref-target</value>
</choice>
</attribute>
</optional>
<optional>
<attribute name="expanse">
<choice>
<value>column</value>
<value>page</value>
<!-- Omit spread and textline -->
<value>-dita-use-conref-target</value>
</choice>
</attribute>
</optional>
</define>
</include>
</grammar>
If an attribute only occurs on a single element type or has a unique declaration for
a
given element type, then you would override the element type's *.attributes pattern.
For example, the @outputclass attribute is available on almost every element and is
declared as CDATA on all elements. To restrict @outputclass to specific values on, say, the
<keyword> element, you would redeclare the "keyword.attributes" pattern in a constraint
module:
<grammar xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
xmlns:dita="http://dita.oasis-open.org/architecture/2005/"
xmlns="http://relaxng.org/ns/structure/1.0">
<a:documentation>
Define specific values for @outputclass on keyword.
</a:documentation>
<include href="urn:oasis:names:tc:dita:rng:topicMod.rng">
<define name="keyword.attributes">
<optional>
<attribute name="keyref"/>
</optional>
<ref name="univ-atts"/>
<optional>
<attribute name="outputclass">
<choice>
<value>class1</value>
<value>class2</value>
<value>class3</value>
</choice>
</attribute>
</optional>
</define>
</include>
</grammar>
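The effect of this constraint on instances can be illustrated with a short fragment. The topic content is invented for the example; only the three @outputclass values come from the constraint module above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<topic id="keyword-example">
  <title>Constrained outputclass on keyword</title>
  <body>
    <!-- Valid: class1 is one of the three enumerated values -->
    <p>Press <keyword outputclass="class1">Enter</keyword> to continue.</p>
    <!-- Valid: outputclass is still optional -->
    <p>Press <keyword>Esc</keyword> to cancel.</p>
    <!-- Would be invalid under the constraint:
         <keyword outputclass="shortcut">Enter</keyword> -->
  </body>
</topic>
```

Other element types keep the unconstrained declaration of @outputclass, since only the "keyword.attributes" pattern was overridden.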
Constrain the Content Model of a Block Element
Every element type has a *.content pattern that defines the content model for that
element type. Thus constraining the content model for any element is a matter of redefining
the *.content pattern in a constraint module. The coding pattern is the same for all
element
types.
For example, the base definition of the <fig> content model
is:
<define name="fig.content">
<optional>
<ref name="title"/>
</optional>
<optional>
<ref name="desc"/>
</optional>
<zeroOrMore>
<choice>
<ref name="figgroup"/>
<ref name="fig.cnt"/>
</choice>
</zeroOrMore>
</define>
A constraint module that makes <title> and <desc> required
is:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
xmlns:dita="http://dita.oasis-open.org/architecture/2005/"
xmlns="http://relaxng.org/ns/structure/1.0">
<a:documentation>
Require title and desc for figure
</a:documentation>
<include href="urn:oasis:names:tc:dita:rng:topicMod.rng">
<define name="fig.content">
<!-- Require title and desc -->
<ref name="title"/>
<ref name="desc"/>
<zeroOrMore>
<choice>
<ref name="figgroup"/>
<ref name="fig.cnt"/>
</choice>
</zeroOrMore>
</define>
</include>
</grammar>
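Under this constraint a figure must start with both a title and a description, in that order. The fragment below is a minimal sketch of a conforming figure; the ID, text, and image file name are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<topic id="fig-example">
  <title>Figure with required title and desc</title>
  <body>
    <!-- Valid: title and desc are both present -->
    <fig id="fig-architecture">
      <title>System architecture</title>
      <desc>Block diagram of the main components.</desc>
      <image href="architecture.png"/>
    </fig>
    <!-- Would be invalid under the constraint, because desc is missing:
         <fig><title>System architecture</title><image href="architecture.png"/></fig> -->
  </body>
</topic>
```

Every document that is valid under the constrained grammar is also valid under the unconstrained one, which is what keeps constrained documents interchangeable with other DITA documents.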
Define a New Top-Level Document Type
In DITA a new top-level document type can mean either a new DITA document type, meaning
a new combination of existing modules, or a new specialized map or topic type intended
to be
used as a root element.
Defining a new document type shell is a matter of creating references to the appropriate
modules and including any shell-defined constraints.
For a new map or topic type specialization, the minimum is a copy of the appropriate
base map or topic type's declaration module (RNG) or modules (DTD) with the base map
or
topic element type name changed to the specialized name. For example, to define a
new topic
type "chapter" that is otherwise identical to the base <topic> topic type, you would
simply copy the topicMod.rng file to a new file, e.g., chapterMod.rng, update all
declarations that refer to the element type "topic" to refer instead to the topic
type
"chapter", and remove the declarations of all other element
types:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="urn:oasis:names:tc:dita:rng:vocabularyModuleDesc.rng"
schematypens="http://relaxng.org/ns/structure/1.0"?>
<grammar xmlns:dita="http://dita.oasis-open.org/architecture/2005/"
xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
xmlns="http://relaxng.org/ns/structure/1.0"
datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
<moduleDesc xmlns="http://dita.oasis-open.org/architecture/2005/">
<moduleTitle>Chapter Topic Type</moduleTitle>
<headerComment>Represents a chapter within a publication</headerComment>
<moduleMetadata>
<moduleType>topic</moduleType>
<moduleShortName>chapter</moduleShortName>
<modulePublicIds>
<dtdEnt></dtdEnt>
<dtdMod></dtdMod>
<xsdMod></xsdMod>
<xsdGrp></xsdGrp>
<rncMod></rncMod>
<rngMod></rngMod>
</modulePublicIds>
</moduleMetadata>
</moduleDesc>
<div>
<a:documentation>ARCHITECTURE ATTRIBUTES</a:documentation>
<define name="arch-atts">
<optional>
<attribute name="dita:DITAArchVersion" a:defaultValue="1.3"/>
</optional>
</define>
</div>
<div>
<a:documentation>INFO TYPES PATTERNS</a:documentation>
<define name="chapter-info-types">
<ref name="info-types"/>
</define>
<define name="info-types">
<ref name="topic.element"/>
</define>
</div>
<div>
<a:documentation>ELEMENT TYPE NAME PATTERNS</a:documentation>
</div>
<div>
<a:documentation>ELEMENT TYPE DECLARATIONS</a:documentation>
<div>
<a:documentation>LONG NAME: Chapter</a:documentation>
<define name="chapter.content">
<ref name="title"/>
<optional>
<ref name="titlealts"/>
</optional>
<optional>
<choice>
<ref name="shortdesc"/>
<ref name="abstract"/>
</choice>
</optional>
<optional>
<ref name="prolog"/>
</optional>
<optional>
<ref name="body"/>
</optional>
<optional>
<ref name="related-links"/>
</optional>
<zeroOrMore>
<ref name="chapter-info-types"/>
</zeroOrMore>
</define>
<define name="chapter.attributes">
<attribute name="id">
<data type="ID"/>
</attribute>
<ref name="conref-atts"/>
<ref name="select-atts"/>
<ref name="localization-atts"/>
<optional>
<attribute name="outputclass"/>
</optional>
</define>
<define name="chapter.element">
<element name="chapter" dita:longName="Chapter">
<a:documentation>The &lt;chapter&gt; element represents a chapter within a publication</a:documentation>
<ref name="chapter.attlist"/>
<ref name="chapter.content"/>
</element>
</define>
<define name="chapter.attlist" combine="interleave">
<ref name="chapter.attributes"/>
<ref name="arch-atts"/>
<ref name="domains-att"/>
</define>
<define name="idElements" combine="choice">
<ref name="chapter.element"/>
</define>
</div>
</div>
<div>
<a:documentation>SPECIALIZATION ATTRIBUTES</a:documentation>
<define name="chapter.attlist" combine="interleave">
<ref name="global-atts"/>
<optional>
<attribute name="class" a:defaultValue="- topic/topic chapter/chapter "/>
</optional>
</define>
</div>
</grammar>
When defining a new top-level topic type you would normally also define at least one
document type shell for
it:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="urn:oasis:names:tc:dita:rng:checkShell.sch" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<?xml-model href="urn:oasis:names:tc:dita:rng:vocabularyModuleDesc.rng"
schematypens="http://relaxng.org/ns/structure/1.0"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0" xmlns:dita="http://dita.oasis-open.org/architecture/2005/" xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
<moduleDesc xmlns="http://dita.oasis-open.org/architecture/2005/">
<moduleTitle>Chapter Topic Type Shell</moduleTitle>
<headerComment xml:space="preserve">
Shell for chapter topics
</headerComment>
<moduleMetadata>
<moduleType>topicshell</moduleType>
<moduleShortName>chapter</moduleShortName>
<shellPublicIds>
<dtdShell>urn:pubid:example.org:dita:dtd:chapter.dtd<var presep=":" name="ditaver"/></dtdShell>
<rncShell>urn:pubid:example.org:dita:rnc:chapter.rnc<var presep=":" name="ditaver"/></rncShell>
<rngShell>urn:pubid:example.org:dita:rng:chapter.rng<var presep=":" name="ditaver"/></rngShell>
<xsdShell>urn:pubid:example.org:dita:xsd:chapter.xsd<var presep=":" name="ditaver"/></xsdShell>
</shellPublicIds>
</moduleMetadata>
</moduleDesc>
<div>
<a:documentation>ROOT ELEMENT DECLARATION</a:documentation>
<start>
<ref name="chapter.element"/>
</start>
</div>
<div>
<a:documentation>DOMAINS ATTRIBUTE</a:documentation>
<define name="domains-att" combine="interleave">
<optional>
<attribute name="domains"
a:defaultValue="(topic abbrev-d)
(topic chapter)
(topic equation-d)
(topic hazard-d)
(topic hi-d)
(topic indexing-d)
(topic markup-d xml-d)
(topic markup-d)
(topic mathml-d)
(topic pr-d)
(topic relmgmt-d)
(topic svg-d)
(topic sw-d)
(topic ui-d)
(topic ut-d)
(topic xnal-d)
a(props deliveryTarget)"
/>
</optional>
</define>
</div>
<div>
<a:documentation>MODULE INCLUSIONS</a:documentation>
<include href="urn:oasis:names:tc:dita:rng:topicMod.rng"/>
<include href="chapterMod.rng"/>
<include href="urn:oasis:names:tc:dita:rng:abbreviateDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:deliveryTargetAttDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:equationDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:hazardDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:highlightDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:indexingDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:markupDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:mathmlDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:programmingDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:releaseManagementDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:softwareDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:svgDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:uiDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:utilitiesDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:xmlDomain.rng"/>
<include href="urn:oasis:names:tc:dita:rng:xnalDomain.rng"/>
</div>
<div>
<a:documentation>ID-DEFINING-ELEMENT OVERRIDES</a:documentation>
<define name="any">
<zeroOrMore>
<choice>
<ref name="idElements"/>
<element>
<anyName>
<except>
<name>chapter</name>
<name>topic</name>
<nsName ns="http://www.w3.org/2000/svg"/>
<nsName ns="http://www.w3.org/1998/Math/MathML"/>
</except>
</anyName>
<zeroOrMore>
<attribute>
<anyName/>
</attribute>
</zeroOrMore>
<ref name="any"/>
</element>
<text/>
</choice>
</zeroOrMore>
</define>
</div>
</grammar>
Creating this shell is largely an exercise in cut and paste.
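A document that uses the new shell might look like the following sketch. It assumes the shell above has been saved as "chapter.rng" next to the document (a hypothetical file name) and that the editor or validator honors the xml-model processing instruction; the IDs and text are invented.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="chapter.rng" schematypens="http://relaxng.org/ns/structure/1.0"?>
<chapter id="installing">
  <title>Installing the Product</title>
  <shortdesc>How to install and verify the product.</shortdesc>
  <body>
    <p>Complete the steps in each subordinate topic in order.</p>
  </body>
  <!-- Nested topics are allowed through the info-types pattern -->
  <topic id="prerequisites">
    <title>Prerequisites</title>
    <body>
      <p>Check the supported platforms before you begin.</p>
    </body>
  </topic>
</chapter>
```

Because <chapter> is a specialization of <topic>, any specialization-aware processor can treat this document as an ordinary topic even if it has no chapter-specific support.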