Flynn, Peter. “Leveraging markup to process narrative recipes.” Presented at Balisage: The Markup Conference 2025, Washington, DC, August 4 - 8, 2025. In Proceedings of Balisage: The Markup Conference 2025. Balisage Series on Markup Technologies, vol. 30 (2025). https://doi.org/10.4242/BalisageVol30.Flynn01.
Balisage: The Markup Conference 2025 August 4 - 8, 2025
Balisage Paper: Leveraging markup to process narrative recipes
Peter Flynn
Peter Flynn managed the Academic and Collaborative
Technologies Group in IT Services at University College
Cork, Ireland until his retirement in 2018. He trained at
the London College of Printing and did his MA in
computerized planning systems at Central London Polytechnic
(now the University of Westminster). He worked in the UK for
the Printing and Publishing Industry Training Board as a DP
Manager and for United Information Services of Kansas as IT
consultant before joining UCC as Project Manager for
academic and research computing. In 1990 he installed
Ireland’s first Web server and concentrated on academic and
research publishing support. He has been Secretary of the
TeX Users Group, Deputy Director for Ireland of EARN, and a
member both of the IETF Working Group on HTML and of the W3C
XML SIG; and he has published books on HTML, SGML/XML, and LaTeX. Peter
also runs the markup and typesetting consultancy Silmaril,
and is editor of the XML FAQ as well as an irregular
contributor to conferences and journals in electronic
publishing, markup, and Humanities computing, and has been a
regular speaker and session chair at the XML Summer School
in Oxford. He completed a PhD in 2015 on User
Interfaces to Structured Documents with the Human
Factors Research Group in Applied Psychology in UCC. He
maintains a fairly random semi-technical blog at https://blogs.silmaril.ie/peter.
In earlier papers [Flynn 2020a, Flynn 2021] I described the implementation of
‘℞’
(pronounced /ˈɹɛs.ɪ.pē/ ‘reSEEpay’), an
XML/XSLT system for checking and reproducing structured
cookery recipes. Since then, work has been ongoing:
a) to refine the categories used in the metadata; b) to improve the implementation in CSS; and c) to extend the markup to narrative recipes encoded in other schemas.
This paper describes the third of these, with a test
implementation in TEI. The objective was to see to what extent
the encoding could be implemented for a narrative recipe,
while preserving the original [narrative] form of the text
encoded as-is in TEI, rather than force it into the modern
layout.
When is a narrative not a story? When it’s a recipe!
Writing recipes with distinct sections listing the ingredients
and preparation steps is a 19th century invention.
Historically, recipes were just narrative prose. Can we
leverage the markup and style of modern structured recipes to
represent narrative recipes in a modern style?
Note: Acknowledgements
My thanks go again to all my friends and colleagues in
kitchens, laboratories, and libraries, especially on BlueSky and
Discord, for their help and suggestions, with a special mention
to Beatrix Färber, Editor of the CELT Project; to culinary historian
Regina Sexton of the Department of
Adult Continuing Education, UCC; and to the late and
very much missed Michael Sperberg-McQueen, whose valuable
knowledge, especially of Middle German, was key to the
adaptation of the recipe discussed.
Structure and narrative
Modern recipes, in print as on the web, contain two major
components: a list of ingredients and a method. These were
discussed extensively in the earlier papers on this topic
[Flynn 2020a, Flynn 2021]. They exist in one
form or another in every layout, design, schema, and handwritten
note for modern recipes, and a recipe today would be considered
virtually useless without both of them.
Figure 1: Simple recipes with list of ingredients and method in the
modern style
<ol>
<li>Mix ingredients</li>
<li>Boil to 235°F</li>
<li>Stir to cool to 140°</li>
<li>Beat and pour into tin</li>
</ol>
Mix ingredients
Boil to 235°F
Stir to cool to 140°
Beat and pour into tin
Simplified recipe for fudge used as a markup example
(Ingredients and Method) adapted from [Flynn 2020b] with the
MS from a handwritten recipe book used as a source.
Before the expansion of the cookery book from niche
publication to mass market during the second half of the 19th
century, recipes were narrative [Scully 2002a, Rumminger 2018]. The Roman cookery
writer Apicius used narrative, probably because the form was
already ancient and the norm for descriptive writers, which
continued for many centuries. The arrival of printing just made
reproduction easier, so Marx Rumpolt, Hannah Glasse, and
Marie-Antoine Carême felt no need to change the tradition of
writing in narrative form [Figure 2]. With ingredients and method intertwined
in narrative, it could be argued that the text itself
is the method.
Figure 2: Narrative recipe examples with ingredients and method
intertwined
Three examples of narrative recipes: the Rumpolt recipe
for liver used in this paper, and two much later ones for
turkey.
[Left] Gebratene Leber (№.5)
from Marx Rumpolt’s Ein new
Kochbuch of 1581, p.76
[Rumpolt 1581];
[Center] A Turkey, &c. in
Jelly from Hannah Glasse’s The
Art Of Cookery of 1753, p.258
[Glasse 1753];
[Right] Galantine of Turkey in
Aspic, from Marie-Antoine Carême’s
Le cuisinier parisien of 1828,
p.145 [Carême 1828].
The early cookery authors like Rumpolt were writing for
professional cooks with large kitchens and kitchen staff in
wealthy houses: the cooks were assumed to be able to read, or
to have access to someone who could, and to know enough
to decide on materials and quantities from experience. Like the
modern Répertoire de la Cuisine for
French cooking [Gringoire 1914], recipes
were terse because that was all that was needed.
Hannah Glasse was an exception: she explicitly
addressed the non-professional cook, and it is worth quoting her
intent:
If I have not wrote in the high polite ſtyle, I
hope I ſhall be forgiven; for my intention is to inſtruct the
lower ſort and therefore muſt treat them in their own way.
For example: when I bid them lard a fowl, if I ſhould bid
them lard with large lardoons, they would not know what I
meant; but when I ſay they muſt lard with little pieces of
bacon, they know what I mean. So, in many other things in
Cookery, the great cooks have ſuch a high way of expreſſing
themſelves, that the poor girls are at a loſs to know what
they mean:[…][Glasse 1753]
This all changed rapidly after the middle of the 19th
century: Eliza Acton (1845) is usually credited with inventing
the separation of Ingredients from Method, and her lead was
followed by others including Isabella Beeton (1861) in England
and Fanny Farmer (1896) in the USA; although Mrs Beeton
notoriously gave ingredients but no quantities.
Implementation
An encounter with a German colleague with a copy of
Rumpolt’s recipes from the 16th century resulted in a decision
to try the ℞ techniques with narrative recipes. The recipe
marked No.5 in [Figure 2] from the book by
Rumpolt [Rumpolt 1581] was chosen because it
seemed to have the qualities needed to test the system: a
narrative for a simple dish, containing at least one mention of
all the ingredients, in order, with details of what to do with
each. No quantities are given apart from the bread, but they can
be deduced according to the number of people eating. The
original was scanned and a transcription composed and proofread.
OCR was attempted with tesseract but
was not successful; Transkribus was
slightly better on the first few lines, but failed to complete
the text.
The ℞ software was written with the modern structured recipe
in mind. Development is done in the custom
recipe DTD used for decades for the author’s collection of
recipes, partly for historical convenience and practical
simplicity, and partly to avoid the collection being placed in
jeopardy by well-intentioned but uncontrollable third-party
changes. To the best of the author’s knowledge there is no
publicly-accepted schema or DTD for recipes in widespread use:
many writers and publishers have invented their own, the present
author included [Maler 1996, Flynn 1998]. As explained in the earlier
papers, however, it was felt to be important to ensure that the
key metadata was held only in attributes, so that ℞
functionality could easily be added to any schema or DTD of a
user’s choosing.
The Text Encoding Initiative (TEI: [Sperberg-McQueen 2002b] and [Burnard 2007]) provides a large and adaptable suite
of markup intended for the encoding and preservation of
historical documents in the Humanities. For a user, researcher,
or publisher wanting to encode a historical recipe in XML, the
TEI is an obvious choice instead of a proprietary DTD or a
bespoke schema, as TEI has been in widespread use in the markup
of historical documents for decades, and also has publisher and
software support. For the testing phase, the XML encoding was
applied in TEI P5 [Figure 3] by writing a shim to the
tei_all.dtd, because the attribute
declarations could simply be copied and pasted from the author’s
development environment. The arguments for and against the use
of DTDs vs schemas have been rehearsed too
many times to need reproducing here, but the choice in this case
was purely pragmatic: simplicity of editing, speed of
development, and the fact that many publishers still depend on
DTDs rather than W3C or RELAX NG schemas. The TEI-conformant practice is to
generate a custom schema using ODD [TEI Consortium 2004], and this would remain the target for a
stable interchange implementation (see [section “Results and conclusions”]), but the current development environment
is experimental and a proof-of-concept within the author’s
collection, and is still subject to the constraints of that
environment.
Figure 3: Recipe encoded in TEI (before ℞ markup) with translation
<!DOCTYPE list SYSTEM "/dtds/tei/p5/tei_all.dtd">
<list>
<label>Gebratene Leber.</label>
<item>
<list rend="Method">
<item>Nim̄ die Leber</item>
<item>vnnd quell ſie in heiſſem Waſſer</item>
<item>buꜩ ſie fein ſauber auß</item>
<item>vnd ſteck ſie an ein holꜩern Spieß</item>
<item>ſampt dem Ma‧gen</item>
<item>leg ſie auff ein Roßt</item>
<item>vnd brat ſie geſchwindt hinweg.</item>
</list>
<list rend="Serving">
<item>Wenn du ſie wilt anrichten</item>
<item>ſo nim̄ gebehte Schnitten drey oder vier
in die Schüſſel</item>
<item>geuß darein ein gute Hennenbrüh</item>
<item>mit Petterſilgen Wurꜩel
geſotten.</item>
</list>
<list rend="Comment">
<item>Und wenn du haſt angerichtet</item>
<item>ſo leg ſie auff die Brüh</item>
<item>ſo iſt es gut vnd zierlich.</item>
</list>
</item>
</list>
Grilled Liver.
Take the liver
and rinse it in hot water
clean it well out
and stick it on a wooden skewer
along with the stomach¹
lay it on the griddle
and quickly grill it off.
When you are ready to dress it
lay 3–4 slices of toasted bread on the plate
pour over a good chicken stock
steeped with root-parsley.
And when you have everything prepared
lay it [the liver] on the stock
so that it is good and dainty.
¹ Caul?
As explained above, the categorization and measurement
metadata required by ℞ is carried exclusively in attributes.
This means that the relevant adaptations can be applied to any
existing schema or DTD in a modification layer or shim without
the need to rewrite any of the element type structure. This
approach does, however, mean that one element type valid in
mixed content must be otherwise unused in the encoding of this
class of [recipe] document. Such an element type is therefore
available for ‘hijacking’ (use for
unintended purposes) to carry the ingredient metadata; and
possibly another element type to carry the method metadata. In
the case of narrative recipes, as noted earlier, the whole text
may be the method.
Alternatively, if no unused element types are available,
some additional attribute could be used to flag the instances
when ingredient or method metadata is being carried on an
element type in existing use. A new element type could be used
in a new namespace, but implementors (encoders) in the target
audience (publishers of recipe books) looking to test ℞ may not
have the degree of freedom or corporate authority needed to make
such modifications. If one or more of the ℞ attribute names
is already in use on a candidate element type in a
target schema, some renaming in the code would be needed. A
future version should probably armour the names or use
parameterization or indirection. This is experimental software:
permanent use would require more fundamental changes to be
undertaken in formal procedures.
In the case of TEI, for the purposes of testing, the
material element type was chosen for the
ingredients. The attributes defined in ℞ were added to the
tei_all.dtd via the shim [Figure 5], using
the same enumerated token list files of categories
as the development system explained in [Flynn 2020a]. The TEI
Header is omitted in [Figure 4] for
brevity.
Figure 4: Recipe with ingredients marked in the
material element type
<!DOCTYPE list SYSTEM "recipe-tei.dtd">
<list>
<label n="5">Gebratene Leber.</label>
<item>
<list rend="Method">
<item>Nim̄ die <material quantity="500" unit="g" meat="veal"
part="liver" xml:id="liver">Leber</material></item>
<item>vnnd quell ſie in <material quantity="1" unit="ℓ"
basic="water" treatment="hand-hot"
xml:id="water">heiſſem Waſſer</material></item>
<item>buꜩ ſie fein ſauber auß</item>
<item>vnd ſteck ſie an ein holꜩern Spieß</item>
<item>ſampt dem <material quantity="1" meat="veal" part="stomach"
xml:id="magen">Ma‧gen</material></item>
<item>leg ſie auff ein Roßt</item>
<item>vnd brat ſie geſchwindt hinweg.</item>
</list>
<list rend="Serving">
<item>Wenn du ſie wilt anrichten</item>
<item>ſo nim̄ <material quantity="3–4" unit="slice" quality="white"
basic="bread" treatment="warmed" xml:id="bread">gebehte
Schnitten drey oder vier</material> in die Schüſſel</item>
<item>geuß darein ein gute <material quantity="150" unit="ml"
meat="chicken" part="stock"
xml:id="stock">Hennenbrüh</material></item>
<item>mit <material quantity="25" unit="g"
vegetable="root-parsley" treatment="grated and cooked in
the stock" xml:id="parsley">Petterſilgen Wurꜩel</material>
geſotten.</item>
</list>
<list rend="Comment">
<item>Und wenn du haſt angerichtet</item>
<item>ſo leg ſie auff die Brüh</item>
<item>ſo iſt es gut vnd zierlich.</item>
</list>
</item>
</list>
The TEI Header has been omitted from this fragment for
space reasons.
The curly ‘ℓ’ (ell) is used as an
alternative to the normal lowercase ‘l’
for liters because the normal
‘l’ may be mistaken for a
digit 1 in some fonts.
The resulting TEI document (with some site-associated
decoration) was enabled in the author’s recipe site. The
advantage of using XML in the same format as all the other
recipes ([Figure 5]) is that the process of
checking remains the same because the same attribute names are
used.
The markup of cooking equipment was originally considered in
recipes on the author’s site, but not pursued as the recipes
were intended for the experienced cook in a well-equipped
kitchen. However, as TEI explicitly caters for descriptive
markup, it would be possible to co-opt an additional element type
to mark equipment in a similar manner to ingredients.
Figure 5: The shim implementing the attributes for the ingredient
metadata, with an example list
<!ENTITY % tei SYSTEM "/dtds/tei/p5/tei_all.dtd">
<!ENTITY % alcohol-list SYSTEM "alcohol.list">
<!ENTITY % basic-list SYSTEM "basic.list">
<!ENTITY % beans-list SYSTEM "beans.list">
<!ENTITY % dairy-list SYSTEM "dairy.list">
<!ENTITY % form-list SYSTEM "form.list">
<!ENTITY % fruit-list SYSTEM "fruits.list">
<!ENTITY % herb-list SYSTEM "herbs.list">
<!ENTITY % meat-list SYSTEM "meat.list">
<!ENTITY % nature-list SYSTEM "nature.list">
<!ENTITY % nuts-list SYSTEM "nuts.list">
<!ENTITY % seeds-list SYSTEM "seeds.list">
<!ENTITY % part-list SYSTEM "parts.list">
<!ENTITY % pasta-list SYSTEM "pasta.list">
<!ENTITY % seafood-list SYSTEM "seafood.list">
<!ENTITY % sizes-list SYSTEM "sizes.list">
<!ENTITY % spice-list SYSTEM "spices.list">
<!ENTITY % sprinkles-list SYSTEM "sprinkles.list">
<!ENTITY % vegetable-list SYSTEM "vegetables.list">
<!ENTITY % langs-list SYSTEM "langs.LIST">
<!ENTITY % package-list SYSTEM "packages.LIST">
<!ENTITY % units-list SYSTEM "units.LIST">
The TEI DTD is declared as a parameter
entity for later use. The units, weights,
and ingredient categorization token list
files are also declared as parameter entities
(right-hand column) and invoked for
each attribute (left-hand column).
The dummy termination OMIT allows each
list to be edited and sorted externally without
needing to be concerned about displacing the token delimiter (see example of seeds.list).
Finally, the TEI DTD
itself is invoked via its parameter entity.
<!ATTLIST material quantity CDATA #IMPLIED
unit (%units-list;OMIT) #IMPLIED
unit-weight CDATA #IMPLIED
container (%package-list;OMIT) #IMPLIED
size (%sizes-list;OMIT) #IMPLIED
colour CDATA #IMPLIED
quality CDATA #IMPLIED
nature (%nature-list;OMIT) #IMPLIED
meat (%meat-list;OMIT) #IMPLIED
seafood (%seafood-list;OMIT) #IMPLIED
part (%part-list;OMIT) #IMPLIED
dairy (%dairy-list;OMIT) #IMPLIED
fruit (%fruit-list;OMIT) #IMPLIED
alcohol (%alcohol-list;OMIT) #IMPLIED
herb (%herb-list;OMIT) #IMPLIED
vegetable (%vegetable-list;OMIT) #IMPLIED
bean (%beans-list;OMIT) #IMPLIED
nut (%nuts-list;OMIT) #IMPLIED
seed (%seeds-list;OMIT) #IMPLIED
pasta (%pasta-list;OMIT) #IMPLIED
spice (%spice-list;OMIT) #IMPLIED
basic (%basic-list;OMIT) #IMPLIED
sprinkles (%sprinkles-list;OMIT) #IMPLIED
form (%form-list;OMIT) #IMPLIED
prep CDATA #IMPLIED
treatment CDATA #IMPLIED
note (1|2|3) #IMPLIED
comment CDATA #IMPLIED
symbol CDATA #IMPLIED
alt CDATA #IMPLIED
calories CDATA #IMPLIED
status (optional|required) #IMPLIED>
%tei;
linseed|
pumpkin|
sesame|
sunflower|
tahini|
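The effect of the OMIT termination can be illustrated outside the DTD. The following is a minimal Python sketch, not part of ℞, that imitates the textual substitution a parser performs on `(%seeds-list;OMIT)`. The function names are hypothetical, and the list content is the seeds.list sample shown above. A DTD enumeration may not end in a trailing `|`, so every line of the list file can carry its own delimiter only if something follows the last one; the price is that OMIT itself becomes a permitted (but never used) value.

```python
import re

# Content of seeds.list as shown in the figure: every token, including the
# last one, is followed by the "|" delimiter.
SEEDS_LIST = "linseed|\npumpkin|\nsesame|\nsunflower|\ntahini|\n"


def expand_enumeration(list_text, terminator="OMIT"):
    """Imitate what the parser sees for (%seeds-list;OMIT) after expansion."""
    # White space inside an enumerated group is not significant, so it is
    # removed here purely to make the result easy to read.
    tokens = re.sub(r"\s+", "", list_text)
    return f"({tokens}{terminator})"


def allowed_tokens(list_text, terminator="OMIT"):
    """Return the tokens an encoder would really use (the dummy is dropped)."""
    group = expand_enumeration(list_text, terminator)
    return [t for t in group.strip("()").split("|") if t != terminator]


print(expand_enumeration(SEEDS_LIST))
```

Because each line is self-delimiting, the file can be re-sorted or extended with any line-based tool and the expanded group remains well-formed.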
The document was transformed to the same HTML format as all
the other recipes in the author’s collection by adapting the
existing XSLT to handle the TEI element types. This consisted
largely of adding the TEI element type names as unions to the
existing XPaths in the match attributes of the
relevant XSLT templates. The resulting web page is online at
https://xml.silmaril.ie/recipes/grilled-liver.html
and the source (which also serves as the print version, using
CSS as in [Flynn 2021]) is at https://xml.silmaril.ie/recipes/grilled-liver.xml [Figure 6].
Figure 6: The recipe in TEI displayed in a recipe web site in
display format (L) and print format (R)
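The idea of treating element types from two vocabularies as equivalent, which the XSLT does with unions in its match patterns, can be sketched in Python for readers who want to experiment without the stylesheet. This is only an analogy under stated assumptions: `ingredient` and `step` are hypothetical names standing in for the homebrew DTD (whose element type names are not given here), `material` and `item` are the TEI names used in this paper, and the HTML produced is invented for the example rather than taken from the author's site.

```python
import xml.etree.ElementTree as ET
from html import escape

# Element names treated as equivalent. "ingredient" and "step" are
# hypothetical names for the homebrew DTD; "material" and "item" are the
# TEI element types used in the paper.
INGREDIENT_TAGS = {"ingredient", "material"}
STEP_TAGS = {"step", "item"}


def render(element):
    """Render an element to an HTML string, whichever vocabulary it uses."""
    inner = escape(element.text or "") + "".join(
        render(child) + escape(child.tail or "") for child in element
    )
    if element.tag in INGREDIENT_TAGS:
        return f'<span class="ingredient">{inner}</span>'
    if element.tag in STEP_TAGS:
        return f"<li>{inner}</li>"
    return inner  # any other wrapper is passed through unchanged


tei = ET.fromstring('<item>Nim die <material meat="veal">Leber</material></item>')
print(render(tei))
```

The same rendering branch serves both vocabularies, which is why the adaptation needed little more than extra names in the existing patterns.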
On a practical note, the reuse of the metadata from all
recipes is a key feature of most collections. Extraction or
direct reference enables the classification of recipes, and the
application of the ingredient details to translation and
indexing, as well as checking for allergens and other specifics.
None of these has been implemented here yet, but [Figure 7] illustrates an example ad
hoc extraction from the command line.
Figure 7: Metadata from the recipe, showing a list of
ingredients extracted for further use
Metadata extracted with the command lxgrep material liver.xml for use in checking, indexing, and other
analyses (lxgrep is part of the
LT-XML2 utilities from the Language Technology Group in
Edinburgh).
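For readers without the LT-XML2 utilities, a comparable ad hoc extraction can be sketched with the Python standard library. This is an illustration only, not the method used for the figure: the function name is hypothetical, the sample is abridged from the marked-up recipe, and no TEI namespace is assumed because the fragment shown in this paper declares none.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Two items abridged from the marked-up recipe; no TEI namespace is assumed.
SAMPLE = """<list>
  <label n="5">Gebratene Leber.</label>
  <item>
    <list rend="Method">
      <item>Nim̄ die <material quantity="500" unit="g" meat="veal"
        part="liver" xml:id="liver">Leber</material></item>
      <item>vnnd quell ſie in <material quantity="1" unit="ℓ"
        basic="water" treatment="hand-hot"
        xml:id="water">heiſſem Waſſer</material></item>
    </list>
  </item>
</list>"""


def extract_materials(xml_text):
    """Return one dict per <material>: its attributes plus the source wording."""
    root = ET.fromstring(xml_text)
    records = []
    for el in root.iter("material"):
        record = dict(el.attrib)
        record["id"] = record.pop(XML_ID, None)
        # Collapse line breaks and indentation inside the element content.
        record["text"] = " ".join("".join(el.itertext()).split())
        records.append(record)
    return records


for rec in extract_materials(SAMPLE):
    print(rec)
```

Each record keeps the original wording alongside the categorization attributes, so the same output could feed indexing, translation, or allergen checks.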
Results and conclusions
This part of the ℞ development has been another learning
curve:
Encoding
The process of encoding was relatively
straightforward, much of which is due to the accessibility
and adaptability of the TEI, its documentation, and the
copious provision of element types. There is always a
degree of chance in selecting element types for this kind
of temporary ‘abuse’, in that
someone, somewhere, will be depending absolutely on the
unmodified existence of the very one that you have chosen
to carry your metadata. In my experience, however, it is a
rare TEI document that, from the many modules available,
uses every element type available in mixed content.
As mentioned in [section “Implementation”] above, a namespace
could be used to distinguish a new element type.
Implementation
The adaptation of the existing web site XSLT to
process a TEI document instead of the homebrew recipe
schema used for development was also unproblematic, as
both follow the traditional structured document concepts
of heading—metadata—sectioning found in other systems
[Flynn 2017].
Alternative approaches
The TEI Consortium provides ODD [TEI Consortium 2004], a language for modifying the TEI
schema (or indeed any schema), so it could be used to
create a specialist vocabulary for recipes which would
still conform to the TEI Guidelines. The current
implementation is an experiment, and pretends to no such
conformity. The ODD method should be used for a formal
publication project, for example one encoding large
numbers of recipes such as the entire Rumpolt book; and it
would also be a route to use for commercial or other more
permanent publication. The author is interested to hear of
TEI encodings of recipes as part of other projects.
Still to do
The astute reader will have noticed that the formal
list of ingredients is missing from the print version of
the test recipe in [Figure 6] (R).
The fully-formatted page in [Figure 6] (L) is generated by XSLT, so
performing a collation of the ingredients to populate the
list programmatically is possible. But the print view is
(by design, on this site) raw XML with CSS, as this was
the original requirement for the author’s collection.
Performing the CSS collation has been possible for recipes
using the regular DTD, as their ingredients occur in line with the normal
progression of the document. Doing it out-of-line (using CSS)
for a narrative recipe needs further work, which is ongoing.
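What collation means here can be shown in a few lines of Python. This is a sketch under assumptions, not the XSLT used on the site and not a solution to the CSS problem: the function names, the order in which attributes are combined, and the bracketed treatment are all invented for the example, and only a subset of the ℞ category attributes is consulted.

```python
import xml.etree.ElementTree as ET

# Attributes that say what the ingredient is, in an assumed display order
# (a subset of the attributes declared in the shim).
CATEGORY_ATTRS = ("quality", "meat", "seafood", "dairy", "fruit", "herb",
                  "vegetable", "spice", "basic", "part")

# The Serving section abridged from the marked-up recipe.
SAMPLE = """<list rend="Serving">
  <item>ſo nim̄ <material quantity="3–4" unit="slice" quality="white"
    basic="bread" treatment="warmed" xml:id="bread">gebehte
    Schnitten drey oder vier</material> in die Schüſſel</item>
  <item>geuß darein ein gute <material quantity="150" unit="ml"
    meat="chicken" part="stock"
    xml:id="stock">Hennenbrüh</material></item>
</list>"""


def ingredient_line(material):
    """Build one modern-style ingredient line from a <material> element."""
    parts = [material.get("quantity"), material.get("unit")]
    parts += [material.get(name) for name in CATEGORY_ATTRS]
    line = " ".join(p for p in parts if p)
    treatment = material.get("treatment")
    return f"{line} ({treatment})" if treatment else line


def collate_ingredients(xml_text):
    """List the ingredients in order of first mention in the narrative."""
    root = ET.fromstring(xml_text)
    return [ingredient_line(m) for m in root.iter("material")]


for line in collate_ingredients(SAMPLE):
    print(line)
```

The narrative text is left untouched; the list is derived entirely from the attributes, which is the property that lets the original form of the recipe be preserved.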
The current experiment was a proof-of-concept, and has been
mainly successful. If anyone has recipes encoded in, for
example, DocBook or JATS, I would be interested to hear from
them.
[Sperberg-McQueen 2002b] Sperberg-McQueen, Michael and Burnard, Lou (2002). Guidelines for Electronic Text Encoding and Interchange. TEI Consortium, Oxford, Providence, Charlottesville, Bergen, ISBN:095233013X.