LMNL, Layered Markup and Annotation Language, is a markup technology designed to support free-form markup of textual structures. Think of XML -- tags and text -- and then imagine an approach to markup that gives you several things that XML does not:
Overlap is possible. This is because LMNL uses a range model, and ranges may overlap freely with one another.
A range identifies an arbitrary sequence of consecutive characters in the text. When using LMNL syntax, ranges are marked with tags much in the way elements are marked in XML. But while elements in XML must nest inside one another, a LMNL range can be any sequence of characters, irrespective of whether any or all of them also appear in other ranges -- nested, overlapping or whatever. When tagging a LMNL document, at any point in the text you can close any open range, as opposed to XML, in which the only element you can close is the one most recently opened.
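To make the range model concrete, here is a minimal sketch in Python. It is not part of LMNL or of any LMNL tool: the `Range` class, its field names, and the use of character offsets are all assumptions made for illustration. It shows only that two ranges over the same text can share some characters without either one containing the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Hypothetical model of a range: a name plus character offsets."""
    name: str
    start: int  # offset of the first character covered
    end: int    # offset just past the last character covered


def partially_overlap(a: Range, b: Range) -> bool:
    """True when a and b share characters but neither contains the other."""
    shares = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return shares and not (a_in_b or b_in_a)


text = "The quick brown fox"
phrase = Range("phrase", 0, 9)   # covers "The quick"
emph = Range("emph", 4, 15)      # covers "quick brown"
word = Range("word", 4, 9)       # covers "quick"

print(text[phrase.start:phrase.end])   # The quick
print(text[emph.start:emph.end])       # quick brown
print(partially_overlap(phrase, emph)) # True: they cross
print(partially_overlap(phrase, word)) # False: word sits inside phrase
```

In XML, `phrase` and `emph` could not both be elements over this text; in a range model they are simply two independent spans.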
LMNL supports structured annotation. Any range, once identified, can be annotated.[1] LMNL annotations are to LMNL ranges what attributes are to elements in XML; but in XML, attributes are simply name-value pairs, whereas in LMNL, any annotation can be a miniature document, supporting everything that a LMNL document supports.[2] So its content may be marked up and annotated freely, if need be[3]; annotations may also be annotated. Consequently, annotations become first-class structures in themselves, and your markup of your document (which ranges you identify and how you annotate them) can be as expressive as you want it to be.
[1] Bah! It takes the form of an (anonymous) annotation on a range marked noted.
[2] For example, this bit of text is in an annotation, and it has its own inline markup.
[3] When we presented the paper at Balisage, we edited this document live and showed Luminescent parsing and processing the raw markup.
A couple of other differences between LMNL annotations and XML attributes may also be useful: annotations in LMNL are ordered, and more than one annotation of the same name can be given to the same range or annotation.
Both ranges and annotations in LMNL may be anonymous: they don't have to have names at all. While this probably isn't important for markup as such, it will be useful when integrating systems that attach annotations to arbitrary bits of text; even if these are identified only with their own metadata (that is, the ranges to which they belong have no type names within a local schema), LMNL can accommodate them.
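The properties of annotations described above (ordered, repeatable by name, possibly anonymous, and able to carry their own text, ranges and annotations) can be sketched as a small data structure. This is an illustrative model only; the class and field names are assumptions, and an empty string standing in for "no name" is a convention of this sketch, not of LMNL.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """Hypothetical model: an annotation is a miniature document."""
    name: str                       # "" stands for an anonymous annotation
    text: str = ""
    ranges: List["Range"] = field(default_factory=list)
    annotations: List["Annotation"] = field(default_factory=list)


@dataclass
class Range:
    """Hypothetical model of a range with an ordered list of annotations."""
    name: str
    start: int
    end: int
    annotations: List[Annotation] = field(default_factory=list)


# An anonymous annotation whose own text carries inline markup ...
note = Annotation(
    name="",
    text="See the second edition.",
    ranges=[Range("title", 8, 22)],   # covers "second edition"
)
# ... and which is itself annotated.
note.annotations.append(Annotation(name="author", text="editor"))

# A list keeps order and allows the same name twice, unlike XML attributes.
noted = Range("noted", 0, 10, annotations=[
    Annotation("resp", "WP"),
    Annotation("resp", "JT"),
    note,
])

print([a.name for a in noted.annotations])   # ['resp', 'resp', '']
```

Using a list rather than a dictionary is the design choice that matters here: a dictionary keyed by name would lose both the ordering and the repeated `resp` entries.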
Finally, LMNL has atoms. These are things in your text (graphics, glyphs not in your character set, arbitrary artifacts of any type whatsoever) for which you have no character, but which you wish to represent anyway. In fact, in LMNL, even plain alphanumeric characters are just a special kind of atom.
Since LMNL's range model has no proper notion of “containment” (ranges can cover arbitrary spans of text, so their own relations are incidental and not part of the model), atoms become especially important for representing things that might be represented, in XML, by empty elements. (That is, ranges can't appear inside ranges, but atoms can.)
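One way to picture atoms is to treat a document's content as a sequence of positions, where most positions hold a character and some hold a non-character atom. The sketch below assumes exactly that and nothing more; `Atom`, `to_content` and `plain_text` are hypothetical names, and the `figure` atom is an invented example.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Atom:
    """Hypothetical non-character atom, e.g. a graphic or a missing glyph."""
    name: str


Content = List[Union[str, Atom]]


def to_content(*parts: Union[str, Atom]) -> Content:
    """Flatten strings into one entry per character; keep atoms whole."""
    content: Content = []
    for part in parts:
        if isinstance(part, Atom):
            content.append(part)
        else:
            content.extend(part)   # each character becomes its own entry
    return content


def plain_text(content: Content, start: int, end: int) -> str:
    """Characters covered by the span [start, end), skipping other atoms."""
    return "".join(c for c in content[start:end] if isinstance(c, str))


content = to_content("See ", Atom("figure"), " here")

print(len(content))     # 10: nine characters plus one atom
print(content[4])       # Atom(name='figure')
print(repr(plain_text(content, 0, 10)))   # 'See  here'
```

A range covering positions 0 to 10 includes the atom at position 4, which is how something like an XML empty element can sit "inside" a range even though ranges do not contain one another.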
LMNL's range model represents a document without committing it to any single hierarchical structure -- or any hierarchy at all. That is to say, there may be hierarchies of elements “there” in the document (there usually are), but as far as LMNL is concerned, they are all implicit and “potential”. A processor might see a hierarchy, when ranges happen to be arranged this way, for purposes of validation or of extraction (induction) into XML. We hope and expect that capabilities like this will help us bridge the gap to general-purpose LMNL processing, with query and transformation technologies designed to work specifically with the LMNL model.
LMNL can be represented in XML using any of the common workarounds for representing overlap, if and when that may be necessary[4]; in this case, a conversion from XML into the LMNL model that makes special provision for these conventions (implemented in XSLT or the transformation technology of your choice) can get you into LMNL. But LMNL also has a syntax of its own (which is fairly simple in design), which we call sawtooth syntax due to the way it looks (tags have “teeth” that grip the text).
[4] There is no problem overlapping note ranges with paragraph ranges, since they're all ranges.
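Whether a set of ranges can be extracted directly into XML depends on whether they happen to nest. A rough sketch of that test follows; it is an assumption about how such a check could be written, not a description of what any LMNL processor does. Ranges are modelled as hypothetical `(name, start, end)` tuples with an exclusive end offset.

```python
from itertools import combinations
from typing import List, Tuple

Range = Tuple[str, int, int]   # (name, start, end), end exclusive


def crosses(a: Range, b: Range) -> bool:
    """True when one range starts inside the other and ends outside it."""
    (_, s1, e1), (_, s2, e2) = a, b
    return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1


def is_hierarchical(ranges: List[Range]) -> bool:
    """True when no two ranges cross, so the set could be read as a tree."""
    return not any(crosses(a, b) for a, b in combinations(ranges, 2))


paragraphs = [("p", 0, 20), ("p", 20, 40)]
sentence = ("s", 5, 15)     # sits inside the first paragraph
note = ("note", 15, 25)     # straddles the paragraph boundary

print(is_hierarchical(paragraphs + [sentence]))         # True
print(is_hierarchical(paragraphs + [sentence, note]))   # False
```

When the check fails, the ranges are still a perfectly good LMNL document; it is only the XML view that needs one of the overlap workarounds.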
A prototype parser for sawtooth syntax (implemented, as it happens, as an XSLT 2.0 upconversion pipeline) is available. This text was authored using sawteeth: for its source, see demo.lmnl.
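For a feel of what parsing sawtooth syntax involves, here is a toy reader in Python. It is unrelated to the XSLT prototype mentioned above and handles only a drastically simplified subset, on the assumption that a start tag is written `[name}` and an end tag `{name]`. Annotations inside tags, anonymous tags, atoms and everything else are ignored, and the function name `parse_sawtooth` is invented for this sketch.

```python
import re
from typing import Dict, List, Tuple

# Assumed simplified tag forms: [name} opens a range, {name] closes it.
TAG = re.compile(r"\[(\w+)\}|\{(\w+)\]")


def parse_sawtooth(source: str) -> Tuple[str, List[Tuple[str, int, int]]]:
    """Return the plain text and a list of (name, start, end) ranges."""
    parts: List[str] = []
    length = 0                              # length of the text seen so far
    open_starts: Dict[str, List[int]] = {}  # start offsets of open ranges
    ranges: List[Tuple[str, int, int]] = []
    pos = 0
    for m in TAG.finditer(source):
        chunk = source[pos:m.start()]
        parts.append(chunk)
        length += len(chunk)
        pos = m.end()
        if m.group(1) is not None:          # start tag
            open_starts.setdefault(m.group(1), []).append(length)
        else:                               # end tag: any open range may close
            name = m.group(2)
            if not open_starts.get(name):
                raise ValueError(f"end tag for unopened range: {name}")
            ranges.append((name, open_starts[name].pop(), length))
    parts.append(source[pos:])
    unclosed = [n for n, starts in open_starts.items() if starts]
    if unclosed:
        raise ValueError(f"unclosed ranges: {unclosed}")
    return "".join(parts), ranges


source = "[line}[s}Ranges may cross.{s] [s}So can{line] [line}these.{s]{line]"
text, ranges = parse_sawtooth(source)

print(text)
# Ranges may cross. So can these.
print(ranges)
# [('s', 0, 17), ('line', 0, 24), ('s', 18, 31), ('line', 25, 31)]
```

Note that the end tag `{line]` closes a `line` range while an `s` range is still open, which is exactly what XML forbids. Ranges are reported in the order their end tags appear.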