Balisage: One Href is not Enough

Introduction

I am a webaholic: the web has changed my life, and it has changed the way I write. Links are disruptive, and my writing has not been the same since I started using them.

Before the web (and before links), you had to be very careful to be understood: you had to introduce every word that was not commonly known and disambiguate those that could be ambiguous.

Now that we have links, we can use them for these two purposes and concentrate on the message we express. This leads to a new conciseness that I love.

Unfortunately, when you use links a lot, you rapidly run into trouble...

The other day, I was writing a blog post to announce that my paper had been accepted at XML Prague:

Just got the confirmation that I'll be presenting a paper on XQuery injection at XML Prague March 26th or 27th.

— See you in Prague

While typing, the obvious question arose: where should I link "XQuery" to?

To Wikipedia, which is usually a good choice because it provides cool URIs (that don't change) and pages that introduce a subject?
To the W3C recommendation, which is another cool URI (that doesn't change) and is the normative reference but isn't introductory material?
Elsewhere (to the XQuery tag on my blog, to the W3C XML Query Working Group,...)?

And, for versioned resources such as Wikipedia pages or W3C recommendations, should I link to the version that was current when I wrote the blog entry or to the latest version?

All these choices make sense, but (X)HTML forces us to choose one and only one target for a link!

The problem got worse when I was typing "XML Prague" because I had to choose between:

Linking "XML" and "Prague" separately (and again, to which target? Wikipedia, the W3C recommendation, the XML category for "XML"; Wikipedia, tourist office, ... for "Prague")
Linking "XML Prague" as a whole to the conference web site or the tag on my blog.

This issue of embedded links seems really tough. I think I could live with it, but I wanted to mention it for completeness.

The problem can also get worse when I write in French, because I often want to give the choice between targets in French and targets in English when the latter are of higher quality...

In other words: one href is not enough, we need n hrefs!

Am I asking too much?

I don't think so. My requirements are legitimate and generic: I want to be able to write simple sentences using the words that are relevant in my domain(s), while using links to give my readers the ability to discover the meaning of words that they don't know, browse authoritative resources to deepen or extend their knowledge, or find related pages that I have written.

Furthermore, this is an old issue, already addressed in the SGML world by HyTime and acknowledged by the W3C back in 1997!

What happened then?

The topic has always been considered touchy, and the first working draft, published in April 1997 as "Extensible Markup Language (XML): Part 2. Linking", notes:

Please be advised that the draft you are now reading is unusually volatile. The debating and balloting process which determines the material contents is far from complete, and is nonetheless substantially ahead of the editing process that turns the material contents into usable specification language.

— Extensible Markup Language (XML): Part 2. Linking

The content was indeed so volatile that the specification was taken out of the XML recommendation and eventually became a recommendation no less than four years later, in June 2001. This recommendation, known as XLink, does address what I need:

This specification defines the XML Linking Language (XLink), which allows elements to be inserted into XML documents in order to create and describe links between resources. It uses XML syntax to create structures that can describe links similar to the simple unidirectional hyperlinks of today's HTML, as well as more sophisticated links.

— XML Linking Language (XLink) Version 1.0

Unfortunately, without wanting to start a flame war or blame anyone, I think it is fair to say that the syntax of the sophisticated links mentioned in this introduction, known as "extended links", is so complex that they are considered unusable by most of us XML geeks and have no chance of being embedded in real-world (X)HTML pages. If you're not convinced by this bold statement, please hold on: I'll come back to extended XLinks in a while...

Is this topic doomed then? How can we succeed where previous attempts all seem to have failed?

Ten years have passed since 2001 and one of the things we've learned is to hijack existing technologies to do what we need! Some hijacking technologies have even become de facto standards... Why not call them to the rescue?

In other words, why not use microformats, RDFa or HTML5's microdata to specify these "sophisticated links" that are missing from XML?

Requirements

Please take the remainder of this paper as a demonstration of how this problem could be handled rather than as a final proposal...

The requirements that are chosen here are arbitrary: they meet what I find important as I write these lines and are subject to discussion, but I am confident that the same method can be used with different sets of requirements as long as they remain "reasonably" simple!

The requirements for this exercise can be summarized as defining an (X)HTML jargon (microformat, RDFa, microdata, ...) that:

Expresses inline links with multiple arcs between (X)HTML fragments and several link ends.
Can be processed by a simple JavaScript library to be displayed in a fancy way.
Degrades nicely and remains readable when not processed by such a library.
Plays well with search engines.
Does not require server storage.
If possible, provides a way to annotate the arcs (to provide arc roles, the language of link ends or other information).
If possible, supports embedded links.

The general idea is to keep the thing as simple as possible while maintaining good practices!

Requirement 3 excludes solutions such as pluralink that package multiple links into a single href attribute and are not "degradable", since the link doesn't work if it isn't processed by a script.

Requirements 3 and 4 can be contradictory. Taken alone, requirement 3 would lead to defining a jargon that would replace "XQuery" by "XQuery [Wikipedia, W3C]", with the words "Wikipedia" and "W3C" linked (respectively) to the article about XQuery on Wikipedia and to the XQuery W3C recommendation, but this practice may be considered almost as poor as the infamous "Click here" practice!

Requirement 4 will thus lead to more verbose alternatives such as "XQuery [XQuery on Wikipedia, XQuery W3C Recommendation]" with links on "XQuery on Wikipedia" and "XQuery W3C Recommendation".

Requirement 5 excludes services such as http://www.multiurl.com/ that are similar to URL shorteners with the additional possibility to define multiple targets.

Note

This is a simplified set of requirements that does not take into account chained links such as the relation between a page and its archive or translation. In this first version, the arcs are between a document fragment and multiple resources that are all at the same level. In a next iteration, we'll have to see how this can be extended to introduce relations between linked resources.

First Step: Without Embedded Links

Let's first keep things simple and explore simple implementations for microformats, RDFa and microdata.

In each case, we will present the markup to express an nhrefs link and the corresponding JavaScript implementation.

This implementation will loop over nhrefs links and for each link it will hide the original markup but keep it intact so that other scripts could access the information for other purposes if that was necessary. For each link, a dialog will be created and a simple link will be added to open this dialog.

Kissing with Microformat

The good thing with microformats is that their "balisage" is flexible and they often can be kept as simple as possible...

In our case, the following seems to be good enough (indentation has been added to make the code more readable):

<span class="nhrefs">
    <span class="source">XQuery</span> 
    [
        <a href="http://en.wikipedia.org/wiki/XQuery" class="arc" rel="wikipedia">XQuery on Wikipedia</a>, 
        <a href="http://www.w3.org/TR/xquery/" class="arc" rel="authoritative">XQuery W3C Recommendation</a>
    ]
</span>

Where:

span.nhrefs	Is the container for an extended link.
span.source	Is the source of the link (the link start if you prefer). This source is always local to the document.
a.arc	Is an arc.
a.arc/@rel	Is the arc role (using CURIEs and/or a set of well-known common roles).
a.arc/@href	Is the URL of the arc destination.
a.arc/node()	Is the label of the arc end.

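This format degrades reasonably well when it is not processed by any kind of script: the browser displays the word "XQuery" followed by the two links in brackets, as in "XQuery [XQuery on Wikipedia, XQuery W3C Recommendation]".

With a simple JavaScript function, this text can be streamlined into a single link on the word "XQuery".

This script opens a dialog listing the two targets when you click on the link that has been generated around the word "XQuery".

If you wonder about the level of complexity of such a script, here is a version that uses jQuery (the code could probably be further simplified: I am not a jQuery expert):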

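jQuery(document).ready(function() {

    // Process every nhrefs container found in the page.
    jQuery('.nhrefs')
        .each(function() {

            // Hide the original markup but keep it in the DOM for other scripts.
            var span = jQuery(this);
            span.hide();
            var source = jQuery('.source', this).text();
            // Insert the replacement link and the dialog container just before the hidden span.
            var link = jQuery(span.before('<a href="">'+ source +'</a>')[0].previousSibling);
            var dialog = jQuery(span.before('<div title="Links for &quot;'+ source + '&quot;"><ul /> </div>')[0].previousSibling);
            var list = jQuery('ul', dialog);
            // Copy each arc into the dialog as a list item.
            jQuery('a.arc', this)
                .each(function(){
                    list.append('<li><a href="' + this.href + '">' + this.text + '</a>');
                });
            // Create the dialog closed and open it when the replacement link is clicked.
            dialog.dialog({ autoOpen: false });
            link.click(function() {
                dialog.dialog("open");
                return false;
            });
        });

});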
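The script above builds its markup by concatenating the source text and the arc labels into HTML strings, which only works as long as these strings contain no markup characters. As a minimal sketch of a more defensive variant, the dialog markup could be produced by a pure function that escapes its inputs first. The names escapeHtml and buildDialogHtml and the { href, label } object shape are assumptions made for this illustration; they are not part of the script above.

```javascript
// Hypothetical helper: escape the characters that are significant in
// HTML text and attribute values.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Hypothetical helper: build the dialog markup for one nhrefs link.
// `source` is the text of span.source; `arcs` is an array of
// { href, label } objects read from the a.arc elements.
function buildDialogHtml(source, arcs) {
    var items = arcs.map(function (arc) {
        return '<li><a href="' + escapeHtml(arc.href) + '">' +
            escapeHtml(arc.label) + '</a></li>';
    });
    return '<div title="Links for &quot;' + escapeHtml(source) + '&quot;">' +
        '<ul>' + items.join('') + '</ul></div>';
}

var dialogHtml = buildDialogHtml('XQuery', [
    { href: 'http://en.wikipedia.org/wiki/XQuery', label: 'XQuery on Wikipedia' },
    { href: 'http://www.w3.org/TR/xquery/', label: 'XQuery W3C Recommendation' }
]);
```

The resulting string could be handed to span.before() in place of the concatenated one, leaving the rest of the script unchanged.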

Tripling with RDFa

The good thing with RDFa is that assertions can be extracted using any tool of a generic toolbox.

The price to pay is that your markup needs to follow a set of rules that are much more rigid than those of microformats...

In our case, here is the simplest markup I have been able to produce (enhancements welcome, especially if they simplify the source!):

<span typeof="nhrefs:link">
    <span property="nhrefs:source">XQuery</span> 
    <span rel="nhrefs:hasarc">
        [<span typeof="nhrefs:arc">
            <a href="http://en.wikipedia.org/wiki/XQuery" rel="nhrefs:dest" property="nhrefs:title">XQuery on Wikipedia</a> 
            <span rel="nhrefs:role" resource="nhrefs:wikipedia"></span>
        </span>, 
        <span typeof="nhrefs:arc">
            <a href="http://www.w3.org/TR/xquery/" rel="nhrefs:dest" property="nhrefs:title">XQuery W3C Recommendation</a>
            <span rel="nhrefs:role" resource="nhrefs:authoritative"></span>
        </span>]
    </span>
</span>

This code gets displayed exactly like its microformat counterpart when it is not processed by a script.

Although this snippet is more verbose than its microformat equivalent, it is arguably more self-documenting: any reader (human or not) familiar with RDFa can understand that we have here an "nhrefs:link" with a source and a couple of arcs...

Here is how Raptor RDF sees it (with some help from Graphviz):

More concisely, it can be represented in Turtle as:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://www.w3.org/1999/xhtml> .
@prefix nhrefs: <http://nhrefs.org/> .

[]
    nhrefs:hasarc [
        nhrefs:dest <http://www.w3.org/TR/xquery/> ;
        nhrefs:role <nhrefs:authoritative> ;
        nhrefs:title "XQuery W3C Recommendation" ;
        a nhrefs:arc
    ], [
        nhrefs:dest <http://en.wikipedia.org/wiki/XQuery> ;
        nhrefs:role <nhrefs:wikipedia> ;
        nhrefs:title "XQuery on Wikipedia" ;
        a nhrefs:arc
    ] ;
    nhrefs:source "XQuery" ;
    a nhrefs:link .

Nice, don't you think?

To be honest, there is a flaw in this model: the arcs are embedded in a blank node without using any container and in that case RDF specifies that the triples are unordered. In other words, there is no guarantee that the relative order of the arcs will be kept.

Neither the current recommendation (RDFa 1.0) nor the latest RDFa 1.1 Working Draft supports containers, but a proposal has been made on the RDFa wiki and I do hope that this much-needed feature will be added to RDFa at some point.

This is only a problem insofar as authors expect this order to be preserved (which is probably the case) and we use an RDF library that may change this order (which is not the case of the library that we'll be using), but it is still a flaw.
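To make the ordering flaw concrete, here is a small sketch that treats a graph as a set of triples. It is only an illustration: the { s, p, o } object shape, the sameGraph function and the node names are assumptions made for this example, and nodes are compared by name, which sidesteps the renaming of blank nodes that a real RDF library has to handle.

```javascript
// Serialize a triple so that it can be stored in a Set.
function tripleKey(triple) {
    return JSON.stringify([triple.s, triple.p, triple.o]);
}

// Two lists of triples describe the same graph when they contain the
// same triples, whatever their order.
// Simplification: nodes are compared by name (no blank node renaming).
function sameGraph(a, b) {
    var keysA = new Set(a.map(tripleKey));
    var keysB = new Set(b.map(tripleKey));
    if (keysA.size !== keysB.size) {
        return false;
    }
    return Array.from(keysA).every(function (key) {
        return keysB.has(key);
    });
}

// The same link, with its two arcs listed in both possible orders.
var wikipediaFirst = [
    { s: '_:link', p: 'nhrefs:hasarc', o: '_:wikipedia' },
    { s: '_:link', p: 'nhrefs:hasarc', o: '_:w3c' }
];
var w3cFirst = [wikipediaFirst[1], wikipediaFirst[0]];

var identical = sameGraph(wikipediaFirst, w3cFirst);
```

Both lists are the same graph as far as RDF is concerned, so nothing in the model itself says which arc the author wrote first.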

An RDF library... Yep, let's see how you parse that kind of thing in JavaScript!

It could be tempting to use a library such as jQuery and just adapt what we've done for microformats to query the RDFa attributes instead of the class attributes that drive microformats...

This would work on this example, but unless you are ready to reimplement an RDFa parser, it wouldn't work with documents that express the same set of triples using a different RDFa syntax: even supporting a namespace prefix other than "nhrefs" would require extra work.
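The prefix issue can be illustrated with a few lines of code. The sketch below expands a CURIE against a map of declared prefixes; the function name expandCurie and the shape of the prefix map are assumptions made for this illustration, and a real RDFa parser also has to collect the prefix declarations in scope for each element, which is the "extra work" mentioned above.

```javascript
// Hypothetical helper: expand a CURIE such as "nhrefs:link" using a
// prefix -> namespace URI map. Returns null when the value has no
// prefix or when the prefix has not been declared.
function expandCurie(curie, prefixes) {
    var colon = curie.indexOf(':');
    if (colon < 0) {
        return null;
    }
    var prefix = curie.slice(0, colon);
    var reference = curie.slice(colon + 1);
    if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
        return null;
    }
    return prefixes[prefix] + reference;
}

// Two documents may bind different prefixes to the same namespace...
var withNhrefs = expandCurie('nhrefs:link', { nhrefs: 'http://nhrefs.org/' });
var withN = expandCurie('n:link', { n: 'http://nhrefs.org/' });
// ...and still mean the same thing once the CURIEs are expanded.
var sameType = withNhrefs === withN;
```

A selector that looks for the literal attribute value "nhrefs:link" would miss the second document, whereas a comparison on expanded URIs would not.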

The best way to avoid these issues is to use a RDFa parser and, if you enjoy jQuery, Jeni Tennison's rdfQuery is definitely for you since it comes as a kind of jQuery add-on and shares its syntax.

rdfQuery also borrows a lot from SPARQL; to get the nhrefs links with their sources, you can write:

    var rdf = jQuery(document)
        .rdf()
        .prefix('nhrefs', 'http://nhrefs.org/')
        .where('?link a nhrefs:link')
        .where('?link nhrefs:source ?source');

In rdfQuery, as in SPARQL, query results are sets of resources and literals rather than triples. These resources and literals cannot be mapped back to DOM nodes in the (X)HTML document, and you need to go back to the triples for that.

In our case, the outer span element for the link is the element that carries the type information:

<span typeof="nhrefs:link">
...
</span>

A triple directly generated by this element is:

?link a nhrefs:link

And to get the span element (to hide it and prepend the dialog and replacement link), you can query this triple and use its source attribute:

    rdf
        .each(function(){
            var span = jQuery(rdf.reset().where(this.link.value + ' a nhrefs:link').sources()[0][0].source);
            span.hide();

After that, you can perform a subquery to find the arcs and create the dialog with the query results. The remainder of the function is straightforward and the complete code is:

jQuery(document).ready(function() {

    var rdf = jQuery(document)
        .rdf()
        .prefix('nhrefs', 'http://nhrefs.org/')
        .where('?link a nhrefs:link')
        .where('?link nhrefs:source ?source');
    rdf
        .each(function(){
            var span = jQuery(rdf.reset().where(this.link.value + ' a nhrefs:link').sources()[0][0].source);
            span.hide();
            var link = jQuery(span.before('<a href="">'+this.source.value+'</a>')[0].previousSibling);
            var dialog = jQuery(span.before('<div title="Links for &quot;'+ this.source.value + '&quot;"><ul /> </div>')[0].previousSibling);
            var list = jQuery('ul', dialog);
            rdf
                .reset()
                .where(this.link.value + ' nhrefs:hasarc ?arc')
                .where('?arc nhrefs:title ?title')
                .where('?arc nhrefs:dest ?dest')
                .each(function(){
                    list.append('<li><a href="' + this.dest.value + '">' + this.title.value + '</a>');
                });
            dialog.dialog({ autoOpen: false });
            link.click(function() {
                dialog.dialog("open");
                return false;
            });
        });
});

Again, this code is more verbose than its microformat counterpart, but the link properties are accessed using proper queries over formal properties, and that seems more robust than just relying on (X)HTML classes.

Bleeding with microdata

HTML5's microdata is arguably the most bleeding edge of these somewhat competing technologies. Although HTML5 isn't there yet, microdata can be used with libraries such as HTML5 Microdata JavaScript.

Some HTML5-specific features, such as using meta elements within page bodies, can't be used (because these elements are considered bogus and are stripped by browsers) and need to be worked around. However, the result is still reasonably simple:

<span itemscope="itemscope" itemtype="http://nhrefs.org/link">
    <span itemprop="source">XQuery</span> 
    [<span itemscope="itemscope" itemprop="arc">
        <a href="http://en.wikipedia.org/wiki/XQuery" itemprop="dest">
            <span itemprop="title">XQuery on Wikipedia</span>
        </a> 
        <a href="http://nhrefs.org/wikipedia" itemprop="role" ></a>
    </span>, 
    <span itemscope="itemscope" itemprop="arc">
        <a href="http://www.w3.org/TR/xquery/" itemprop="dest">
            <span itemprop="title">XQuery W3C Recommendation</span>
        </a>
        <a href="http://nhrefs.org/authoritative" itemprop="role" ></a>
        <!--<meta itemprop="role" content="authoritative"/>-->
    </span>]
</span>

This code gets displayed exactly like its microformat and RDFa counterparts when it is not processed by a script.

The microdata jQuery library is fairly simple to use and the code to process these links is very similar to what we've seen so far:

jQuery(document).ready(function() {

    jQuery(document)
        .items('http://nhrefs.org/link')
        .each(function(){
            var span = jQuery(this);
            span.hide();
            var source = span.properties('source').itemValue();
            var link = jQuery(span.before('<a href="">'+ source +'</a>')[0].previousSibling);
            var dialog = jQuery(span.before('<div title="Links for &quot;'+ source + '&quot;"><ul /> </div>')[0].previousSibling);
            var list = jQuery('ul', dialog);
            span
                .properties('arc')
                .each(function(){
                    var arc = jQuery(this);
                    list.append('<li><a href="' + arc.properties('dest').itemValue() + '">' + arc.properties('title').itemValue() + '</a>');
                });
            dialog.dialog({ autoOpen: false });
            link.click(function() {
                dialog.dialog("open");
                return false;
            }); 
        });


});

Why not extended XLinks after all?

Now that we've seen the level of simplicity (or complexity) of three different approaches, let's go back and revisit extended XLinks.

To express an extended link, you need to define:

The extended link itself that will serve as a container.
Link ends that can be either local to the link or external. In our case, the source (i.e. the span containing the text "XQuery") can be defined as a local resource and the targets will necessarily be defined as external resources (aka XLink "locators").
The arcs between the link ends.

As far as XLink is concerned, a simple way to define these links in an XHTML document could be:

<!-- An extended link -->
<span xlink:type="extended" 
      xlink:role="http://nhrefs.org/link/">
   <!-- Source (local resource) -->
   <span xlink:type="resource" 
         xlink:role="http://nhrefs.org/source/" 
         xlink:label="source">XQuery</span>
   <!-- Targets (remote resources aka locators) -->
   <span xlink:href="http://en.wikipedia.org/wiki/XQuery" 
         xlink:type="locator" 
         xlink:role="http://nhrefs.org/target/wikipedia/" 
         xlink:label="target" 
         xlink:title="XQuery on Wikipedia" > </span> 
   <span xlink:href="http://www.w3.org/TR/xquery/" 
         xlink:type="locator" 
         xlink:role="http://nhrefs.org/target/authoritative/" 
         xlink:label="target" 
         xlink:title="XQuery W3C Recommendation"> </span>
   <!-- Arcs -->
   <span xlink:type="arc" 
         xlink:from="source" 
         xlink:to="target"> </span>
</span>

As far as I understand the XLink recommendation, this is enough to express what we want. That's not so bad and we could argue that the level of complexity is similar to what we've seen so far.
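The reason a single arc element is enough here is that an XLink arc connects labels rather than individual elements: every participant labeled "source" is linked to every participant labeled "target". The sketch below shows this expansion on plain objects. It is an illustration under stated assumptions, not an XLink implementation: the function name expandArcs and the object shapes are invented for this example, and arcs that omit their from or to label are not handled.

```javascript
// Expand XLink arc elements into individual traversals.
// `participants` lists the resources and locators of one extended link,
// each with its xlink:label; `arcs` lists the arc elements with their
// xlink:from and xlink:to labels (both assumed to be present).
function expandArcs(participants, arcs) {
    var traversals = [];
    arcs.forEach(function (arc) {
        var starts = participants.filter(function (p) {
            return p.label === arc.from;
        });
        var ends = participants.filter(function (p) {
            return p.label === arc.to;
        });
        // One traversal per (start, end) pair.
        starts.forEach(function (start) {
            ends.forEach(function (end) {
                traversals.push({ from: start, to: end });
            });
        });
    });
    return traversals;
}

var participants = [
    { label: 'source', text: 'XQuery' },
    { label: 'target', href: 'http://en.wikipedia.org/wiki/XQuery' },
    { label: 'target', href: 'http://www.w3.org/TR/xquery/' }
];
var traversals = expandArcs(participants, [{ from: 'source', to: 'target' }]);
```

With the markup above, the single arc yields two traversals, one from "XQuery" to each locator.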

Unfortunately, I am not aware of any existing implementation that can process this markup and display what we want to display. Browsers just ignore extended links and won't display anything more than the word "XQuery" from this markup.

To get a degraded display similar to what we had with microformats, RDFa or microdata, we need to repeat the target titles and href attributes:

<!-- An extended link -->
<span xlink:type="extended" 
       xlink:role="http://nhrefs.org/link/">
  <!-- The source -->
  <span xlink:type="resource" 
        xlink:role="http://nhrefs.org/source/" 
        xlink:label="source">XQuery</span> [
  <!-- The targets -->
  <a href="http://en.wikipedia.org/wiki/XQuery" 
     title="XQuery on Wikipedia"
     xlink:href="http://en.wikipedia.org/wiki/XQuery" 
     xlink:type="locator" 
     xlink:role="http://nhrefs.org/target/wikipedia/" 
     xlink:label="target" 
     xlink:title="XQuery on Wikipedia" >XQuery on Wikipedia</a>, 
  <a href="http://www.w3.org/TR/xquery/" 
     title="XQuery W3C Recommendation"
     xlink:href="http://www.w3.org/TR/xquery/" 
     xlink:type="locator" 
     xlink:role="http://nhrefs.org/target/authoritative/" 
     xlink:label="target" 
     xlink:title="XQuery W3C Recommendation">XQuery W3C Recommendation</a>]
  <!-- The arcs -->
  <span xlink:type="arc" 
        xlink:from="source" 
        xlink:to="target"> </span>
</span>

Here we have an XHTML fragment that gets the degraded display called for in our requirements and carries the meaning that we want to convey to XLink implementations.

The price to pay in terms of complexity is clearly visible when we compare this fragment to what we've seen before.

In addition to the markup complexity, I am not aware of any JavaScript implementation of extended XLinks on which we can rely to process this fragment as we did for the other technologies, and we might have to develop our own JavaScript implementation.

If the downsides are clearly visible, the benefit is not that obvious!

Except for the pride of conforming to a W3C recommendation and the hope of convincing more people to use it, what's the benefit of using a recommendation that has almost no traction?

Next Step: Embedding

A simple way to represent embedded links is to embed nhrefs links within the source property of another nhrefs link.

OK, but how should we present such embedded links to the user?

Going back to the example of "XML Prague", we could differentiate the link on "XML", which would present resources about XML and resources about XML Prague, from the link on "Prague", which would present resources about Prague and resources about XML Prague. However, this would be displayed by the browser as one link (or at best two links separated by a space) and users would very likely miss the difference between these two links.

To avoid this issue, I suggest that we display the same dialog on all the terms of embedded links. That dialog will display all the links for all the terms but can group the links per term.
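As a rough sketch of this grouping, an embedded link can be modeled as an object whose source is either a string or an array of inner links, and the dialog content can be obtained by walking that structure. The object shape and the function names linkText and collectGroups are assumptions made for this illustration; they are independent of the markup used to express the links.

```javascript
// Text of a link: its own string source, or the texts of its embedded
// links joined by a space.
function linkText(link) {
    if (typeof link.source === 'string') {
        return link.source;
    }
    return link.source.map(linkText).join(' ');
}

// List the groups shown in the shared dialog: the composed link first,
// then each embedded link in document order.
function collectGroups(link) {
    var groups = [{ term: linkText(link), arcs: link.arcs }];
    if (typeof link.source !== 'string') {
        link.source.forEach(function (inner) {
            groups = groups.concat(collectGroups(inner));
        });
    }
    return groups;
}

var xmlPrague = {
    source: [
        { source: 'XML', arcs: [
            { href: 'http://en.wikipedia.org/wiki/XML', label: 'XML on Wikipedia' }
        ] },
        { source: 'Prague', arcs: [
            { href: 'http://en.wikipedia.org/wiki/Prague', label: 'Prague on Wikipedia' }
        ] }
    ],
    arcs: [
        { href: 'http://www.xmlprague.cz/', label: 'XML Prague' }
    ]
};

var groups = collectGroups(xmlPrague);
var terms = groups.map(function (group) { return group.term; });
```

Each group can then be rendered as a titled list inside the single dialog that is attached to both "XML" and "Prague".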

Microformat

Following these principles, the markup would be:

<span class="nhrefs">
    <span class="source">
        <span class="nhrefs">
            <span class="source">XML</span> 
            [
                <a href="http://en.wikipedia.org/wiki/XML" class="arc" rel="wikipedia">XML on Wikipedia</a>, 
                <a href="http://www.w3.org/XML/" class="arc" rel="informative">W3C XML Home Page</a>, 
                <a href="http://www.w3.org/TR/REC-xml/" class="arc" rel="authoritative">XML 1.0 Recommendation</a>
            ]
        </span>
        <span class="nhrefs">
            <span class="source">Prague</span> 
            [
                <a href="http://en.wikipedia.org/wiki/Prague" class="arc" rel="wikipedia">Prague on Wikipedia</a>, 
                <a href="http://wikitravel.org/en/Prague" class="arc" rel="informative">Prague travel guide on Wikitravel</a>
            ]
        </span>
    </span> 
    [
        <a href="http://www.xmlprague.cz/" class="arc" rel="authoritative">XML Prague</a>, 
        <a href="http://www.xmlprague.cz/2011/index.html" class="arc" rel="authoritative">XML Prague 2011</a>
    ]
</span>

I must admit that the result becomes much less readable when it is not processed by a script, and that some CSS might be used to improve that.

Of course, things get better after being processed by an updated version of the script.

RDFa

This can be ported to RDFa by creating bnodes as nhrefs:source that will themselves be nhrefs:links:

<span typeof="nhrefs:link">
    <span rel="nhrefs:source">
        <span typeof="nhrefs:link">
          <span property="nhrefs:source">XML</span> 
          <span rel="nhrefs:hasarc">
              [<span typeof="nhrefs:arc">
                  <a href="http://en.wikipedia.org/wiki/XML" rel="nhrefs:dest" property="nhrefs:title">XML on Wikipedia</a> 
                  <span rel="nhrefs:role" resource="nhrefs:wikipedia" ></span>
              </span>, 
              <span typeof="nhrefs:arc">
                  <a href="http://www.w3.org/XML/" rel="nhrefs:dest" property="nhrefs:title">W3C XML Home Page</a>
                  <span rel="nhrefs:role" resource="nhrefs:informative" ></span>
              </span>,
              <span typeof="nhrefs:arc">
                  <a href="http://www.w3.org/TR/REC-xml/" rel="nhrefs:dest" property="nhrefs:title">XML 1.0 Recommendation</a>
                  <span rel="nhrefs:role" resource="nhrefs:authoritative" ></span>
              </span>]
          </span>
         </span>
        <span typeof="nhrefs:link">
            <span property="nhrefs:source">Prague</span> 
            <span rel="nhrefs:hasarc">
                [<span typeof="nhrefs:arc">
                    <a href="http://en.wikipedia.org/wiki/Prague" rel="nhrefs:dest" property="nhrefs:title">Prague on Wikipedia</a> 
                    <span rel="nhrefs:role" resource="nhrefs:wikipedia" ></span>
                </span>, 
                <span typeof="nhrefs:arc">
                    <a href="http://wikitravel.org/en/Prague" rel="nhrefs:dest" property="nhrefs:title">Prague travel guide on Wikitravel</a>
                    <span rel="nhrefs:role" resource="nhrefs:informative" ></span>
                </span>]
            </span>
        </span>            
    </span>
    <span rel="nhrefs:hasarc">
        [<span typeof="nhrefs:arc">
            <a href="http://www.xmlprague.cz/" rel="nhrefs:dest" property="nhrefs:title">XML Prague</a> 
            <span rel="nhrefs:role" resource="nhrefs:authoritative" ></span>
        </span>, 
        <span typeof="nhrefs:arc">
            <a href="http://www.xmlprague.cz/2011/index.html" rel="nhrefs:dest" property="nhrefs:title">XML Prague 2011</a>
            <span rel="nhrefs:role" resource="nhrefs:authoritative" ></span>
        </span>]
    </span>
</span>

The model now has 42 triples and its graphical representation is hardly readable, but its Turtle representation is still readable:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://www.w3.org/1999/xhtml> .
@prefix nhrefs: <http://nhrefs.org/> .

[]
    nhrefs:hasarc [
        nhrefs:dest <http://www.xmlprague.cz/2011/index.html> ;
        nhrefs:role <nhrefs:authoritative> ;
        nhrefs:title "XML Prague 2011" ;
        a nhrefs:arc
    ], [
        nhrefs:dest <http://www.xmlprague.cz/> ;
        nhrefs:role <nhrefs:authoritative> ;
        nhrefs:title "XML Prague" ;
        a nhrefs:arc
    ] ;
    nhrefs:source [
        nhrefs:hasarc [
            nhrefs:dest <http://en.wikipedia.org/wiki/XML> ;
            nhrefs:role <nhrefs:wikipedia> ;
            nhrefs:title "XML on Wikipedia" ;
            a nhrefs:arc
        ], [
            nhrefs:dest <http://www.w3.org/XML/> ;
            nhrefs:role <nhrefs:informative> ;
            nhrefs:title "W3C XML Home Page" ;
            a nhrefs:arc
        ], [
            nhrefs:dest <http://www.w3.org/TR/REC-xml/> ;
            nhrefs:role <nhrefs:authoritative> ;
            nhrefs:title "XML 1.0 Recommendation" ;
            a nhrefs:arc
        ] ;
        nhrefs:source "XML" ;
        a nhrefs:link
    ], [
        nhrefs:hasarc [
            nhrefs:dest <http://wikitravel.org/en/Prague> ;
            nhrefs:role <nhrefs:informative> ;
            nhrefs:title "Prague travel guide on Wikitravel" ;
            a nhrefs:arc
        ], [
            nhrefs:dest <http://en.wikipedia.org/wiki/Prague> ;
            nhrefs:role <nhrefs:wikipedia> ;
            nhrefs:title "Prague on Wikipedia" ;
            a nhrefs:arc
        ] ;
        nhrefs:source "Prague" ;
        a nhrefs:link
    ] ;
    a nhrefs:link .

Of course, we are bitten again by the same limitation: the links that compose sources are unordered and in theory there is no guarantee that when we generate the title for composed links we won't generate "Prague XML" instead of "XML Prague"!

The JavaScript is now 80 lines long (compared to 30 for the version without embedded links).

Microdata

This can be ported to microdata:

<span itemscope="itemscope" itemtype="http://nhrefs.org/link">
    <span itemprop="source" itemscope="itemscope" itemtype="http://nhrefs.org/link">
        <span itemprop="source">XML</span> 
        [<span itemscope="itemscope" itemprop="arc">
            <a href="http://en.wikipedia.org/wiki/XML" itemprop="dest">
                <span itemprop="title">XML on Wikipedia</span>
            </a> 
            <a href="http://nhrefs.org/wikipedia" itemprop="role" ></a>
        </span>, 
        <span itemscope="itemscope" itemprop="arc">
            <a href="http://www.w3.org/XML" itemprop="dest">
                <span itemprop="title">W3C XML Home Page</span>
            </a>
            <a href="http://nhrefs.org/informative" itemprop="role" ></a>
        </span>,
        <span itemscope="itemscope" itemprop="arc">
            <a href="http://www.w3.org/TR/REC-xml/" itemprop="dest">
                <span itemprop="title">XML W3C Recommendation</span>
            </a>
            <a href="http://nhrefs.org/authoritative" itemprop="role" ></a>
        </span>]
    </span> 
    <span itemprop="source" itemscope="itemscope" itemtype="http://nhrefs.org/link">
        <span itemprop="source">Prague</span> 
        [<span itemscope="itemscope" itemprop="arc">
            <a href="http://en.wikipedia.org/wiki/Prague" itemprop="dest">
                <span itemprop="title">Prague on Wikipedia</span>
            </a> 
            <a href="http://nhrefs.org/wikipedia" itemprop="role" ></a>
        </span>, 
        <span itemscope="itemscope" itemprop="arc">
            <a href="http://wikitravel.org/en/Prague" itemprop="dest">
                <span itemprop="title">Prague travel guide on Wikitravel</span>
            </a>
            <a href="http://nhrefs.org/informative" itemprop="role" ></a>
        </span>]
    </span> 
    [<span itemscope="itemscope" itemprop="arc">
        <a href="http://www.xmlprague.cz/" itemprop="dest">
            <span itemprop="title">XML Prague</span>
        </a> 
        <a href="http://nhrefs.org/authoritative" itemprop="role" ></a>
    </span>, 
    <span itemscope="itemscope" itemprop="arc">
        <a href="http://www.xmlprague.cz/2011/index.html" itemprop="dest">
            <span itemprop="title">XML Prague 2011</span>
        </a>
        <a href="http://nhrefs.org/authoritative" itemprop="role" ></a>
    </span>]
</span>

The JavaScript is now 77 lines long (compared to 24 for the version without embedded links).

Next Steps

All three techniques provide a lightweight solution to express links with multiple arcs that are easy to parse in JavaScript. Now, what can we do with all these angle brackets?

The first conclusion is that for this application there is no clear winner between microformats, RDFa and microdata:

Microformats are less verbose and more "free style". The price to pay is that you need to read the spec to understand the structure of each of them and need to use DOM level methods to get your information.
Microdata and RDFa have roughly the same level of verbosity.
RDFa and microdata are more rigid and more verbose. The benefit is that if you use the right library you can parse their structure with higher level methods.
In theory, RDFa doesn't preserve the relative order of arcs or of multi-part sources.
Microdata isn't at recommendation stage yet and may change.
With RDFa, it is straightforward to extract link information as triples and use semantic web tools to do all kinds of funky things with them.
In the future, microdata will probably be natively supported by browsers.

The most sensible choice is probably to make no choice and support all three technologies!

OK, but what can we do with all these angle brackets?

The markup should be further documented and it can be seen as an open API between:

Consumers (such as the scripts that have been presented here) that parse the markup to do all kinds of interesting things.
Producers that write this markup, which isn't really fun to write by hand.

The consumers that we've seen should be documented and tested before they can be considered really usable.

Producers need to be implemented. Producers for popular web publishing platforms would be especially useful. For these platforms, two kinds of producers could be developed:

Transformers that transform other markup into one of these three formats. In WordPress, for instance, nhrefs links could be expressed using shortcodes in the posts.
GUIs that let users create nhrefs links in a friendly way.
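To give an idea of what the core of such a producer might look like, here is a minimal sketch that renders the microformat flavor of an nhrefs link from plain data. The function names escapeHtml and renderNhrefs and the { href, rel, label } object shape are assumptions made for this illustration; how the data is obtained (a shortcode, a GUI, ...) is left out entirely.

```javascript
// Hypothetical helper: escape the characters that are significant in
// HTML text and attribute values.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Hypothetical producer: render one nhrefs link as microformat markup.
// `source` is the link text; `arcs` is an array of { href, rel, label }.
function renderNhrefs(source, arcs) {
    var links = arcs.map(function (arc) {
        return '<a href="' + escapeHtml(arc.href) + '" class="arc" rel="' +
            escapeHtml(arc.rel) + '">' + escapeHtml(arc.label) + '</a>';
    });
    return '<span class="nhrefs"><span class="source">' + escapeHtml(source) +
        '</span> [' + links.join(', ') + ']</span>';
}

var markup = renderNhrefs('XQuery', [
    { href: 'http://en.wikipedia.org/wiki/XQuery', rel: 'wikipedia',
      label: 'XQuery on Wikipedia' },
    { href: 'http://www.w3.org/TR/xquery/', rel: 'authoritative',
      label: 'XQuery W3C Recommendation' }
]);
```

The output follows the microformat structure presented earlier (without the indentation), so it can be consumed by the same script and degrades in the same way.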

Producers and consumers could also be packaged as plug-ins for web publishing platforms. Such a plug-in would contain:

A producer to facilitate the production of nhrefs markup by the platform.
The JavaScript to display the links in the browser.

This is more or less my roadmap for this project. If you are interested, watch this space: http://nhrefs.org!

Balisage: The Markup Conference 2011
