I am trying to figure out how to make it possible for a user to highlight and annotate an HTML document, marking sections with notes and indicating on the screen which text has been so marked. Think of it as a very simple version of the annotations functionality provided on the Kindle... the user selects some text, types a note and clicks "save", then the document highlights that text as well as all the other text the user has selected previously. An AJAX call ensures the server is made aware of the selection.

The trivial implementation seems to be to insert span tags within the document, applying a class to each section the user selects. This means that the document will grow over time, though I doubt the difference will be significant (maybe 8-10 selections per page). This is not ideal because I do not want to have to store the entire document back; I just want to send and store the starting and ending character index (or something like that) for each selected area.

The next logical level of complexity would be to store a list of start/end indices and add the tags to the HTML using jQuery after the raw document is displayed. I can do this, but the logic and math are a bit wonky and I would like to avoid it if possible. As the user clicks to commit a selection, I just compute the indices as they would be if nothing else in the document were selected, send that using AJAX, and apply the appropriate span tags to the marked-up document.

There will not be overlapping selected areas, so that keeps the logic reasonable... it just feels like there should be a better way. Ideas?
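
The index-based approach described above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: `applyHighlights` and `escapeHtml` are hypothetical helper names, the input is a single run of plain text with no markup of its own, and each range is a non-overlapping `{start, end}` pair with `end` exclusive. A real HTML document would need the offsets mapped onto its text nodes first.

```javascript
// Hypothetical helper: escape plain text so it is safe to place inside HTML.
function escapeHtml(s) {
    return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical helper: wrap each stored range of `text` in a span.
// Assumes `text` is plain text and the ranges do not overlap.
function applyHighlights(text, ranges, className) {
    // Sort a copy by start offset so the caller's array is left untouched.
    var sorted = ranges.slice().sort(function (a, b) {
        return a.start - b.start;
    });
    var out = "";
    var cursor = 0;
    sorted.forEach(function (r) {
        // Copy the unmarked text before this range, then the marked text.
        out += escapeHtml(text.slice(cursor, r.start));
        out += '<span class="' + className + '">' +
            escapeHtml(text.slice(r.start, r.end)) + "</span>";
        cursor = r.end;
    });
    // Copy whatever follows the last range.
    return out + escapeHtml(text.slice(cursor));
}
```

Because the output is rebuilt from the raw text every time, the stored indices always refer to the unmarked document, so earlier highlights never shift the offsets of later ones.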

seawolf
personally spans sound like the best solution to me... keeping a separate list of char indices could go horribly wrong if the two separate datasets get out of synch... – Matt Coubrough May 30 '14 at 04:55

1 Answer

Working with selections can be very painful because the browser reports every selection position relative to the enclosing node rather than the document. For example:

<p>An example <span>illustrating</span> sel|ections</p>
   01234567890      012345678901       012345678901

If the pipe character after "sel" represents the caret, then the browser reports its position as index 4 in the " selections" text node (the leading space counts), rather than 26 or so characters into the containing paragraph or, perhaps even more usefully, as an index into the containing document.
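
To make the difference concrete, here is a minimal sketch of converting a node-relative position into an offset within a container's text. `absoluteOffset` is a hypothetical helper, not part of any library, and it assumes the target is a text node (a browser can also report an element as the container, which this sketch does not handle). It only reads `nodeType`, `nodeValue` and `childNodes`, so it works on real DOM nodes as well as plain objects of the same shape.

```javascript
// Hypothetical helper: convert a (text node, offset) pair into an offset
// into the concatenated text of `root`. Returns -1 if the node is not found.
function absoluteOffset(root, targetNode, offsetInNode) {
    var total = 0;
    var found = false;

    function walk(node) {
        if (found) { return; }
        if (node === targetNode) {
            // Reached the target: add the node-relative offset and stop.
            total += offsetInNode;
            found = true;
            return;
        }
        if (node.nodeType === 3) { // 3 is Node.TEXT_NODE
            // A text node that precedes the target contributes its full length.
            total += node.nodeValue.length;
            return;
        }
        // Visit children in document order.
        var children = node.childNodes || [];
        for (var i = 0; i < children.length && !found; i++) {
            walk(children[i]);
        }
    }

    walk(root);
    return found ? total : -1;
}
```

In a browser, it could be called with the paragraph element as `root` and the `anchorNode` and `anchorOffset` of `window.getSelection()` as the other two arguments.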

Since this is such a pain to deal with, I would recommend using Tim Down's Rangy library and using spans to highlight the content. Here's a post about a similar requirement: https://stackoverflow.com/a/5765574/3651800

In a nutshell, Rangy has a cross-browser CSS class applier module that wraps user selections in spans carrying a class of your choice:

var highlightApplier;

window.onload = function() {
    rangy.init();
    highlightApplier = rangy.createCssClassApplier("highlighted", true);
};

function applyHighlight() {
    highlightApplier.applyToSelection();
}

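If you later want to remove all highlights from your document and renormalize it, something like this should do the trick. Note that it unwraps every element with the given tag name under root, not only the highlight spans:

function unwrap(root, tagname) {
    // getElementsByTagName returns a live collection that shrinks as elements are removed
    var elements = root.getElementsByTagName(tagname);
    var i, el, parent;
    // work backwards to avoid complications with nested spans
    for (i = elements.length - 1; i >= 0; i--) {
        el = elements[i];
        parent = el.parentNode;
        // move the children out in front of the element, then drop the empty element
        while (el.firstChild) {
            parent.insertBefore(el.firstChild, el);
        }
        parent.removeChild(el);
        // merge the adjacent text nodes left behind
        parent.normalize();
    }
}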

And to call it:

unwrap(rootElement, "span");

Really hope that helps you.

Matt Coubrough