Hi Luca,
we are working on somewhat related issues in Parsoid [1][2]. On the way in, the modified HTML DOM is diffed against the original DOM, and each modified node is annotated with the base revision. We don't store this information yet; right now we only use it to selectively serialize modified parts of the page back to wikitext. We will, however, soon store the HTML along with the wikitext for each revision, which should make it possible to display a coarse blame map.
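To give a rough idea of what such a diff-and-annotate pass looks like, here is a minimal Python sketch. It is not Parsoid code: it uses the standard library's ElementTree as a stand-in for a real HTML DOM, compares children purely by position, and the function name `annotate_changes` and the attribute name `data-base-rev` are made up for illustration.

```python
import xml.etree.ElementTree as ET

def annotate_changes(old, new, base_rev):
    """Mark nodes in `new` that differ from `old`, comparing children by position.

    `data-base-rev` is a hypothetical attribute name used only in this sketch.
    """
    changed = (
        old is None
        or old.tag != new.tag
        or (old.text or "") != (new.text or "")
        or (old.tail or "") != (new.tail or "")
        or old.attrib != new.attrib
        or len(old) != len(new)  # a child was added or removed
    )
    if changed:
        new.set("data-base-rev", str(base_rev))
    old_children = list(old) if old is not None else []
    for i, child in enumerate(new):
        # Nodes without a counterpart at the same position count as new.
        counterpart = old_children[i] if i < len(old_children) else None
        annotate_changes(counterpart, child, base_rev)

old = ET.fromstring("<body><p>Hello</p><p>World</p></body>")
new = ET.fromstring("<body><p>Hello</p><p>Wiki</p></body>")
annotate_changes(old, new, 42)
print(ET.tostring(new, encoding="unicode"))
# <body><p>Hello</p><p data-base-rev="42">Wiki</p></body>
```

A positional comparison like this treats a moved paragraph as a modification of every node after it, so it has no notion of moves at all.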
There are several limitations:
* We don't preserve blame information on wikitext edits yet. This should become possible with the incremental re-parsing optimization which is on our roadmap for this summer.
* Our DOM diff algorithm is extremely simplistic. We are considering porting XyDiff for better move detection.
* The information is pretty coarse, since it is tracked per node. Refining it to the word level would require an efficient encoding, possibly as length/revision pairs associated with the wrapping element.
* We have not moved metadata from attributes to a metadata section with efficient encoding yet.
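To make the length/revision pair idea from the list above a bit more concrete, here is a minimal Python sketch of a run-length encoding of per-word blame. The function names and the revision ids are invented for the example, and how the pairs would actually be attached to the wrapping element is left open.

```python
from itertools import groupby

def encode_blame(word_revs):
    """Collapse a per-word list of revision ids into (length, revision) runs."""
    return [(len(list(group)), rev) for rev, group in groupby(word_revs)]

def decode_blame(pairs):
    """Expand (length, revision) runs back into one revision id per word."""
    return [rev for length, rev in pairs for _ in range(length)]

# Hypothetical six-word paragraph last touched by revisions 101 and 105.
word_revs = [101, 101, 101, 105, 105, 101]
pairs = encode_blame(word_revs)
print(pairs)  # [(3, 101), (2, 105), (1, 101)]
```

The encoding stays small as long as edits touch contiguous runs of words, which is the common case for prose.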
We don't currently plan to work on blame maps ourselves. Maybe there are opportunities for collaboration?
Gabriel
[1]: http://www.mediawiki.org/wiki/Parsoid
[2]: http://www.mediawiki.org/wiki/Parsoid/Roadmap