I'm about to implement a view similar to the one I've already posted, but instead of displaying ownership of raw text, it will display ownership of rendered text. The raw-text ownership extension already provides an association between raw wiki text and its owner. I am looking for advice on how to proceed. I have a couple of ideas.
The first is to modify the association as the parser does its thing, performing the same substitutions as the parser. The difficulty is that the parser would have to relay precisely what text it has replaced, and I am not sure how this would work. I have considered using offsets from the beginning of the text block, but this seems difficult, error-prone, and finicky.
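To make the offset idea concrete, here is a minimal sketch of the bookkeeping it would need. It assumes the parser could report each substitution as (start, end, length of the replacement text), which is exactly the part I don't know how to get. The names Span and apply_substitution are hypothetical and not part of MediaWiki or the ownership extension, and crediting the replacement text to the owner of the first replaced character is only one possible policy.

```python
from dataclasses import dataclass


@dataclass
class Span:
    start: int   # inclusive offset into the text
    end: int     # exclusive offset into the text
    owner: str


def apply_substitution(spans, start, end, new_len):
    """Adjust sorted, non-overlapping ownership spans after text[start:end]
    has been replaced by new_len characters.

    Assumption: the replacement text is credited to the owner of the first
    span it overlaps. A pure insertion that falls exactly on a span boundary
    overlaps nothing and is left unowned in this sketch.
    """
    delta = new_len - (end - start)
    out = []
    credited = False
    for s in spans:
        if s.end <= start:
            # Entirely before the replaced region: unchanged.
            out.append(s)
        elif s.start >= end:
            # Entirely after the replaced region: shift by the length change.
            out.append(Span(s.start + delta, s.end + delta, s.owner))
        else:
            # Overlaps the replaced region: keep the untouched edges.
            if s.start < start:
                out.append(Span(s.start, start, s.owner))
            if not credited:
                credited = True
                if new_len > 0:
                    out.append(Span(start, start + new_len, s.owner))
            if s.end > end:
                out.append(Span(end + delta, s.end + delta, s.owner))
    return out


# Raw text "'''bold''' text": alice wrote the bold part, bob wrote " text".
spans = [Span(0, 10, "alice"), Span(10, 15, "bob")]
spans = apply_substitution(spans, 0, 3, 3)    # "'''" -> "<b>"
spans = apply_substitution(spans, 7, 10, 4)   # "'''" -> "</b>"
```

Every substitution has to be reported in the coordinates of the text as it stands at that moment, and in order, which is where this gets finicky: one missed or misreported replacement shifts every span after it.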
The second is to construct, in addition to the association between raw wiki text and authorship, an association between parsed text and ownership. The word-level (or character-level, or whatever granularity) diff engine would then compare parsed texts, not raw texts, and a post-processor could mark up the parsed text (ignoring HTML tags) to show who authored each part. I am not at this time concerned with the performance of the diff engine.
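A rough sketch of this parsed-text approach, using Python's standard difflib as a stand-in for whatever diff engine would really be used. Everything here is an assumption for illustration: the function names, the regex tokenizer (a real post-processor would want a proper HTML parser), and the span/data-owner markup are all hypothetical.

```python
import html
import re
from difflib import SequenceMatcher

# Tokens are: an HTML tag, a run of whitespace, a run of visible text,
# or a stray "<" that does not open a tag.
TOKEN_RE = re.compile(r"<[^>]+>|\s+|[^\s<]+|<")


def tokenize(rendered):
    return TOKEN_RE.findall(rendered)


def is_markup(tok):
    # Tags and whitespace are skipped when assigning ownership.
    return tok.startswith("<") or tok.isspace()


def words(rendered):
    return [t for t in tokenize(rendered) if not is_markup(t)]


def update_owners(old_words, old_owners, new_words, author):
    """Carry ownership across one revision of the rendered text."""
    sm = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    new_owners = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            # Unchanged words keep their previous owner.
            new_owners.extend(old_owners[i1:i2])
        elif tag in ("replace", "insert"):
            # New or changed words belong to this revision's author.
            new_owners.extend([author] * (j2 - j1))
        # "delete": the words are gone, nothing to record.
    return new_owners


def annotate(rendered, owners):
    """Wrap each visible word in a span naming its owner; tags pass through."""
    remaining = iter(owners)
    out = []
    for tok in tokenize(rendered):
        if is_markup(tok):
            out.append(tok)
        else:
            owner = html.escape(next(remaining), quote=True)
            out.append(f'<span data-owner="{owner}">{tok}</span>')
    return "".join(out)


rev1 = "<p>Hello world</p>"               # rendered revision by alice
rev2 = "<p>Hello <b>brave</b> world</p>"  # rendered revision by bob
owners1 = ["alice"] * len(words(rev1))
owners2 = update_owners(words(rev1), owners1, words(rev2), "bob")
marked_up = annotate(rev2, owners2)
```

Because tags are dropped before diffing, a change that only alters markup (adding bold around an existing word, say) does not change ownership in this sketch. Whether that is the desired behaviour is an open question.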
Ideas are welcome.
wikitech-l@lists.wikimedia.org