> On 3/23/06, Ilmari Karonen <nospam(a)vyznev.net> wrote:
>>> And what happens if the next edit merges some content back in from
>>> the reverted text?
>>
>> This case falls under "not perfect but as close as can be". It's
>> essentially the same problem as someone pasting content from another
>> article, or from another source entirely. Even your diff-based scheme,
>> while nifty indeed, doesn't solve that. In general, nothing can.
>
> Well actually it does... because I proposed only classifying articles
> which are completely disconnected from the main sub-graph as
> non-contributors. The revert+remerge will either end up in the
> entropy flow shortest path (if the removed text is smaller than the
> preserved text), or as a little stub hanging off the main history flow
> pathway should the diff to the reverted version be smaller.

Sorry, I meant that your solution _only_ handles the case where the
pasted-in content comes from a previously reverted version of the same
article. It doesn't handle the (probably even more common) similar
cases where the content comes from another article or from an off-wiki
source.

So both of our methods will miss some contributors. Yours will miss
slightly fewer than mine, at the cost of significantly more processing.
But neither is perfect, and no automatic method _can_ be perfect in this
regard, unless perhaps we were to somehow extend your entropy flow
analysis to the whole of human expression, including spoken word
and other ephemera.

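
To make the trade-off concrete, here is a rough sketch of how a
diff-based attribution of this general kind might look. It is not the
scheme as actually proposed in this thread: as a stand-in it links each
revision to whichever earlier revision it differs from least, then
walks back from the latest revision. The names diff_size,
nearest_ancestors and main_path are hypothetical, and counting changed
words via difflib is only an assumed measure of diff size.

```python
import difflib


def diff_size(a, b):
    """Count words inserted or deleted when turning text a into text b."""
    aw, bw = a.split(), b.split()
    sm = difflib.SequenceMatcher(None, aw, bw, autojunk=False)
    size = 0
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag != "equal":
            size += (i2 - i1) + (j2 - j1)
    return size


def nearest_ancestors(revisions):
    """For each revision, pick the earlier revision with the smallest diff.

    Ties go to the more recent candidate. The first revision has no parent.
    """
    parents = [None]
    for i in range(1, len(revisions)):
        best = min(
            range(i),
            key=lambda j: (diff_size(revisions[j], revisions[i]), -j),
        )
        parents.append(best)
    return parents


def main_path(parents):
    """Follow parent links from the latest revision back to the first."""
    path = []
    node = len(parents) - 1
    while node is not None:
        path.append(node)
        node = parents[node]
    return path[::-1]


revisions = [
    "the cat sat",                   # 0: initial text
    "the cat sat on the mat",        # 1: real contribution
    "BUY CHEAP PILLS",               # 2: vandalism
    "the cat sat on the mat",        # 3: revert to revision 1
    "the cat sat on the mat today",  # 4: real contribution
]

parents = nearest_ancestors(revisions)
path = main_path(parents)
stubs = sorted(set(range(len(revisions))) - set(path))
print(path)   # [0, 1, 3, 4]
print(stubs)  # [2]
```

The vandal edit ends up as a stub off the main pathway, while the
revert and later edits stay on it. Text pasted in from another article
or from an off-wiki source would simply show up as new words in the
revision that added it, which is exactly the case neither method can
detect.
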
"Who first came up with this?" is a Hard Problem(TM).
--
Ilmari Karonen