[Foundation-l] What's appropriate attribution?

Nikola Smolenski smolensk at eunet.yu
Fri Oct 24 21:57:53 UTC 2008


On Friday 24 October 2008 20:44:31 Gregory Maxwell wrote:
> On Fri, Oct 24, 2008 at 1:18 PM, Nikola Smolenski <smolensk at eunet.yu> wrote:
> > On Friday 24 October 2008 01:19:20 phoebe ayers wrote:
> >> Pity the person who wants to reprint [[George W. Bush]] from en:wp...
> >> it has 13228 authors (6366 IP addresses!) Sure, most of them are
> >> vandalism, but I haven't seen any tool to pull out significant
> >> revisions. Does anyone know of such a tool or script?
> >
> > On Wikitech-l we just had a thread, "WikiTrust and authorship", that
> > discussed how such a tool could be made. It is doable.
>
> For copyright attribution purposes?  Show me.
>
> Most greedy "auto-attributing" code I've seen has a tendency to
> incorrectly attribute text in cases of simple re-ordering.  It's

That isn't the biggest of our concerns: it is acceptable to have an 
occasional false positive (a person who didn't make significant edits is 
listed among the authors) rather than a false negative (a person who did make 
significant edits is not listed among the authors).

A suggestion by Tei is simple and promising: make a list of all the words in 
each version, sort each list alphabetically, and diff the two lists. The 
number of changed lines is the number of changed words. Because sorting 
discards word order, text that was merely re-ordered produces no changed 
lines. Edits that changed only a few words are not significant for our 
purpose.
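
To make the idea concrete, here is a rough sketch in Python. It is only an 
illustration, not tested against real revisions. Instead of literally sorting 
and running diff, it compares word multisets, which gives the same count as a 
minimal diff of the two sorted lists. The names word_counts, changed_words and 
is_significant, the lower-casing, the \w+ definition of a "word" and the 
threshold of 10 are all my assumptions, not part of Tei's suggestion.

```python
import re
from collections import Counter


def word_counts(text):
    """Return the multiset of words in one revision's text."""
    # Assumption: a word is a run of letters/digits, compared case-insensitively.
    return Counter(re.findall(r"\w+", text.lower()))


def changed_words(old_text, new_text):
    """Count words removed plus words added between two revisions."""
    old, new = word_counts(old_text), word_counts(new_text)
    removed = old - new  # words that occur more often in the old revision
    added = new - old    # words that occur more often in the new revision
    return sum(removed.values()) + sum(added.values())


def is_significant(old_text, new_text, threshold=10):
    """Treat an edit as significant if it changed at least `threshold` words."""
    # threshold=10 is a hypothetical cut-off chosen only for illustration.
    return changed_words(old_text, new_text) >= threshold
```

With this, changed_words("the quick brown fox", "fox brown quick the") is 0, 
so a pure re-ordering is not counted as an edit, while replacing one word with 
another counts as 2 (one line removed, one added).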

> be used for that purpose. (Also, consider the case where half of an
> article is copy and paste moved from another article.)  Not that it

Even that could mostly be detected, though doing so would use a lot of 
resources. Fortunately, it happens relatively rarely.


