[Foundation-l] What's appropriate attribution?
Nikola Smolenski
smolensk at eunet.yu
Fri Oct 24 21:57:53 UTC 2008
On Friday 24 October 2008 20:44:31 Gregory Maxwell wrote:
> On Fri, Oct 24, 2008 at 1:18 PM, Nikola Smolenski <smolensk at eunet.yu> wrote:
> > On Friday 24 October 2008 01:19:20 phoebe ayers wrote:
> >> Pity the person who wants to reprint [[George W. Bush]] from en:wp...
> >> it has 13228 authors (6366 IP addresses!) Sure, most of them are
> >> vandalism, but I haven't seen any tool to pull out significant
> >> revisions. Does anyone know of such a tool or script?
> >
> > On Wikitech-l we just had a thread, "WikiTrust and authorship", that
> > discussed how such a tool could be made. It is doable.
>
> For copyright attribution purposes? Show me.
>
> Most greedy "auto-attributing" code I've seen has a tendency to
> incorrectly attribute text in cases of simple re-ordering. It's
That isn't the biggest of our concerns: it is acceptable to have an
occasional false positive (a person who didn't make significant edits is
listed among the authors) rather than a false negative (a person who did make
significant edits is not listed among the authors).
A suggestion by Tei is simple and promising: make a list of all the words in
each version, sort it alphabetically, and diff the two lists. The number of
changed lines is the number of changed words. Edits that changed only a few
words are not significant for our purpose.
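
A minimal sketch of how Tei's suggestion might look, under some assumptions of
mine. Diffing two alphabetically sorted word lists amounts to comparing the
two versions as bags of words, so the sketch uses Python's collections.Counter
instead of running an actual diff tool. The function names, the lowercasing of
words, the (author, text) input format, and the cutoff of 5 changed words are
all hypothetical; none of them come from the Wikitech-l thread.

```python
import re
from collections import Counter

# Hypothetical cutoff: edits changing fewer words than this are ignored.
MIN_CHANGED_WORDS = 5


def word_counts(text):
    # Bag of words for one revision. Lowercasing is an assumption.
    return Counter(re.findall(r"\w+", text.lower()))


def changed_words(old_text, new_text):
    # Equivalent to counting +/- lines in a diff of the two sorted word lists:
    # words only in the old version plus words only in the new one.
    old, new = word_counts(old_text), word_counts(new_text)
    removed = old - new
    added = new - old
    return sum(removed.values()) + sum(added.values())


def significant_authors(revisions, threshold=MIN_CHANGED_WORDS):
    # revisions: list of (author, text) pairs in chronological order.
    authors = []
    previous = ""
    for author, text in revisions:
        if changed_words(previous, text) >= threshold and author not in authors:
            authors.append(author)
        previous = text
    return authors
```

Because word order is discarded, a pure re-ordering counts as zero changed
words, so it would not be attributed to whoever moved the text. Replacing one
word with another counts as two changes (one removed, one added), just as it
would show up as two changed lines in a diff.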
> be used for that purpose. (Also, consider the case where half of an
> article is copy and paste moved from another article.) Not that it
And even that could be mostly identifiable, though it would use a lot of
resources. Fortunately, it happens relatively rarely.