[Foundation-l] RfC: License update proposal

Brian Brian.Mingus at colorado.edu
Wed Feb 4 08:47:36 UTC 2009


Here is a WikiBlame tool that serves as a demo:
http://wikipedia.ramselehof.de/wikiblame.php
I've come up with an algorithm to speed up the search when you don't know
the article title (a case this demo doesn't handle), but there is no getting
around the need for a monster index.

The easiest way to do this is to make the LuceneSearch extension grok the
full history dump, and then layer the search algorithm on top of standard
Lucene search.
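
For the per-article case that the demo covers, the basic idea can be sketched
in a few lines of Python. This is only an illustration, not WikiBlame's actual
code: the function name and the (author, text) revision shape are made up for
the example. It also assumes the phrase stays in the article once it has been
added, which is what makes a binary search over the history valid. If the
phrase was removed and re-added at some point, a linear scan is needed instead.

```python
from typing import List, Optional, Tuple

# Hypothetical revision shape: (author, full wikitext of that revision).
Revision = Tuple[str, str]


def first_revision_with(revisions: List[Revision], phrase: str) -> Optional[int]:
    """Return the index of the earliest revision containing `phrase`, or None.

    Assumes `revisions` is ordered oldest-first and that the phrase, once
    added, is never removed, so "contains phrase" is monotonic over history.
    """
    if not revisions or phrase not in revisions[-1][1]:
        return None
    lo, hi = 0, len(revisions) - 1
    # Invariant: revisions[hi] contains the phrase.
    while lo < hi:
        mid = (lo + hi) // 2
        if phrase in revisions[mid][1]:
            hi = mid        # phrase already present, look earlier
        else:
            lo = mid + 1    # phrase not yet present, look later
    return lo


# Made-up history for illustration only.
history = [
    ("Alice", "France is a country."),
    ("Bob", "France is a country in Europe."),
    ("Carol", "France is a country in Western Europe."),
    ("Dave", "France is a country in Western Europe. Its capital is Paris."),
]
idx = first_revision_with(history, "Western Europe")
print(history[idx][0])  # prints "Carol"
```

This needs only about log2(n) revision fetches per phrase, but it still
requires knowing which article's history to search, which is why the
title-less case needs an index over the full history.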

On Sun, Feb 1, 2009 at 11:07 PM, phoebe ayers <phoebe.wiki at gmail.com> wrote:

> On Wed, Jan 21, 2009 at 4:36 PM, Thomas Dalton <thomas.dalton at gmail.com>
> wrote:
> > 2009/1/22 Erik Moeller <erik at wikimedia.org>:
> >> Because I don't think it's good to discuss attribution as an abstract
> >> principle, just as an example, the author attribution for the article
> >> [[France]] is below, excluding IP addresses. According to the view
> >> that attribution needs to be given to each pseudonym, this entire
> >> history would have to be included with every copy of the article.
> >> Needless to say, in a print product, this would occupy a very
> >> significant amount of space. Needless to say, equally, it's a
> >> significant obligation for a re-user. And, of course, Wikipedia keeps
> >> growing and so do its attribution records.
> >
> > Well, the attribution list is about 1/6 the length of the article (in
> > terms of bytes). Given that it can be in significantly smaller font
> > size, doesn't have lots of whitespace and has no images, it's going to
> > take up far less than 1/6 as much space on the page. It will be a
> > significant amount of space, but not an impractical one (to the extent
> > that copying and pasting into Word gives meaningful results, the
> > article takes up 35 pages, the attribution list takes up 2).
>
>
> Which is fine if you're reprinting the whole article, but what if
> you're just reprinting the lede, or some other section of an article?
> Should a reuser still be required to reprint 2 pages of credits for a
> paragraph of article? That seems onerous. Note that just reprinting a
> *section* of an article is how many print reuse cases have worked to
> date (the German encyclopedia and our CafePress bumper stickers come to
> mind), and this case is not something that we've discussed much so
> far.
>
> And having just actually done this, with a real book and a real
> publisher, in "How Wikipedia Works," I can attest that it's a
> non-trivial amount of work to get author lists for articles --
> removing duplication, IPs, formatting, etc. is all a good deal of work
> -- and I like to think I understand how histories work. It would be a
> much bigger task for someone who didn't understand histories or the
> license.
>
> The Wikiblame tool, if it were made widely accessible and prominently
> integrated into the site, seems like a promising solution. In the
> meantime, I think we ought to consider what "proper credit" is for
> just reusing a part of an article, versus the whole thing.
>
> -- phoebe
>


