Re: [Foundation-l] RfC: License update proposal

4 Feb 2009

Here is a WikiBlame tool that serves as a demo:
http://wikipedia.ramselehof.de/wikiblame.php
I've come up with an algorithm to speed up the search when you don't know
the article title (a case this tool doesn't handle), but you can't get
around needing a monster index.

The easiest way to do this is to make the LuceneSearch extension grok the
full history dump and then layer the search algorithm on top of it based on
standard Lucene search.
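
To make the "monster index" point concrete, here is a rough sketch in Python. It is not WikiBlame's code, not the LuceneSearch extension, and not the algorithm I mentioned above; it is only a toy inverted index over every revision of every article, which is the kind of structure you need once the title is unknown. All names (build_index, first_appearance, the tuple layout) are made up for illustration, and it assumes revision ids increase over time.

```python
import re
from collections import defaultdict


def tokens(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def build_index(revisions):
    """Index every revision. `revisions` yields (title, rev_id, text) tuples.

    Returns (index, texts): index maps a term to the set of (title, rev_id)
    keys containing it; texts maps each key back to the revision text.
    """
    index = defaultdict(set)
    texts = {}
    for title, rev_id, text in revisions:
        key = (title, rev_id)
        texts[key] = text
        for term in set(tokens(text)):
            index[term].add(key)
    return index, texts


def first_appearance(index, texts, phrase):
    """Return the (title, rev_id) of the oldest revision containing phrase."""
    terms = tokens(phrase)
    if not terms:
        return None
    # Revisions that contain every term, in any order.
    candidates = set.intersection(*(index.get(t, set()) for t in terms))
    # Keep only revisions where the terms appear as a contiguous phrase.
    needle = " " + " ".join(terms) + " "
    hits = [
        key for key in candidates
        if needle in " " + " ".join(tokens(texts[key])) + " "
    ]
    # Assumption: a smaller rev_id means an older revision.
    return min(hits, key=lambda key: key[1]) if hits else None


# Hypothetical sample data, not taken from any real article history.
sample = [
    ("France", 1, "France is a country."),
    ("France", 2, "France is a country known for wine and cheese."),
    ("France", 3, "France is a country in Europe known for wine and cheese."),
    ("Italy", 5, "Italy is known for cheese and wine."),
]
index, texts = build_index(sample)
result = first_appearance(index, texts, "wine and cheese")
```

The index grows with the full text of every revision, which is why the size is the real problem; a Lucene index over the history dump would play the role of `index` here.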

On Sun, Feb 1, 2009 at 11:07 PM, phoebe ayers <phoebe.wiki(a)gmail.com> wrote:

...
  On Wed, Jan 21, 2009 at 4:36 PM, Thomas Dalton <thomas.dalton(a)gmail.com> wrote:
  2009/1/22 Erik Moeller <erik(a)wikimedia.org>:
  Because I don't think it's good to discuss attribution as an abstract
 principle, just as an example, the author attribution for the article
 [[France]] is below, excluding IP addresses. According to the view
 that attribution needs to be given to each pseudonym, this entire
 history would have to be included with every copy of the article.
 Needless to say, in a print product, this would occupy a very
 significant amount of space. Needless to say, equally, it's a
 significant obligation for a re-user. And, of course, Wikipedia keeps
 growing and so do its attribution records.

 Well, the attribution list is about 1/6 the length of the article (in
 terms of bytes). Given that it can be in significantly smaller font
 size, doesn't have lots of whitespace and has no images, it's going to
 take up far less than 1/6 as much space on the page. It will be a
 significant amount of space, but not an impractical one (to the extent
 that copying and pasting into Word gives meaningful results, the
 article takes up 35 pages, the attribution list takes up 2). 

 Which is fine if you're reprinting the whole article, but what if
 you're just reprinting the lede, or some other section of an article?
 Should a reuser still be required to reprint 2 pages of credits for a
 paragraph of article? That seems onerous. Note that just reprinting a
 *section* of an article is how many print reuse cases have worked to
 date (the German encyclopedia and our CafePress bumper stickers come to
 mind), and this case is not something that we've discussed much so
 far.

 And having just actually done this, with a real book and a real
 publisher, in "How Wikipedia Works," I can attest that it's a
 non-trivial amount of work to get author lists for articles --
 removing duplication, IPs, formatting, etc. is all a good deal of work
 -- and I like to think I understand how histories work. It would be a
 much bigger task for someone who didn't understand histories or the
 license.

 The Wikiblame tool, if it were made widely accessible and prominently
 integrated into the site, seems like a promising solution. In the
 meantime, I think we ought to consider what "proper credit" is for
 just reusing a part of an article, versus the whole thing.

 -- phoebe

 _______________________________________________
 foundation-l mailing list
 foundation-l(a)lists.wikimedia.org
 Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
