Robert Michel wrote:
PS: Search engines like Yahoo could be good partners if they supported our using their servers more than 1000 times a day, for example with a script which checks every new entry to see whether it is just a copy&paste from a webpage or news posting. Do you think it would be realistic for the Wikimedia Foundation to request such support? E.g., Yahoo should have an interest of its own in our not using copyright-protected material ;)
Sorry for the 'me too', but...
I think this is a great idea. :)
As the volume of edits increases, being able to check new contributions for copyright violations becomes more important: even if only a very small percentage of edits contain copyvios, that still adds up to a large absolute amount of text given enough time.
Such a thing could poison, and potentially doom, any chance for a print version of any of our content. Fixing a copyvio on the wiki after being notified is easy; fixing a copyvio printed in 50,000 copies would be impossible.
A bot checking each new article and each major diff for copyvios would be very helpful. It could be written so that anything from websites known to host public domain or freely licensed content is excluded from the suspected-copyvio list. Humans would still have to go through that list and decide whether the suspected copyvios are in fact copyvios. Feedback from that review could be used to further refine the bot.
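Roughly, the core check could look like the following (a minimal sketch in Python; search_snippet(), the domain list, and the sentence-picking heuristic are all placeholders for whatever we would actually use):

import random
import re

# Sites known to host public domain or freely licensed text;
# hits on these domains are not treated as suspected copyvios.
FREE_SOURCES = {
    "gutenberg.org",
    "wikipedia.org",
    "wikisource.org",
}

def search_snippet(query):
    """Placeholder for whatever search API (Yahoo, Google, ...)
    the bot would actually call; should return a list of URLs
    whose pages contain the exact phrase `query`."""
    raise NotImplementedError

def pick_probe_sentence(text, min_words=8):
    """Pick one reasonably long sentence from the new text:
    a single distinctive phrase is usually enough to find a
    page it was copied from."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    candidates = [s for s in sentences if len(s.split()) >= min_words]
    return random.choice(candidates) if candidates else None

def domain(url):
    # Crude: assumes URLs of the form http://host/path
    host = url.split("/")[2]
    return host[4:] if host.startswith("www.") else host

def check_edit(new_text):
    """Return a list of suspect URLs for human review,
    or [] if the edit looks clean."""
    probe = pick_probe_sentence(new_text)
    if probe is None:
        return []
    hits = search_snippet('"%s"' % probe)
    return [url for url in hits if domain(url) not in FREE_SOURCES]

Everything check_edit() returns would go onto the suspected-copyvio list for a human to judge, and the heuristics can be tuned from that feedback.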
But yes, that would require more than 1000 searches a day from a single IP address. We could get around that by coordinating copyvio-bot use among several users from each wiki, but IMO a better way would be to make the bot part of MediaWiki and run the service on the Wikimedia servers. That would require special permission to exceed the 1000-searches-a-day limit.
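However the permission works out, the server-side service would need to track its daily budget. A minimal sketch, assuming we end up with several keys or accounts, each allowed 1000 queries per day (the key mechanism is an assumption; it depends on what the search engine actually grants):

import datetime

class QuotaPool:
    """Rotate through several API keys, charging each one at
    most `per_key_limit` queries per calendar day."""

    def __init__(self, keys, per_key_limit=1000):
        self.keys = list(keys)
        self.limit = per_key_limit
        self.used = {}          # (key, date) -> query count

    def acquire(self):
        """Return a key with remaining quota, or None if the
        whole pool is exhausted for today."""
        today = datetime.date.today()
        for key in self.keys:
            count = self.used.get((key, today), 0)
            if count < self.limit:
                self.used[(key, today)] = count + 1
                return key
        return None

If acquire() returns None, the edit simply goes back on the queue to be checked the next day instead of being skipped entirely.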
-- Daniel Mayer (aka mav)
Daniel Mayer wrote:
Robert Michel wrote:
PS: Search engines like Yahoo could be good partners if they supported our using their servers more than 1000 times a day, for example with a script which checks every new entry to see whether it is just a copy&paste from a webpage or news posting.
But yes, that would require more than 1000 searches a day from a single IP address.
As already mentioned on the German list, Google offers an increased query limit for its free service. I wrote an email months ago and their answer was that they need a description of the project/program to decide. I'm sure that they would donate additional queries to Wikipedia.
We could get around that by coordinating copyvio-bot use among several users from each wiki, but IMO a better way would be to make the bot part of MediaWiki and run the service on the Wikimedia servers. That would require special permission to exceed the 1000-searches-a-day limit.
I agree that such a bot would preferably be part of the Wikimedia software, but I don't think that a service like Google would give us enough queries per day to check every edit. A kind of distributed copyright check is under discussion for de:.
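For the distributed variant, the main thing to get right is that no two volunteers burn quota checking the same edit. One way to split the load (purely illustrative; the names and the de: setup here are assumptions, not what is actually being discussed):

import hashlib

def assigned_volunteer(page_title, rev_id, volunteers):
    """Map each edit to exactly one volunteer by hashing its
    identity, so the daily query load is split evenly and
    deterministically."""
    key = ("%s:%d" % (page_title, rev_id)).encode("utf-8")
    index = int(hashlib.md5(key).hexdigest(), 16) % len(volunteers)
    return volunteers[index]

# Example: with three volunteers, each checks roughly a third
# of the new edits and stays under their own daily query limit.
volunteers = ["alice", "bob", "carol"]
print(assigned_volunteer("Beispielartikel", 123456, volunteers))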
Regards, Nils.