[WikiEN-l] Copyright Violation Bot

geni geniice at gmail.com
Thu Nov 23 15:59:22 UTC 2006


On 11/23/06, Chris Picone <ccool2ax at gmail.com> wrote:
> I've been seeing a rise in in-article copyvios. Last night I got one
> in [[Content managment system]]. I know that only some paragraphs have
> these copyvios, and not entire articles, so complete rewrites aren't
> necessary. Thus, I'm attempting to write a script that
> (a) opens tabs with "Special:Random" on them
> (b) selects the first sentence from each paragraph (line break)
> (c) Google the sentence
> (d) If there are any exact matches not from en.wikipedia.org, put up a
> little message for me to check and remove the copyvio.
> (e) repeat.
>
> Problem is, all I know is AppleScript. If any of you Perl or
> pywikipedia or AWB-types have another way of writing this, can someone
> write it so the general community can use it to remove copyvios? (or
> is this possible with AWB?)
>
> Chris (Ccool2ax)

Your real killer is that Google limits the number of searches per
IP. The other issue is that the first sentence tends to be the one
most likely to be altered. Not saying it's impossible, just rather
hard to do.
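
For illustration only, here is a rough Python sketch of the loop
described above with both caveats taken into account. Everything in it
is an assumption rather than an existing tool: `search` is a
hypothetical callable that takes a quoted query and returns result
URLs (no real Google API is called), the 5-second delay is an arbitrary
guess at a polite rate, and picking the longest sentence of each
paragraph is just one way of avoiding the often-altered first sentence.

```python
import re
import time
from typing import Callable, Dict, Iterable, List
from urllib.parse import urlparse

WIKIPEDIA_HOST = "en.wikipedia.org"


def candidate_sentences(text: str, min_words: int = 8) -> List[str]:
    """Pick one sentence per paragraph (line break): the longest one
    with at least min_words words, rather than always the first."""
    picks = []
    for paragraph in text.split("\n"):
        # Naive split on ., ! or ? followed by whitespace.
        sentences = [s.strip()
                     for s in re.split(r"(?<=[.!?])\s+", paragraph)
                     if s.strip()]
        long_enough = [s for s in sentences if len(s.split()) >= min_words]
        if long_enough:
            picks.append(max(long_enough, key=len))
    return picks


def external_hits(urls: Iterable[str]) -> List[str]:
    """Keep only result URLs whose host is not en.wikipedia.org."""
    return [u for u in urls if urlparse(u).hostname != WIKIPEDIA_HOST]


def check_text(text: str,
               search: Callable[[str], Iterable[str]],
               delay_seconds: float = 5.0,
               sleep: Callable[[float], None] = time.sleep
               ) -> Dict[str, List[str]]:
    """Search each candidate sentence as an exact phrase and return the
    ones with non-Wikipedia matches, pausing between queries."""
    flagged = {}
    for i, sentence in enumerate(candidate_sentences(text)):
        if i:
            sleep(delay_seconds)  # throttle to stay under per-IP limits
        hits = external_hits(search('"' + sentence + '"'))
        if hits:
            flagged[sentence] = hits  # left for a human to review
    return flagged
```

Anything it flags would still need a human check, since mirrors of
Wikipedia will show up as "external" matches too.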

-- 
geni


