[WikiEN-l] Copyright Violation Bot

Andrew Gray shimgray at gmail.com
Sat Dec 23 12:42:07 UTC 2006


On 22/12/06, Neil Harris <usenet at tonal.clara.co.uk> wrote:
> geni wrote:
>
> [snip]
> > Certain searches of existing content would be useful, the most obvious
> > being running a copy of the database against a copy of Britannica.
> >
>
> And other databases of copyrighted texts, such as InfoTrac
> (http://www.gale.com/onefile/) or similar, and things like Google Book
> Search.

I wonder how practical it would be to run a comparison search of a
Wikipedia dump against a Britannica/Encarta/etc. CD? It could be done
offline, and at leisure, to catch past offenders. I suspect the worst
cases are the ones we don't catch, which then sit ignored and
unrewritten for months. I know of at least one case where we were
getting copy-pasted articles from the OxDNB...
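
For what it's worth, here is a rough sketch of what such an offline
comparison might look like. It is only an illustration, not a tested
tool: it assumes both the dump and the reference work have already been
extracted to plain text and loaded as title -> text dictionaries, and
the function names, the 8-word window and the 0.3 threshold are all
made up for the example. The idea is to index every run of n
consecutive words ("shingles") in the reference text, then flag any
article that shares a large fraction of its own shingles with a single
reference entry.

```python
import re


def shingles(text, n=8):
    """Return the set of n-word runs in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_index(reference, n=8):
    """Map each shingle to the titles of the reference entries containing it."""
    index = {}
    for title, text in reference.items():
        for s in shingles(text, n):
            index.setdefault(s, set()).add(title)
    return index


def find_suspects(articles, index, n=8, threshold=0.3):
    """Yield (article title, reference title, overlap fraction) for likely copies.

    The overlap fraction is the share of the article's own shingles that
    also occur in the given reference entry.
    """
    for title, text in articles.items():
        own = shingles(text, n)
        if not own:
            continue  # article shorter than n words, nothing to compare
        hits = {}
        for s in own:
            for ref in index.get(s, ()):
                hits[ref] = hits.get(ref, 0) + 1
        for ref, count in hits.items():
            fraction = count / len(own)
            if fraction >= threshold:
                yield title, ref, fraction
```

Anything this flags would still need a human to look at it, since
properly attributed quotations and public-domain text would match just
as readily as a copy-paste job. A real run over a full dump would also
want the index on disk rather than in memory.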

-- 
- Andrew Gray
  andrew.gray at dunelm.org.uk


