On 22/12/06, Neil Harris <usenet@tonal.clara.co.uk> wrote:
geni wrote:
[snip]
Certain searches of existing content would be useful, the most obvious being running a copy of the database against a copy of Britannica.
And other databases of copyrighted texts, such as InfoTrac (http://www.gale.com/onefile/) or similar, and things like Google Book Search.
I wonder how practical it would be to find some way of running a comparison search of a Wikipedia dump against a Britannica/Encarta/etc CD? It could be done offline, and at leisure, for catching out past offenders - I suspect the worst cases are the ones we don't catch, which sit ignored and unrewritten for months. I know of at least one case where we were getting copy-pasted articles from the OxDNB...
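
To make the offline comparison idea a bit more concrete, here is a rough sketch in Python. It assumes both the Wikipedia dump and the reference work have already been extracted to plain text (getting text off a Britannica or Encarta CD is a separate problem not covered here). The function names, the 5-word shingle size and the 0.5 threshold are all made up for illustration and would need tuning; a real run over a full dump would also want an index rather than comparing every pair.

```python
import re


def shingles(text, n=5):
    """Return the set of lowercase word n-grams (shingles) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(candidate, reference, n=5):
    """Fraction of the candidate's shingles that also occur in the reference."""
    cand = shingles(candidate, n)
    if not cand:
        return 0.0  # text shorter than n words: nothing to compare
    return len(cand & shingles(reference, n)) / len(cand)


def flag_suspects(articles, references, n=5, threshold=0.5):
    """Return (article title, reference title, score) for suspicious pairs.

    articles and references are hypothetical dicts mapping title -> plain text.
    Pairs scoring at or above threshold are returned, highest score first.
    """
    # Shingle each reference entry once, not once per article.
    ref_shingles = {title: shingles(text, n) for title, text in references.items()}
    hits = []
    for a_title, a_text in articles.items():
        cand = shingles(a_text, n)
        if not cand:
            continue
        for r_title, ref in ref_shingles.items():
            score = len(cand & ref) / len(cand)
            if score >= threshold:
                hits.append((a_title, r_title, score))
    return sorted(hits, key=lambda h: h[2], reverse=True)
```

A high score only means a lot of shared word sequences, so anything flagged would still need a human to look at it: properly attributed quotations and public-domain source text would show up too.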