Most of the copyvios aren't from other encyclopedias.
Also, I get a copyvio 1 in 10 times... that is way, way too many. I'd have hundreds of edits if I ran this for hours a day.
People don't realize how serious a problem this is... I've been noticing it for months, and I want to give Earle Martin a million barnstars for writing this (it inspired me to start learning Perl so I can write an automated, newpages-based one... hint hint).
--Chris
On 12/23/06, Andrew Gray <shimgray@gmail.com> wrote:
On 22/12/06, Neil Harris <usenet@tonal.clara.co.uk> wrote:
geni wrote:
[snip]
Certain searches of existing content would be useful, the most obvious being running a copy of the database against a copy of Britannica.
And other databases of copyrighted texts, such as InfoTrac (http://www.gale.com/onefile/) or similar, and things like Google Book Search.
I wonder how practical it would be to find some way of running a comparison search of a Wikipedia dump against a Britannica/Encarta/etc CD? It could be done offline, and at leisure, for catching out past offenders - I suspect the worst cases are those we don't catch, and sit ignored and unrewritten for months. I know of at least one case where we were getting copy-pasted articles from the OxDNB...
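A rough sketch of how such an offline comparison might work, assuming both the Wikipedia dump and the reference text have already been extracted to plain text and loaded as title-to-text dictionaries. The function names, the 8-word shingle size and the 0.2 threshold are all illustrative assumptions, not anything an existing tool uses. It indexes every run of 8 consecutive words in the reference corpus, then reports articles that share a large fraction of their own runs with a reference entry.

```python
import re


def shingles(text, n=8):
    """Return the set of n-word runs in text, lowercased, punctuation dropped."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_index(reference_docs, n=8):
    """Map each shingle to the set of reference titles that contain it."""
    index = {}
    for title, text in reference_docs.items():
        for s in shingles(text, n):
            index.setdefault(s, set()).add(title)
    return index


def find_overlaps(articles, index, n=8, threshold=0.2):
    """Return (article, reference, fraction) for suspicious matches.

    fraction is the share of the article's shingles that also occur in
    the reference entry; threshold is an arbitrary cut-off (assumption).
    """
    results = []
    for title, text in articles.items():
        art = shingles(text, n)
        if not art:
            continue  # article shorter than n words, nothing to compare
        counts = {}
        for s in art:
            for ref in index.get(s, ()):
                counts[ref] = counts.get(ref, 0) + 1
        for ref, hits in counts.items():
            fraction = hits / len(art)
            if fraction >= threshold:
                results.append((title, ref, fraction))
    # Most heavily copied articles first
    return sorted(results, key=lambda r: -r[2])


if __name__ == "__main__":
    # Toy data standing in for a reference CD and a Wikipedia dump
    reference = {
        "ref entry": "the quick brown fox jumps over the lazy dog "
                     "near the old river bank today",
    }
    wiki = {
        "copied": "The quick brown fox jumps over the lazy dog "
                  "near the old river bank today.",
        "original": "completely unrelated sentence about something else "
                    "entirely written by a volunteer editor here",
    }
    for article, ref, fraction in find_overlaps(wiki, build_index(reference)):
        print(article, ref, round(fraction, 2))
```

Exact word runs only catch straight copy-paste; lightly reworded text would slip through, so anything this flags would still need a human to check it.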
--
- Andrew Gray andrew.gray@dunelm.org.uk
WikiEN-l mailing list WikiEN-l@Wikipedia.org To unsubscribe from this mailing list, visit: http://mail.wikipedia.org/mailman/listinfo/wikien-l