On Wed, Nov 13, 2013 at 12:39 PM, Chris McKenna <cmckenna(a)sucs.org> wrote:
The problem isn't that we're waiting for perfection. We're waiting for the
proportion of false positives and false negatives to fall to a level where
they don't overwhelm the true positives.
To avoid false positives from mirrors, the best option is to compare a text
as soon as it is saved, before any mirror has had time to copy it. You can
also exclude certain websites from the comparison because you know they are
mirrors, exclude rollbacks, ...
Even then, it is better to have a human check that it really is a copyvio (it
could well be a public domain text, or text from another Wikipedia article).
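
A rough sketch of that filtering in Python, just to make the idea concrete. Everything here is an assumption for illustration: the mirror list, the `Edit` and `Match` types, the `overlap` score (which would come from some separate text-comparison step) and the 0.8 threshold are all hypothetical. The result is only a list of candidates for a human to look at, not a verdict.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical list of known mirror domains (assumption, not a real list).
KNOWN_MIRRORS = {"mirror.example.org", "wiki-copy.example.com"}


@dataclass
class Edit:
    added_text: str    # text added by the edit, checked right after saving
    is_rollback: bool  # True if the edit only restores an earlier revision


@dataclass
class Match:
    url: str           # page where similar text was found
    overlap: float     # fraction of the added text found there (hypothetical score)


def is_mirror(url, mirrors=KNOWN_MIRRORS):
    """Return True if the URL's host is a known mirror or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == m or host.endswith("." + m) for m in mirrors)


def candidates_for_review(edit, matches, threshold=0.8):
    """Return the matches a human should check; never decides copyvio itself."""
    if edit.is_rollback:
        # Rollbacks restore old text, so they are excluded from the comparison.
        return []
    return [
        m for m in matches
        if m.overlap >= threshold and not is_mirror(m.url)
    ]
```

Whatever comes out still has to be checked by a person, since a strong match can be a public domain source or another Wikipedia article.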
Marco