Hi,
I know several authors who publish their work elsewhere and reuse their
original text on Wikipedia as well. This is another source of false
positives, because they hold the copyright to the original source. To
recognise this, a tool has to be even more sophisticated.
The point I want to make is that having a tool that is KNOWN to be
deficient in specific ways can still be a huge advantage over not having a
tool at all. So PLEASE, let's not make the perfect the enemy of the good.
Thanks,
GerardM
On 13 November 2013 11:23, Matthew Flaschen <matthew.flaschen(a)gatech.edu> wrote:
On 11/13/2013 05:16 AM, Philippe Beaudette wrote:
On Wed, Nov 13, 2013 at 2:37 AM, Matthew Flaschen <matthew.flaschen(a)gatech.edu> wrote:
A significant problem with TurnItIn is that it is proprietary, and cannot
be customized by anyone in the movement. The fact that it is proprietary
also means it can never be part of the main infrastructure, nor run on
Wikimedia Labs.
Another significant issue is the "False Positive" factor that is created
by our overwhelming popularity. Frankly, we're mirrored all over the place.
And tools like Turnitin find the mirrors too. It's not an easy problem to
solve. I was on the team that looked at this a couple of years back -
it's just not simple, and there are complex challenges.
Yes, an intelligent solution would take into account when the mirror was
first indexed (or ideally first published), and when the Wikipedia article
was edited, to reduce false positives requiring manual intervention.
Matt Flaschen
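
The date comparison suggested above could be sketched roughly as follows.
This is only an illustration in Python, not any tool's actual method: the
Match type, the classify_match function and the example URLs are all
hypothetical, and it assumes that a trustworthy "first seen" date for the
external page and the timestamp of the edit that added the text are both
available.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Match:
    """One external page whose text overlaps a Wikipedia article (hypothetical type)."""
    url: str
    first_seen: datetime  # when the page was first indexed, or ideally first published


def classify_match(match: Match, text_added: datetime) -> str:
    """Compare the external page's first-seen date with the date the text
    was added to Wikipedia (assumed to be known from the edit history)."""
    if match.first_seen > text_added:
        # The external page appeared only after the Wikipedia text existed,
        # so it is probably a mirror rather than the source.
        return "likely_mirror"
    # The external page is at least as old as the Wikipedia text,
    # so a human still has to look at it.
    return "needs_review"


text_added = datetime(2010, 5, 1)
mirror = Match("https://mirror.example/article", datetime(2012, 3, 15))
earlier = Match("https://journal.example/paper", datetime(2008, 9, 30))

print(classify_match(mirror, text_added))
print(classify_match(earlier, text_added))
```

A "needs_review" result is not a verdict of copying. It still includes the
case where the author of the earlier text holds the copyright and reused it
on Wikipedia, which dates alone cannot detect.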