[Wikimedia-l] Copyright infringement - The real elephant in the room

Matthew Flaschen matthew.flaschen at gatech.edu
Wed Nov 13 10:23:38 UTC 2013


On 11/13/2013 05:16 AM, Philippe Beaudette wrote:
> On Wed, Nov 13, 2013 at 2:37 AM, Matthew Flaschen <
> matthew.flaschen at gatech.edu> wrote:
>
>> A significant problem with TurnItIn is that it is proprietary, and cannot be
>> customized by anyone in the movement.  The fact that it is proprietary also
>> means it can never be part of the main infrastructure, nor run on Wikimedia
>> Labs.
>
>
> Another significant issue is the "False Positive" factor that is created by
> our overwhelming popularity.  Frankly, we're mirrored all over the place.
> And tools like Turnitin find the mirrors too.  It's not an easy problem to
> solve.  I was on the team that looked at this a couple of years back - it's
> just not simple, and there are complex challenges.

Yes, an intelligent solution would take into account when the mirror was 
first indexed (or ideally first published), and when the Wikipedia 
article was edited, to reduce false positives requiring manual intervention.
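
As a rough illustration of that date comparison, here is a minimal Python
sketch. It is not how TurnItIn or any existing tool works; the Match record,
the classify_match function, the one-day margin and the three labels are all
hypothetical names chosen for the example. It assumes we can somehow obtain a
first-indexed (or first-published) date for each matching page, which in
practice is the hard part.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Match:
    """One external page whose text overlaps a Wikipedia edit (hypothetical record)."""
    url: str
    first_seen: Optional[datetime]  # first-indexed or first-published date, if known


def classify_match(match: Match, edit_time: datetime,
                   margin: timedelta = timedelta(days=1)) -> str:
    """Compare when the external page first appeared with when the edit was made.

    The margin is an assumed tolerance for imprecise dates; anything that
    falls inside it is left for a human to check.
    """
    if match.first_seen is None:
        # No date available, so the comparison cannot be made.
        return "needs_review"
    if match.first_seen > edit_time + margin:
        # The page appeared after the edit, so it probably copied Wikipedia.
        return "likely_mirror"
    if match.first_seen < edit_time - margin:
        # The page predates the edit, so the edit may have copied the page.
        return "possible_infringement"
    return "needs_review"


edit_time = datetime(2013, 11, 1)
later_page = Match("https://mirror.example/article", datetime(2013, 12, 1))
earlier_page = Match("https://source.example/essay", datetime(2012, 1, 1))
print(classify_match(later_page, edit_time))
print(classify_match(earlier_page, edit_time))
```

Only the "possible_infringement" and "needs_review" cases would need a human
to look at them; matches classified as "likely_mirror" could be filtered out
automatically.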

Matt Flaschen



