[Wikimedia-l] Copyright infringement - The real elephant in the room

Andrew Gray andrew.gray at dunelm.org.uk
Tue Nov 19 13:12:57 UTC 2013


It could use abuse-filter tags, just not in an entirely standard way:

* Bot scans edit X
* Script flags it as a problem
* Bot makes edit X+1 to the page (perhaps adding a copyvio template?), which
triggers an AbuseFilter rule (one matching this bot making such-and-such
an edit) that tags it.

The offending edit itself won't be tagged, but the bot's follow-up edit
will be, so the tag shows up in the page history and the offending edit
can probably be spotted quite easily from there.
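
As a rough illustration of the steps above, here is a minimal Python
sketch. Everything in it is an assumption made for the example: the bot
name, the template name, the tag label, and the word-overlap check that
stands in for whatever slow external comparison a real bot would run.
The filter_rule function only imitates the kind of cheap condition an
AbuseFilter rule could test; it is not AbuseFilter syntax.

```python
from dataclasses import dataclass, field

BOT_USER = "CopyvioScanBot"        # hypothetical bot account name
TEMPLATE = "{{copyvio-suspect}}"   # hypothetical template name
TAG = "possible copyvio"           # hypothetical tag label


@dataclass
class Edit:
    rev_id: int
    user: str
    added_text: str
    tags: list = field(default_factory=list)


def shingles(text, n=5):
    """Return the set of n-word runs in text, lowercased."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def looks_copied(added_text, source_text, threshold=0.5):
    """Stand-in for the slow check: share of the edit's shingles found in a source."""
    added = shingles(added_text)
    if not added:
        return False
    return len(added & shingles(source_text)) / len(added) >= threshold


def filter_rule(edit):
    """Stand-in for the filter rule: cheap, looks only at the edit itself."""
    return edit.user == BOT_USER and TEMPLATE in edit.added_text


def process(history, edit, source_text):
    """Append edit X to history; if flagged, append the bot's edit X+1 too."""
    history.append(edit)
    if looks_copied(edit.added_text, source_text):
        marker = Edit(edit.rev_id + 1, BOT_USER, TEMPLATE)
        if filter_rule(marker):
            marker.tags.append(TAG)
        history.append(marker)
    return history
```

The expensive comparison runs only on the bot's side, while the rule
that applies the tag checks nothing more than who made the edit and
what it added.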

A.

On 19 November 2013 01:07, Matthew Flaschen <mflaschen at wikimedia.org> wrote:
> On 11/16/2013 09:04 AM, Anthony Cole wrote:
>>
>> The problem of false positives from mirrors doesn't exist if we scan edits
>> as they are made.
>
>
> Agreed.  However, that example is a legal, attributed (at least on the talk
> page) copy from a third-party freely licensed text, not a false positive
> copy from a Wikipedia mirror.
>
>> Maggie says
>> here <https://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard#Emergency_block_of_an_editor_with_which_I_have_been_previously_involved> that
>> copyright bots populate
>> WP:SCV <https://en.wikipedia.org/wiki/Wikipedia:SCV>. So a
>> similarly-configured bot could scan recent changes and tag suspected
>> copyvios in watchlists and page histories like suspected vandalism is
>> currently tagged.
>
>
> The suspected vandalism checks that actually tag the edit (e.g. "Tag:
> possible vandalism")  are based on AbuseFilter checks.  These are relatively
> fast determinations that consider the text of the edit (e.g. regexes for
> strings of curse words, or meaningless repeating characters), and
> comparisons to the previous version (blanked the section, blanked the page).
>
> As far as I know, regular AbuseFilter rules cannot hit a database or web
> search to check for copyright violations.  An extension could in theory do
> this.  But there would possibly be performance problems, since AbuseFilter
> runs on the actual server (not just some bot's computer) on every edit.
>
> It is possible for a bot to scan every edit; it just can't use AbuseFilter
> tags.
>
> Matt Flaschen



-- 
- Andrew Gray
  andrew.gray at dunelm.org.uk


