Clarifying one small point: the "copypatrol" tool was initially developed by Eran (a Wikimedia volunteer from Israel). It was then further developed by the Wikimedia Foundation. I agree that it is a great success, not only in its final result but also as a successful collaborative project between the Foundation and the community.
James
On Mon, Jun 17, 2019 at 10:36 AM Yaroslav Blanter ymbalt@gmail.com wrote:
Actually, I am afraid that for CCI we will at some point have to remove all the added text by bot. I do not see any other scalable solution.
Cheers Yaroslav
On Mon, Jun 17, 2019 at 5:36 PM Stephen Philbrick < stephen.w.philbrick@gmail.com> wrote:
I have seen a couple of comments on copyright issues in the last couple of days, so I thought I'd share some information that I think may not be well known by everyone.
Very roughly, copyright issues (text) can be viewed in three categories:
1. Addition of copyrighted material to articles in years past, not yet removed (one-off)
2. Same as above, except by a serial violator
3. Close to real-time edits which may include copyrighted material
The reason for distinguishing these three categories is that our approach and success rates are very different.
In case 1, an editor identifies what they believe to be a copyright issue in an existing article. They can report it to Wikipedia:Copyright_problems.
In the case of a single issue or a very small handful of issues, those items are identified and taken care of by volunteers. (I think this aspect is handled adequately; I used to be active there but haven't been recently.)
The second case arises when a potential violation is identified and an examination of the editor's contributions reveals many examples (typically five or more). If this occurs, it is referred to Wikipedia:Contributor copyright investigations. A CCI is opened, and the intent is to examine every single edit by that editor. This aspect is extremely backlogged. I've spent many hours working on CCIs, but it isn't easy, it isn't rewarding, and it is discouraging because I think the backlog is increasing rather than decreasing. (This isn't due to newly created copyright issues but newly found ones.)
The third case is handled by Copy Patrol, a foundation-created tool that examines all new edits in close to real time and generates a report, which is handled by volunteers.
I want to emphasize this third aspect for multiple reasons. I think it is one of the least known tools. Some of the prior emails on the subject leave the impression that the authors are unaware of the existence of this tool.
On the one hand, it works very well, as almost all of the several hundred reports each week are reviewed, most within 24 hours.
Good news:
- Copy Patrol is working, so my guess is that the growth in true copyright issues is close to nonexistent.
Bad news:
- Copy Patrol is adequately staffed, but just barely. One editor is responsible for the handling of far more than half of all of these reports (major kudos to Diannaa), but that much reliance on a single volunteer is not good for the long-term health of the project.
- The Copy Patrol tool is pretty good, and was being improved for a while, but I've identified some desirable improvements and my sense is that it's a very back-burner project in terms of additional enhancements.
- CCI clearance is going to take many years.
Phil (Sphilbrick)
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe