Oh yeah, adding some quantitative evidence to what Jonathan pointed out
about blank edit summaries being a useful signal: "Some 88% of spam leaves
[the edit summary] blank..." which they indicate is in comparison to only
17% of external link additions by trusted users being without an edit
summary.
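
For anyone who wants to check these two signals on their own data, here is a minimal sketch, not the method from West et al. It assumes each revision is a dict with a hypothetical "label" key and a "comment" key holding the edit summary; the sample rows are made up purely for illustration.

```python
from collections import defaultdict


def summary_length(comment):
    """Length in characters of an edit summary; None or whitespace-only counts as 0."""
    return len((comment or "").strip())


def blank_summary_rate(revisions):
    """Share of revisions with a blank edit summary, grouped by label."""
    totals = defaultdict(int)
    blanks = defaultdict(int)
    for rev in revisions:
        totals[rev["label"]] += 1
        if summary_length(rev.get("comment")) == 0:
            blanks[rev["label"]] += 1
    return {label: blanks[label] / totals[label] for label in totals}


# Made-up toy data (assumption): not drawn from any real dataset.
sample = [
    {"label": "spam", "comment": ""},
    {"label": "spam", "comment": None},
    {"label": "spam", "comment": "added link"},
    {"label": "trusted", "comment": "fix citation"},
    {"label": "trusted", "comment": "  "},
    {"label": "trusted", "comment": "copyedit"},
    {"label": "trusted", "comment": "add source"},
]

rates = blank_summary_rate(sample)
for label, rate in sorted(rates.items()):
    print(f"{label}: {rate:.0%} blank summaries")
```

The same summary_length value could be fed to a classifier as a numeric feature; how labels are obtained is left open here.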
On Thu, Aug 5, 2021 at 5:18 AM Pablo Aragón <paragon(a)wikimedia.org> wrote:
> Hi Isaac,
>
> I am currently reviewing work on spam detection on Wikipedia. West et al.
> (2011) <https://dl.acm.org/doi/pdf/10.1145/2038558.2038574> found that
> *the length (in characters) of the revision summary* was one of the
> features with the greatest weight in the final classifier.
>
> Best,
>
> On Wed, Aug 4, 2021 at 11:46 PM Isaac Johnson <isaac(a)wikimedia.org> wrote:
>
> > Thanks all for the feedback! If anyone thinks of more, by all means send
> > over.
> >
> > > 1. One of the reasons why any suggestion that we make edit summaries
> > > compulsory is that as long as they are optional, blank edit summaries
> > > are a great way to identify vandals.
> > This is a pretty interesting point. For further context, I'm asking
> > because I'm mentoring a researcher who will be looking into edit summary
> > usage and I wanted to make sure we weren't asking questions that had
> > already been answered elsewhere. The research is still in the formative
> > stages of figuring out what additional research might be useful and just
> > having a better understanding of the distribution of edit types. When I
> > think of tools / interventions based on what little I know, however, it's
> > mainly along the lines of what sorts of edit tags (or similar filters)
> > could be auto-generated to further contextualize edit summaries. Helping
> > editors quickly match their edit to templated/canned messages is an idea
> > that gets floated around too but could be counterproductive for the
> > vandalism case as you point out.
> >
> > > There is a long-standing tool to search them at
> > > https://sigma.toolforge.org/summary.py?name=Stuartyeates&search=re-revi…
> > > In case you're looking for code to reuse.
> > Thanks! Glad to see this tool exists!
> >
> > For completeness, it was also pointed out to me that Wattenberg, Viégas,
> > and Hollenbach's 2007 paper "Visualizing Activity on Wikipedia with
> > Chromograms" makes heavy use of edit summaries and provides some insight
> > into their usage:
> >
> > https://link.springer.com/content/pdf/10.1007/978-3-540-74800-7_23.pdf
> >
> > Best,
> > Isaac
> >
> > On Tue, Aug 3, 2021 at 3:48 PM Stuart A. Yeates <syeates(a)gmail.com> wrote:
> >
> > > There is a long-standing tool to search them at
> > >
> > > https://sigma.toolforge.org/summary.py?name=Stuartyeates&search=re-revi…
> > >
> > > In case you're looking for code to reuse.
> > >
> > > cheers
> > > stuart
> > > --
> > > ...let us be heard from red core to black sky
> > >
> > > On Wed, 4 Aug 2021 at 05:38, WereSpielChequers
> > > <werespielchequers(a)gmail.com> wrote:
> > > >
> > > > Dear Isaac,
> > > >
> > > > I'm not aware of any research on this. But there are a couple of
> > > > common assumptions that you could check as part of any research.
> > > >
> > > >
> > > > 1. One of the reasons why any suggestion that we make edit summaries
> > > >    compulsory is that as long as they are optional, blank edit
> > > >    summaries are a great way to identify vandals.
> > > > 2. There is also a certain amount of "sneaky vandalism" denoted by
> > > >    edits that get reverted or reverted and the perpetrators get warned
> > > >    for vandalism or blocked as a "vandalism only account"
> > > > 3. Though we admins have the technology to blank people's edit
> > > >    summaries it is very rarely used
> > > >
> > > > Regards
> > > > Jonathan
> > > >
> > > > On Tue, 3 Aug 2021 at 16:20, Isaac Johnson <isaac(a)wikimedia.org> wrote:
> > > >
> > > > > Does anyone know of any research or statistics around edit summary
> > > > > <https://en.wikipedia.org/wiki/Help:Edit_summary> usage on
> > > > > Wikipedia? All I could find in a quick scan was some statistics
> > > > > from 2010
> > > > > (https://meta.wikimedia.org/wiki/Usage_of_edit_summary_on_Wikipedia).
> > > > > I'm curious if anyone has more updated statistics, or, even better:
> > > > > a more thorough analysis of how edit summaries are used by editors
> > > > > -- i.e. how complete they are, to what degree they represent the
> > > > > "what" vs. the "why", how often they are misleading, etc.
> > > > >
> > > > > Best,
> > > > > Isaac
> > > > >
> > > > > --
> > > > > Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation
> > > > > _______________________________________________
> > > > > Wiki-research-l mailing list -- wiki-research-l(a)lists.wikimedia.org
> > > > > To unsubscribe send an email to wiki-research-l-leave(a)lists.wikimedia.org
--
Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation