[Wikimedia-l] AFT5: what practical benefits has it had?

Tue Oct 16 16:18:22 UTC 2012

Hi Phoebe,

Thanks for your good questions and insights about Article Feedback 5.

Dario and Oliver have answered a couple of your questions below, but I wanted to give you a quick overview of where we are with this new editor engagement tool.

We are now collecting fresh data to track how many readers who post feedback go on to make productive edits, which is one of our key objectives for this project. We are also analyzing moderation data to evaluate the usefulness of the feedback, which is another important goal, though harder to track accurately. Lastly, we are enhancing our automated filters to reduce the workload for editors, as well as refactoring our code to make this extension scale better.

We plan to have more information to share in a couple of weeks, once all the new data have been collected and analyzed, and new tools deployed. At that point, we will share our findings widely, so we can all have more informed discussions on how to best use this engagement tool. In the meantime, I've added a couple of clarifications inline below.

Thanks again for your interest, and I look forward to continuing this discussion very soon!

Fabrice

__________________________________

> From: Dario Taraborelli <dtaraborelli at wikimedia.org>
> Subject: Re: [Wikimedia-l] AFT5: what practical benefits has it had?
> Date: October 15, 2012 5:58:04 PM PDT
> To: Oliver Keyes <okeyes at wikimedia.org>, "phoebe ayers" <phoebe.wiki at gmail.com>
> Cc: Wikimedia Mailing List <wikimedia-l at lists.wikimedia.org>, Fabrice Florin <fflorin at wikimedia.org>
> 
>> Thank you for enabling it again. I had read about the blind tests in <https://meta.wikimedia.org/wiki/Research:Article_feedback/Quality_assessment> before but I see some major changes in the graphs, which are a bit hard to understand.
>> 1) In "Daily moderation actions (percentage)" there's a huge spike of helpful/unhelpful after C (July), did those flags even exist before? Or did helpfulness increase after wider usage according to the finding «the average page receives higher quality feedback than pages picked for their popularity/controversial topic»? (There's no change between 5 and 10 % though.)
>> *They did; the spike is most probably caused by a deployment from 0.6 percent of articles to 5 percent of articles, with a resulting "ooh, shiny! Let's take a look" reaction.
> 
> the spike actually follows the 5% deployment combined with the CentralNotice announcement (see annotation D in the first plot), the latter is almost certainly what caused the spike

Also note that up until September 6, users were able to moderate their own feedback, which caused a slight increase in the number of 'helpful' ratings on the feedback page dashboard ( http://toolserver.org/~dartar/fp/ ). So we really should be paying more attention to the data collected after this date, even though the overall pattern doesn't appear significantly different from earlier this summer.

>> 
>> 2) "Unique daily articles with feedback moderated" shows a spike and then a stabilization, but I don't know what the graph actually is about. For instance, can feedback be moderated per article ("feedback semi/full protection" or so) or only per item, etc.

We now have a feature that allows administrators to disable the feedback form for controversial articles, as part of the 'protect' tool. However, it was recently deployed and may not be in wide use yet.

>> Do you know if moderation happens on the same articles and if stricter moderation increases helpfulness of feedback also on non-moderated articles?
>>  
>> *So, I believe it means "the number of distinct articles which have had feedback moderated that day", regardless of whether people use the article-specific page or the centralised page, but I'm not sure - some clarification from Dario would be awesome :). Ditto your other questions, particularly on the distribution of articles. 
> 
> what Oliver said, we are not keeping track of the source of moderation activity at the moment but I agree it would be a very important piece of data to analyze. After consulting with Fabrice, I've opened this ticket on bugzilla so we can assess the effort needed to implement this via the logging table: https://bugzilla.wikimedia.org/show_bug.cgi?id=41061
> 
> Dario
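
For readers trying to interpret the "Unique daily articles with feedback moderated" graph, here is a rough sketch of the metric as Oliver describes it above: the number of distinct articles that had any feedback moderated on a given day. This is only an illustration and not the code behind the dashboard. The record layout (the "timestamp", "page_id" and "action" fields) and the function name are assumptions made up for this example; only the 14-digit timestamp format follows the usual MediaWiki convention.

```python
from collections import defaultdict
from datetime import datetime


def unique_articles_moderated_per_day(log_rows):
    """Count distinct articles with at least one moderation action per day.

    log_rows is a hypothetical list of dicts with a 14-digit "timestamp"
    (YYYYMMDDHHMMSS) and a "page_id"; the real logging schema may differ.
    """
    pages_by_day = defaultdict(set)
    for row in log_rows:
        day = datetime.strptime(row["timestamp"], "%Y%m%d%H%M%S").date()
        # A set keeps each article once per day, however often it is moderated.
        pages_by_day[day].add(row["page_id"])
    return {day.isoformat(): len(pages)
            for day, pages in sorted(pages_by_day.items())}


# Made-up sample records, for illustration only.
sample_rows = [
    {"timestamp": "20121015100000", "page_id": 1, "action": "helpful"},
    {"timestamp": "20121015110000", "page_id": 1, "action": "unhelpful"},
    {"timestamp": "20121015120000", "page_id": 2, "action": "hide"},
    {"timestamp": "20121016090000", "page_id": 1, "action": "helpful"},
]

daily_counts = unique_articles_moderated_per_day(sample_rows)
print(daily_counts)
```

Because the count is per article rather than per feedback item, two moderation actions on the same article on the same day count once. Recording where each action came from (article-specific page or central page), as proposed in the ticket above, would allow the same count to be split by source.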

> Also, not to state the obvious, but 'helpful' feedback in and of
> itself doesn't mean the article changed for the better; I've marked
> plenty of feedback 'helpful' without doing anything further about it.

Yes, Phoebe, you are absolutely right. Marking a piece of feedback as helpful is a good way to bring a good suggestion to the attention of other editors, but it does not mean that the comment has been used to edit the article. However, it makes it easier for editors to find comments that can help them improve the article.

Over the two months following wider deployment, about 35% of the feedback was found useful by moderators. This finding seems consistent with our earlier estimate of usefulness, based on hand-coded feedback evaluations.

> Is there any data about rate of change of the articles since AFT was
> enabled? (probably pretty hard to measure since articles are
> individually fluid at much different rates, depending on topic, and
> you'd have to control for the baseline likeliness of random bursts of
> editing somehow).
> 

Yes, the rate of change in the articles is pretty hard to measure, as you point out -- particularly changes in their quality.

But I would like to remind us that a primary objective for this project is to engage readers to contribute productively to Wikipedia, which is much easier to measure. I look forward to reviewing that data together in a couple of weeks.

__________________________________

Fabrice Florin
Product Manager, Editor Engagement
Wikimedia Foundation
fflorin at wikimedia.org

http://en.wikipedia.org/wiki/User:Fabrice_Florin_(WMF)