Hi Phoebe,
Thanks for your good questions and insights about Article Feedback 5.
Dario and Oliver have answered a couple of your questions below, but I wanted to give you
a quick overview on where we are with this new editor engagement tool.
We are now collecting fresh data to track how many readers who post feedback go on to make
productive edits, which is one of our key objectives for this project. We are also
analyzing moderation data to evaluate the usefulness of the feedback, which is another
important goal, though harder to track accurately. Lastly, we are enhancing our automated
filters to reduce the workload for editors, as well as re-factoring our code to make this
extension scale better.
We plan to have more information to share in a couple weeks, once all the new data have
been collected and analyzed, and new tools deployed. At that point, we will share our
findings widely, so we can all have more informed discussions on how to best use this
engagement tool. In the meantime, I've added a couple of clarifications inline below.
Thanks again for your interest, and I look forward to continuing this discussion very
soon!
Fabrice
__________________________________
From: Dario Taraborelli
<dtaraborelli(a)wikimedia.org>
Subject: Re: [Wikimedia-l] AFT5: what practical benefits has it had?
Date: October 15, 2012 5:58:04 PM PDT
To: Oliver Keyes <okeyes(a)wikimedia.org>, "phoebe ayers"
<phoebe.wiki(a)gmail.com>
Cc: Wikimedia Mailing List <wikimedia-l(a)lists.wikimedia.org>, Fabrice Florin
<fflorin(a)wikimedia.org>
Thank you for enabling it again. I had read about
the blind tests in
<https://meta.wikimedia.org/wiki/Research:Article_feedback/Quality_assessment>
before but I see some major changes in the graphs, which are a bit hard to understand.
1) In "Daily moderation actions (percentage)" there's a huge spike of
helpful/unhelpful after C (July), did those flags even exist before? Or did helpfulness
increase after wider usage according to the finding «the average page receives higher
quality feedback than pages picked for their popularity/controversial topic»? (There's
no change between 5 and 10 % though.)
*They did; the spike is most probably caused by a deployment from 0.6 percent of articles
to 5 percent of articles, with a resulting "ooh, shiny! Let's take a look" reaction.
The spike actually follows the 5% deployment combined with the CentralNotice announcement
(see annotation D in the first plot); the latter is almost certainly what caused the spike.
Also note that up until September 6, users were able to moderate their own feedback, which
caused a slight increase in the number of 'helpful' ratings on the feedback page
dashboard (http://toolserver.org/~dartar/fp/). So we really should be paying more
attention to the data collected after this date, even though the overall pattern
doesn't appear significantly different from earlier this summer.
>
> 2) "Unique daily articles with feedback moderated" shows a spike and then a
stabilization, but I don't know what the graph actually is about. For instance, can
feedback be moderated per article ("feedback semi/full protection" or so) or
only per item, etc.
We now have a feature that allows administrators to disable the feedback form for
controversial articles, as part of the 'protect' tool. However, it was recently
deployed and may not be in wide use yet.
Do you know if
moderation happens on the same articles and if stricter moderation increases helpfulness
of feedback also on non-moderated articles?
*So, I believe it means "the number of distinct articles which have had feedback
moderated that day", regardless of whether people use the article-specific page or
the centralised page, but I'm not sure - some clarification from Dario would be
awesome :). Ditto your other questions, particularly on the distribution of articles.
What Oliver said is correct; we are not keeping track of the source of moderation activity
at the moment, but I agree it would be a very important piece of data to analyze. After
consulting with Fabrice, I've opened this ticket on Bugzilla so we can assess the effort
needed to implement this via the logging table:
https://bugzilla.wikimedia.org/show_bug.cgi?id=41061
Dario
Also, not to state the obvious, but 'helpful'
feedback in and of
itself doesn't mean the article changed for the better; I've marked
plenty of feedback 'helpful' without doing anything further about it.
Yes, Phoebe, you are absolutely right. Marking a comment as helpful is a good way to
bring a good suggestion to the attention of other editors, but it does not mean that
the comment has been used to edit the article. However, it makes it easier for editors to
find comments that can help them improve the article.
Over the two months following wider deployment, about 35% of the feedback was found useful
by moderators. This finding seems consistent with our earlier estimate of usefulness,
based on hand-coded feedback evaluations.
Is there any data about rate of change of the articles
since AFT was
enabled? (probably pretty hard to measure since articles are
individually fluid at much different rates, depending on topic, and
you'd have to control for the baseline likeliness of random bursts of
editing somehow).
Yes, the rate of change in the articles is pretty hard to measure, as you point out,
particularly changes in their quality.
But I would like to remind us that a primary objective for this project is to engage
readers to contribute productively to Wikipedia, which is much easier to measure. I look
forward to reviewing that data together in a couple weeks.
__________________________________
Fabrice Florin
Product Manager, Editor Engagement
Wikimedia Foundation
fflorin(a)wikimedia.org
http://en.wikipedia.org/wiki/User:Fabrice_Florin_(WMF)