Forwarding questions from Research-l with permission, with the hope that these will spark discussion here on Wikimedia-l.
RJensen: "Comments: I have not seen any editor make actual use of the Article Feedback tool -- are there examples? Yes Wikipedians are very proud of their vast half-billion-person audience. However they do not ask 'what features are most useful for a high school student or teacher / a university student / etc.'"
Pine: This is a very interesting question. What have been the benefits of AFT5? I have seen complaints about spam and suppressible material being written in AFT5. What benefits has it had?
Thanks, Pine
(cross-posting my reply from wiki-research-l)
The complete reports on WMF research on AFT5 can be found here: http://meta.wikimedia.org/wiki/Research:Article_feedback
The tool is currently deployed on a random 10% sample of English Wikipedia articles so it's not surprising most readers/editors don't see it very often. We are currently collecting about 4K unique feedback messages per day: http://toolserver.org/~dartar/aft5
As for the quality of feedback – as judged by community members and readers – we have some preliminary usage data coming from the FeedbackPage: http://toolserver.org/~dartar/fp/ as well as results based on blind assessment by Wikipedians that we ran during the early stages of AFT5 research (see the "Quality assessment" sections in the research reports above).
We will shortly be publishing an update on FeedbackPage data, but as the feature is not rolled out on the entire project and not many editors or readers know how to find the FeedbackPage (i.e. the only place where comments can be filtered, flagged and moderated), these results should not be taken as conclusive.
A full roll out of AFT5 on the entire English Wikipedia is scheduled for Q4 2012.
Dario
Dario Taraborelli wrote:
A full roll out of AFT5 on the entire English Wikipedia is scheduled for Q4 2012.
Hi.
Do you have a link to on-wiki consensus for this idea? I was under the impression that the ArticleFeedback tool was developed as an experiment. It needs a discussion and on-wiki consensus before being widely deployed, right? Has that happened?
MZMcBride
MZMcBride wrote:
Dario Taraborelli wrote:
A full roll out of AFT5 on the entire English Wikipedia is scheduled for Q4 2012.
Do you have a link to on-wiki consensus for this idea? I was under the impression that the ArticleFeedback tool was developed as an experiment. It needs a discussion and on-wiki consensus before being widely deployed, right? Has that happened?
I asked these questions in September. As far as I can tell, there was no response to them.
A number of long-time users have concerns about this tool, particularly as it is quite often used to libel subjects of articles.
Before adding to volunteers' workloads (to moderate and respond to these comments), I would think it unquestionably requires the consent of the volunteers, right?
Dario?
MZMcBride
Hey – apologies for the late response (I've just returned from my annual leave).
The AFT team is currently reviewing a number of options to address the moderation workload issue, which is definitely an important one. So to answer your question: you can safely consider the feature "experimental"; further decisions will be deferred until we've tested/discussed different approaches to moderation.
As far as data is concerned, we're going to publish additional analyses/datasets on feedback volume and moderation activity for the current (10%) sample, on top of those already available via the dashboards; they'll be announced on the lists and on meta:R:AFT.
Dario
Dario Taraborelli, 06/09/2012 23:47:
The complete reports on WMF research on AFT5 can be found here: http://meta.wikimedia.org/wiki/Research:Article_feedback
The tool is currently deployed on a random 10% sample of English Wikipedia articles so it's not surprising most readers/editors don't see it very often. We are currently collecting about 4K unique feedback messages per day: http://toolserver.org/~dartar/aft5
As for the quality of feedback – as judged by community members and readers – we have some preliminary usage data coming from the FeedbackPage: http://toolserver.org/~dartar/fp/ as well as results based on blind assessment by Wikipedians that we ran during the early stages of AFT5 research (see the "Quality assessment" sections in the research reports above).
Graphs are empty for me there, is it just me?
We will shortly be publishing an update on FeedbackPage data, but as the feature is not rolled out on the entire project and not many editors or readers know how to find the FeedbackPage (i.e. the only place where comments can be filtered, flagged and moderated), these results should not be taken as conclusive.
A full roll out of AFT5 on the entire English Wikipedia is scheduled for Q4 2012.
Nemo
2012/10/12 Federico Leva (Nemo) nemowiki@gmail.com:
Dario Taraborelli, 06/09/2012 23:47:
The complete reports on WMF research on AFT5 can be found here: http://meta.wikimedia.org/wiki/Research:Article_feedback
The tool is currently deployed on a random 10% sample of English Wikipedia articles so it's not surprising most readers/editors don't see it very often. We are currently collecting about 4K unique feedback messages per day: http://toolserver.org/~dartar/aft5
As for the quality of feedback – as judged by community members and readers – we have some preliminary usage data coming from the FeedbackPage: http://toolserver.org/~dartar/fp/ as well as results based on blind assessment by Wikipedians that we ran during the early stages of AFT5 research (see the "Quality assessment" sections in the research reports above).
Graphs are empty for me there, is it just me?
Not just you; they're empty for me too. They only start being populated at "Daily feedback volume (option 1)". And the graphs on the FeedbackPage usage dashboard are completely empty.
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
On Oct 12, 2012, at 2:11, "Federico Leva (Nemo)" nemowiki@gmail.com wrote:
we have some preliminary usage data coming from the FeedbackPage: http://toolserver.org/~dartar/fp/
Graphs are empty for me there, is it just me?
We have a temporary hardware issue affecting the slave DB from which this data is pulled. Ops is on it and I hope to have it back soon.
Dario
Dario Taraborelli, 12/10/2012 15:41:
On Oct 12, 2012, at 2:11, "Federico Leva (Nemo)" nemowiki@gmail.com wrote:
we have some preliminary usage data coming from the FeedbackPage: http://toolserver.org/~dartar/fp/
Graphs are empty for me there, is it just me?
We have a temporary hardware issue affecting the slave DB from which this data is pulled. Ops is on it and I hope to have it back soon.
Thank you for enabling it again. I had read about the blind tests in https://meta.wikimedia.org/wiki/Research:Article_feedback/Quality_assessment before, but I see some major changes in the graphs, which are a bit hard to understand.
1) In "Daily moderation actions (percentage)" there's a huge spike of helpful/unhelpful after C (July); did those flags even exist before? Or did helpfulness increase after wider usage, in line with the finding «the average page receives higher quality feedback than pages picked for their popularity/controversial topic»? (There's no change between 5 and 10% though.)
2) "Unique daily articles with feedback moderated" shows a spike and then a stabilization, but I don't know what the graph is actually about. For instance, can feedback be moderated per article ("feedback semi/full protection" or so) or only per item? Do you know if moderation happens on the same articles, and if stricter moderation increases the helpfulness of feedback on non-moderated articles too?
Nemo
Thank you for enabling it again. I had read about the blind tests in < https://meta.wikimedia.org/wiki/Research:Article_feedback/Quality_assessment... before but I see some major changes in the graphs, which are a bit hard to understand.
- In "Daily moderation actions (percentage)" there's a huge spike of helpful/unhelpful after C (July), did those flags even exist before? Or did helpfulness increase after wider usage according to the finding «the average page receives higher quality feedback than pages picked for their popularity/controversial topic»? (There's no change between 5 and 10 % though.)
They did; the spike is most probably caused by a deployment from 0.6 percent of articles to 5 percent of articles, with a resulting "ooh, shiny! Let's take a look" reaction.
2) "Unique daily articles with feedback moderated" shows a spike and then a stabilization, but I don't know what the graph is actually about. For instance, can feedback be moderated per article ("feedback semi/full protection" or so) or only per item, etc. Do you know if moderation happens on the same articles and if stricter moderation increases helpfulness of feedback also on non-moderated articles?
So, I *believe* it means "the number of distinct articles which have had feedback moderated that day", regardless of whether people use the article-specific page or the centralised page, but I'm not sure - some clarification from Dario would be awesome :). Ditto your other questions, particularly on the distribution of articles.
On Sun, Oct 14, 2012 at 4:33 AM, Oliver Keyes okeyes@wikimedia.org wrote:
Thank you for enabling it again. I had read about the blind tests in < https://meta.wikimedia.org/wiki/Research:Article_feedback/Quality_assessment... before but I see some major changes in the graphs, which are a bit hard to understand.
- In "Daily moderation actions (percentage)" there's a huge spike of helpful/unhelpful after C (July), did those flags even exist before? Or did helpfulness increase after wider usage according to the finding «the average page receives higher quality feedback than pages picked for their popularity/controversial topic»? (There's no change between 5 and 10 % though.)
They did; the spike is most probably caused by a deployment from 0.6 percent of articles to 5 percent of articles, with a resulting "ooh, shiny! Let's take a look" reaction.
Indeed; I remember some (internal) announcements around this, which caused me and no doubt others to while away an evening just after deployment clicking helpful/unhelpful :)
Also, not to state the obvious, but 'helpful' feedback in and of itself doesn't mean the article changed for the better; I've marked plenty of feedback 'helpful' without doing anything further about it. Is there any data about rate of change of the articles since AFT was enabled? (probably pretty hard to measure since articles are individually fluid at much different rates, depending on topic, and you'd have to control for the baseline likeliness of random bursts of editing somehow).
-- phoebe
On 14 October 2012 20:19, phoebe ayers phoebe.wiki@gmail.com wrote:
Indeed; I remember some (internal) announcements around this, which caused me and no doubt others to while away an evening just after deployment clicking helpful/unhelpful :)
I didn't spend an entire evening on it, but I can certainly say those announcements prompted me to go and moderate feedback, something I then didn't keep up. If lots of people did the same as us, that would certainly give a spike in the graphs.
Also, not to state the obvious, but 'helpful' feedback in and of itself doesn't mean the article changed for the better; I've marked plenty of feedback 'helpful' without doing anything further about it. Is there any data about rate of change of the articles since AFT was enabled? (probably pretty hard to measure since articles are individually fluid at much different rates, depending on topic, and you'd have to control for the baseline likeliness of random bursts of editing somehow).
That is a very important point. The goal of the AFT is not to collect feedback, but to improve articles (either by people acting on the feedback or, perhaps more interestingly, by people giving feedback and then being prompted to edit themselves).
Collecting statistics on the feedback itself is a good first stage in the experimentation process, but it does need to be followed up by statistics on whether the ultimate goal is being achieved or not (based on anecdotal evidence, I suspect it isn't at this point, but it is early days).
I found it mostly useless. Not only could I mark the feedback resolved, which should not be possible for a banned user (!), but the feedback was either gibberish/abuse or unhelpful in the sense of (1) the material requested was already in the article, or a linked article, or (2) the complaint was too unspecific to be actionable. Since I have about 4700 articles watchlisted, I feel this is a representative sample, and the result is only to be expected from "an encyclopedia that anyone can edit". Does this feature justify its cost? No.
----- Original Message -----
From: "phoebe ayers" phoebe.wiki@gmail.com
To: "Wikimedia Mailing List" wikimedia-l@lists.wikimedia.org
Sent: Sunday, October 14, 2012 8:19 PM
Subject: Re: [Wikimedia-l] AFT5: what practical benefits has it had?
phoebe ayers, 14/10/2012 21:19:
Also, not to state the obvious, but 'helpful' feedback in and of itself doesn't mean the article changed for the better; I've marked plenty of feedback 'helpful' without doing anything further about it. Is there any data about rate of change of the articles since AFT was enabled? (probably pretty hard to measure since articles are individually fluid at much different rates, depending on topic, and you'd have to control for the baseline likeliness of random bursts of editing somehow).
This was the original aim of AFT, to monitor the Public Policy initiative effects, so it's definitely possible, but I think they're mostly doing research about the tool itself now?
Nemo
Thank you for enabling it again. I had read about the blind tests in https://meta.wikimedia.org/wiki/Research:Article_feedback/Quality_assessment before but I see some major changes in the graphs, which are a bit hard to understand.
- In "Daily moderation actions (percentage)" there's a huge spike of helpful/unhelpful after C (July), did those flags even exist before? Or did helpfulness increase after wider usage according to the finding «the average page receives higher quality feedback than pages picked for their popularity/controversial topic»? (There's no change between 5 and 10 % though.)
They did; the spike is most probably caused by a deployment from 0.6 percent of articles to 5 percent of articles, with a resulting "ooh, shiny! Let's take a look" reaction.
The spike actually follows the 5% deployment combined with the CentralNotice announcement (see annotation D in the first plot); the latter is almost certainly what caused the spike.
- "Unique daily articles with feedback moderated" shows a spike and then a stabilization, but I don't know what the graph is actually about. For instance, can feedback be moderated per article ("feedback semi/full protection" or so) or only per item, etc. Do you know if moderation happens on the same articles and if stricter moderation increases helpfulness of feedback also on non-moderated articles?
So, I believe it means "the number of distinct articles which have had feedback moderated that day", regardless of whether people use the article-specific page or the centralised page, but I'm not sure - some clarification from Dario would be awesome :). Ditto your other questions, particularly on the distribution of articles.
What Oliver said: we are not keeping track of the source of moderation activity at the moment, but I agree it would be a very important piece of data to analyze. After consulting with Fabrice, I've opened this ticket on Bugzilla so we can assess the effort needed to implement it via the logging table: https://bugzilla.wikimedia.org/show_bug.cgi?id=41061
Dario