Re: [Foundation-l] Article Feedback Tool 5 testing deployment

23 Dec 2011

Oliver, with regard to Geni's question and your response, this is what I understood
was the situation too: that the use of AFTv5 was on a small subset of articles to ensure
minimum disruption to the editing community whilst still being able to gain enough usage
data from readers to know whether it's working. Then iterate, improve, roll out to a
slightly larger set, repeat.... :-)

However, I'd like to contest the two reasons you've given for not turning off
AFTv4 in the meantime.

On 23/12/2011, at 3:49, Oliver Keyes <okeyes(a)wikimedia.org> wrote:

...
 Actually, we're trying to avoid turning off AFT4.
 The reasoning is twofold.

 On a product development front, the AFT5 presence is for testing purposes,
 and for testing purposes only; it will be up for around 2-3 weeks so we can
 build a decent picture of the quantity and quality of feedback we're
 getting. While this process is going on, we want to maintain a pretty
 coherent interface for the readers to avoid confusion - and AFT4 is much
 closer to AFT5 than no form at all is.

Are you saying that AFTv4 (the 'star rating' system) is being used as the
"control group" in this experiment? That is, if ONLY 0.3% of en.wp articles had
a feedback tool enabled, then they would receive different kinds of feedback because they
would look different to the vast majority of the encyclopedia. So you're trying to
minimize that difference by keeping it running on all the rest? If that's the case,
then surely you only need to run the "control" group at the same frequency as
the new tests rather than giving them disproportionate visibility.

On the other hand, what I think you're saying is that you want to preserve a consistent
user-experience during this period of testing AFTv5, so that we don't go from 100% of
v4, to 0.3% of v5 (with the rest having nothing), and then to 100% v5. If this is the case,
I find it a bit worrying that the current version of the tool - which has always been
proposed as experimental - is now simply there as a placeholder awaiting improvement.
Surely if we know that we're not using the current version any more, we should take it
offline until the new one is ready. I would be very surprised if any members of the
general public were confused by its absence, because I doubt that any members of the
general public are actually looking for the feedback tool when they visit an article.
Quite the contrary, I think the public WOULD be confused if we told them that the big box
at the bottom of every article is only there to "maintain a consistent
interface" and we're not actually using the ratings data that the big box is
asking them for.

I'm NOT making the argument that the AFT is inherently bad (in fact I'm really
looking forward to the v5 of the tool to see how much good-quality reader feedback we get,
which will hopefully enliven a lot of very quiet talkpages). I'm also NOT making the
argument that the WMF needs to seek some kind of mythical consensus for every single
software change or new feature test. What I AM saying is that now that v4 has been
deprecated it is both disingenuous to our readers and annoying to our community to have a
big box appear in such valuable real-estate simply because it will eventually be replaced
by a different, more useful, box. As you say, this replacement is "still quite some
time away" so it's a long time to leave a placeholder on the world's 5th most
visited website.

...

 On a data front, because the AFT5 presence is only for tests, and is only
 temporary (at least at the moment) there's no question of AFT4 feedback
 being ignored; the actual replacement of AFT4 with AFT5 on a wider scale is
 still quite some time away, and until that happens, I hope any AFT4
 feedback will be taken into account.

Which AFTv4 ratings have ever actually been used? I understand that data on HOW the tool has
been used is providing input into the design of v5, which is fair enough. But has anyone
actually been able to get useful data out of the ratings themselves - either on a
per-article or whole-dataset basis? I think the software of the "article feedback
dashboard" (http://en.wikipedia.org/wiki/Special:ArticleFeedback) is very interesting and
potentially quite a useful system but, honestly, has any Wikipedian
ever been able to make practical use of that information to improve articles? Personally,
I make use of that tool to identify articles which are current targets for NPOV editing
[e.g. Justin Bieber is currently the 6th highest rated article in the entire encyclopedia,
whilst Hanukkah is the 4th lowest], potentially useful information for vandal patrollers,
but hardly the intended use of the whole system. 

Sincerely,
-Liam
