So cool. It's always pleasing to see positive results from tests like these. Seems like WikiGrok Version B wins round 1.
On Mon, Jan 5, 2015 at 11:47 AM, Joaquin Oltra Hernandez <jhernandez@wikimedia.org> wrote:
Very cool results. Seems like showing the same question to a bunch of users and grabbing the most popular answer as the correct one will work in most cases. On Jan 5, 2015 8:08 PM, "Florian Schmidt" <florian.schmidt.welzow@t-online.de> wrote:
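(The aggregation idea described above — taking the most popular answer per question as the correct one — could be sketched roughly like this; the function name and data shape here are hypothetical, not from the actual WikiGrok codebase:)

```python
from collections import Counter

def majority_answer(responses):
    """Pick the most popular answer from a list of user responses.

    Hypothetical sketch of majority-vote aggregation: returns the
    winning answer and its vote share, or None for an empty list.
    """
    if not responses:
        return None
    counts = Counter(responses)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(responses)

# e.g. five users answer the same "yes/no/maybe" question:
majority_answer(["yes", "yes", "no", "maybe", "yes"])  # → ("yes", 0.6)
```

In practice one would probably also want a minimum number of responses and a vote-share threshold before treating the majority answer as correct enough to send to Wikidata.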
Awesome! Can't wait for it to be "always-on" :)
*From:* mobile-l-bounces@lists.wikimedia.org [mailto:mobile-l-bounces@lists.wikimedia.org] *On behalf of* Maryana Pinchuk *Sent:* Monday, January 5, 2015 19:48 *To:* Leila Zia; Dario Taraborelli; mobile-l *Subject:* [WikimediaMobile] Preliminary WikiGrok response quality in stable
If you're like me, you've probably been breathlessly awaiting the results of the first WikiGrok stable A/B test to see if the responses we're getting are good, bad, or ugly :) Well, good news! I did some hand-coding of the results (a sample of about 300 responses from the ~1,200 we got during the test) and have some interesting preliminary findings to share. Caveat: this is not science, just a quick check of WikiGrok's pulse. Leila from Analytics is helping us analyze this and other WikiGrok test data and will have a more thorough write-up of the results soon :)
As a reminder, this test ran for a week in December in stable for logged in users only on English Wikipedia. We tested two versions of the UX (a simple "yes/no/maybe" interface and a slightly more complex tagging one), and we asked questions about biographies (actors and writers) and music albums (live or studio albums). The responses were not yet sent to Wikidata; the infrastructure to do that is currently in development.
- *The tl;dr is that the quality of the responses is pretty high!* The overall rate of correct responses for the sample I looked at was *80%*.
- Also, *users with no edits and users with 1 or more edits had similar quality responses* (in fact, the 0-edit-count users gave slightly higher quality responses). So even total newbs are capable of grokking :)
- Lastly, while we didn't see any differences in engagement or conversion (the rate at which users started and finished the WikiGrok process) between the two versions, there was a difference in quality – *Version B (tagging) produced a noticeably higher quality response rate (95%)*.
More detailed breakdown of quality below, including by individual answer (fun fact that is sure to make Sam Smith sad: nobody seems to have any clue what a live album is!). Now let's see if these trends hold for logged out users, too :) Our first test for all users (logged in and logged out) is slated for later this month.
*User classes*
Users with 0 edits – 85%
Users with 1 or more edits – 80%
*Versions*
Version A – 68%
Version B – 95%
*Question types*
"Is this person an author?" – 72%
"Is this a film actor?" – 90%
"Is this a television actor?" – 85%
"Is this a live album?" – 50% :(
"Is this a studio album?" – 64%
--
Maryana Pinchuk, Product Manager, Wikimedia Foundation *wikimediafoundation.org* http://wikimediafoundation.org
Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l