[WikimediaMobile] Browse Hypothesis Results

11 Oct 2015


Hi Team,
I just wanted to update you on the results of something we internally
referred to as the 'browse' prototype.
TLDR: as implemented, the mobile 'browse by category' test did not drive
significant engagement.  In fact, as implemented, it seemed inferior to
blue links.  However, we started with a very rough and low-impact
prototype, so a few tweaks would give us more definitive results.
Here is the doc from which I am pasting below:
https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtizy...
Questions/comments welcome!
Best,
J
Browse Prototype Results

Intro
https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtizyFQ4/edit#heading=h.6s40inyan02p
Process
https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtizyFQ4/edit#heading=h.d5x661n72t7d
Results
https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtizyFQ4/edit#heading=h.naqxa4etwhl4
Blue links in general
https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtizyFQ4/edit#heading=h.8nn07h675j3o
Category tags
https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtizyFQ4/edit#heading=h.gagragojxpiz
Conclusion and Next Steps
https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtizyFQ4/edit#heading=h.z3p82tg8enr
Process
https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtizyFQ4/edit#heading=h.ocqtfqhf8n0t
Do people want to browse by categories?
https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtizyFQ4/edit#heading=h.9ksw2zvt8q19

Intro
As outlined in this doc
https://docs.google.com/presentation/d/1ZssE8G0P5WVg8XmkBTi5G3n4OdLHPFGWZDZFW5_DSS0/edit?usp=sharing,
the concept is a tag that allows readers to navigate WP via categories that
are meaningful and populated in order of 'significance' (as determined by
user input).  The hypothesis:
- users will want to navigate by category if there are fewer, more
  meaningful categories per page and those category pages show the most
  ‘notable’ members first.
Again, see the full doc
https://docs.google.com/presentation/d/1ZssE8G0P5WVg8XmkBTi5G3n4OdLHPFGWZDZFW5_DSS0/edit?usp=sharing
to understand the premise.
Process
The first step was to validate: do users want to navigate via category?  So
we built a very lightweight prototype on mobile web, English Wikipedia (stable,
not beta), using hardcoded config variables, covering the following categories
(~4000 pages).  We did not look into sub-categories, with one exception (see
T94732 https://phabricator.wikimedia.org/T94732 for details).  There was
also an error: 2 of the categories did not have tags implemented (struck
through, below).
Category                                   Pagecount
NBA All Stars                                    400
American Politicians                             818
Object-Oriented Programming Languages            164
European States                                   24
American Female Pop Singers                      326
American drama television series                1048
Modern Painters                                  983
Landmarks in San Francisco, California           270
Here is how it appeared on the Alcatraz page
When the user clicked the tag, they were taken to a gather-like collection
based on manually estimated relevance
(sorry cropped shot)
The category pages were designed to show first the pages most relevant (as
deemed by me) to the broadest audience. Here is the ordering:
https://docs.google.com/spreadsheets/d/12xLXQsH1zcg6E8lDuSonumZNdBvfaBuHOS1a...
This was intended to lie in contrast with our current category pages, which
are alphabetical and not really intended for human browsing:
https://en.wikipedia.org/wiki/Category:American_male_film_actors
We primarily measured a few things:
- when a tag was seen by a user
- when a tag was clicked on by a user
- when a page in the new ‘category view’ was clicked on by a user
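As a rough illustration of how these three events relate, the following Python sketch tallies a hypothetical event log into two rates. The event names and sample data are invented for illustration and do not reflect the actual instrumentation schema, and treating each tag click as one category page visit is an assumption.

```python
from collections import Counter

# Hypothetical event names; the real logging schema is not given in this post.
TAG_SEEN = "tag_seen"
TAG_CLICKED = "tag_clicked"
CATEGORY_ITEM_CLICKED = "category_item_clicked"


def summarize(events):
    """Turn a flat list of event names into the two rates discussed here."""
    counts = Counter(events)
    seen = counts[TAG_SEEN]
    tag_clicks = counts[TAG_CLICKED]
    item_clicks = counts[CATEGORY_ITEM_CLICKED]
    return {
        # Clicks on the tag per time the tag appeared on screen.
        "tag_ctr": tag_clicks / seen if seen else 0.0,
        # Clicks on listed pages per category page visit
        # (assumes one visit per tag click).
        "category_page_ctr": item_clicks / tag_clicks if tag_clicks else 0.0,
    }


# Made-up sample: 8 impressions, 2 tag clicks, 1 click on a listed page.
sample = [TAG_SEEN] * 8 + [TAG_CLICKED] * 2 + [CATEGORY_ITEM_CLICKED]
summary = summarize(sample)
print(summary)
```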
As a side effort, I looked to see if overall referrals from pages with tags
went up.  This was a timed intervention rather than an A/B test, and given
the low click-through on the tags, the impact would have been negligible
anyway.  The results were very noisy, which is consistent with that
expectation.
Results
Blue links in general
One benefit of the side study mentioned in the previous paragraph is that I
was able to generate a table for the pages in question, from before we
started the test, showing the ratio of pageviews referred by a page to that
page's total pageviews (an estimate of how many links were opened from that
page).  It covers only 0-1 GMT on 6/29/15, but now that we have the pageview
hourly table, a more robust analysis can tell us how categories differ in
this regard:
Category                                                links clicked   #pvs   clicks/pvs
Category:20th-century American politicians                        761   1243          61%
Category:American drama television series                        5981   8844          68%
Category:American female pop singers                             2502   4280          58%
Category:Landmarks in San Francisco,                              104    287          36%
Category:Modern painters                                          136    369          37%
Category:National Basketball Association All-Stars               1908   3341          57%
Category:Object-oriented programming languages                     48    181          27%
Category:Western Europe                                           657   1221          54%
Grand Total                                                     12099  19766          50%
You can see here that for pages in the category ‘Landmarks in San
Francisco’, if there are 10 pageviews, 3.6 clicks to other pages are
generated on average.
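The ratios in the table can be recomputed from its click and pageview counts. The Python sketch below uses only the per-category figures listed above (the variable names are illustrative and not from the original analysis) and computes both the unweighted mean of the per-category ratios and the pooled ratio over all pages.

```python
# (links clicked, pageviews) per category, copied from the table above.
rows = {
    "20th-century American politicians": (761, 1243),
    "American drama television series": (5981, 8844),
    "American female pop singers": (2502, 4280),
    "Landmarks in San Francisco": (104, 287),
    "Modern painters": (136, 369),
    "National Basketball Association All-Stars": (1908, 3341),
    "Object-oriented programming languages": (48, 181),
    "Western Europe": (657, 1221),
}

# Clicks per pageview for each category.
per_category = {name: clicks / pvs for name, (clicks, pvs) in rows.items()}

# Unweighted mean: every category counts equally, regardless of traffic.
unweighted_mean = sum(per_category.values()) / len(per_category)

# Pooled ratio: total clicks over total pageviews, so busy categories dominate.
total_clicks = sum(clicks for clicks, _ in rows.values())
total_pvs = sum(pvs for _, pvs in rows.values())
pooled = total_clicks / total_pvs

for name, value in per_category.items():
    print(f"{name}: {value:.0%}")
print(f"unweighted mean: {unweighted_mean:.1%}")
print(f"pooled: {pooled:.1%}")
```

On these figures the unweighted mean lands near the 50% shown in the Grand Total row, while the pooled ratio is closer to 61%, because the high-traffic television category dominates the totals. Which of the two is the better baseline depends on whether the comparison is per category or per pageview.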
I do not have the original queries for this handy, but can dig them up if
you’re really interested.
Category tags
Full data and queries here:
https://docs.google.com/a/wikimedia.org/spreadsheets/d/1vD3DopxGyeh9FQsuTQDM...
The tags themselves generated an average click-through rate of 0.18%.  Given
the overall click-through rate on the pages estimated above (~50%), this
single tag is not driving anything significant.  Leila and Bob’s paper
suggests that this is performing no better than a mid-article link.
However, given that mobile web sections are collapsed, I would need to
understand more about their method to know just how to interpret their
results against our mobile-web-only implementation.  In addition, our
click-through rate used the number of times the tag appeared on screen as
the denominator, whereas their research looked at overall pageviews.
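The effect of the denominator can be illustrated with a small Python sketch. The counts below are hypothetical, chosen only so that the impression-based rate matches the 0.18% reported above. The function name and the pageview figure are assumptions, not values from the experiment.

```python
def click_through_rate(clicks: int, denominator: int) -> float:
    """Return clicks divided by the chosen denominator, as a fraction."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return clicks / denominator


# Hypothetical counts (not taken from the experiment's data).
tag_clicks = 18
tag_impressions = 10_000  # times the tag actually appeared on screen
pageviews = 40_000        # all views of tagged pages, tag seen or not

ctr_per_impression = click_through_rate(tag_clicks, tag_impressions)
ctr_per_pageview = click_through_rate(tag_clicks, pageviews)

print(f"per impression: {ctr_per_impression:.2%}")
print(f"per pageview:   {ctr_per_pageview:.3%}")
```

If the tag can appear at most once per pageview, impressions cannot exceed pageviews, so for the same clicks an impression-based rate is at least as high as a pageview-based one. The two rates are therefore not directly comparable without converting to a common denominator.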
This being noted, the tag was implemented to be as obscure as possible to
establish a baseline.  Furthermore, any real feature like this would probably
differ in the following ways:
- each page would be in 1-4 tag groups (as opposed to just 1)
- every page would be tagged, creating the expectation on the part of the
  user that this was something to look for
- presumably the categories could be implemented as a menu item as opposed
  to being buried at the bottom of the page (and competing with features
  like read more)
- using the learnings from ‘read more’, tags with images or buttons would
  likely fare much better
The following graph shows:
- number of impressions on the right axis
- click-through rate on the left axis
When you look at click-through rates on the ‘category’ pages themselves,
you see that they average 41% (chart below), meaning that for every 10
times a user visited a category page, there were 4.1 clicks to one of the
listed pages as a result.
Here is the same broken up by category:
Each ‘category’ page here had at least 400 visits, and you can see that the
interest seems to vary dramatically across categories.  It is worth noting
that the top three categories here are the ones with the fewest entities.
Each list, however, was capped at ~50 articles, so it is unclear what might
be causing this effect, if it is real.
As mentioned above, the average article page has an overall click rate of
50%, so these ‘category’ pages (41%) did not match the click-through rate of
a typical article page.  However, each ‘category’ page showed summaries of
its member pages, so it could be that users were getting value beyond what
a blue link would provide.  A live-user test of Gather collections, from
which this format was borrowed, suggested that the format used up too much
vertical space on each article and was hard to flip through.  Shortening
the amount of text or image space might be something to try to make the
page more useful.
Conclusion and Next Steps
Process
- This was the first time I am aware of that we ran a live prototype and
  learned something without building a scalable solution.  Win
- Developer time was estimated at 1 FTE for 2 weeks (by pheudx), but the
  chronological time for pushing to stable took a quarter.  Room for
  improvement
- The time to analysis was almost 2 quarters, due to a lack of data
  analysis support (I ran the initial analysis within 2 weeks of launch,
  during paternity leave, but was unable to go back and get it ready to
  distribute for 3 months).  Room for improvement, possibly solved by an
  additional Data Analyst.
This experiment was not designed to answer questions definitively in one
round, but with the understanding that multiple iterations would allow us
to fully answer our questions.
The long turn-around time, particularly around analysis and communication,
meant that tweaking a variable to test the conclusions or the new questions
that arose (see below) will involve a whole lot more work and effort than if
we had been able to explore modifications within a few weeks of the initial
launch.
Do people want to browse by categories?
Category tags at the bottom of the mobile web page in a dull gray
background that lead to manually curated categories are not a killer
feature :)
I would be reluctant to say that this means users are not interested in
browsing by category, however.  For instance, it is likely that
- users did not notice the tag, even if it appeared on screen
- users are accustomed to our current category tags on desktop and not
  interested in that experience
- users who did like the tag were unlikely to find another page that had
  it--there was no feedback mechanism by which the improved category page
  would drive additional tag interactions
- the browse experience created was not ideal
If we decide to pursue what is currently termed “cascade c: update ux”, I
would like to proceed with more tests in this arena, by altering the
appearance and position of the tags, and by improving the flow of the
‘category’ pages.  If we choose a different strategy, hopefully other teams
can build off of what was learned here.
