Forwarding, as this is affecting a lot of the "Mobile..." tables on
stat1003 (see the list in the linked incident documentation)
---------- Forwarded message ----------
From: Nuria Ruiz <nuria(a)wikimedia.org>
Date: Mon, Oct 26, 2015 at 8:50 AM
Subject: [Analytics] Eventlogging replication issues
To: "A mailing list for the Analytics Team at WMF and everybody who
has an interest in Wikipedia and analytics."
<analytics(a)lists.wikimedia.org>
Cc: Jaime Crespo <jcrespo(a)wikimedia.org>
Hello:
There are replication issues regarding Eventlogging data.
For some tables (see:
https://wikitech.wikimedia.org/wiki/Incident_documentation/20151022-EventLo…)
data has not been replicated since 2015-10-22.
All dashboards read from the slave rather than the master, so the data
they display is outdated until this issue is resolved. Ditto for any
query running on 1002.
You can follow the work of our DBA in this regard on the following ticket:
https://phabricator.wikimedia.org/T116599
Thanks,
Nuria
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB
The Android team[0] is happy to announce a new Wikipedia Android app
beta release, v2.1.132-beta-2015-10-23[1]. In precisely 500 characters,
this revision contains the following features and fixes[2]:
* More consistent article language and simpler app language selection
dialog.
* Fixed occasional abnormal CPU usage
* Many UI enhancements and tweaks.
* Improved crash reporting.
* Better link preview UI and extract.
* Add preference to enable/disable link previews.
* Use higher resolution icons when possible.
* Display system notification after saving an image.
* Fix page thumbnail caching.
* Fix possible crash when sharing highlighted text.
* Fix possible crash when returning to Nearby screen.
This release saw volunteer contributions from Daniel Rey[3] and
Wikinaut[4]. Great work, devs!
You, too, can help make it better! Read our getting started guide[5]. We
can't wait for your contributions!
-The WMF Android team
[0] https://www.mediawiki.org/wiki/Wikimedia_Apps/Team#Android_App
[1] Rolling out at
https://play.google.com/store/apps/details?id=org.wikipedia.beta
[2] A complete list of changes is available at
http://git.wikimedia.org/commits/apps%2Fandroid%2Fwikipedia/beta%2F2.1.131-…
[3] https://twitter.com/DanReyLop
[4] https://twitter.com/Wikinaut, WikiMail
[5]
https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/Wikipedia_Android_app_ha…
Hi all,
here is the weekly look at our most important readership metrics.
As laid out earlier, the main purpose is to raise awareness about how these
are developing, call out the impact of any unusual events in the preceding
week, and facilitate thinking about core metrics in general. We are still
iterating on the presentation (e.g. to better take seasonality into
account, in particular by including year-over-year comparisons) and
eventually want to create dashboards for those that are not already
available in that form. Feedback and discussion welcome.
The most interesting news this time is the effect of the iOS Wikipedia app
getting featured in the App Store, see below. For those who haven’t been
following the discussion on this list (Mobile-l), I’d also like to
highlight that Jon Katz has recently posted
<https://lists.wikimedia.org/pipermail/mobile-l/2015-October/009839.html>
(to quote his TLDR) “Directional data [that] suggests that the project-wide
drop we see in pageviews is, in part, caused by shorter sessions on mobile
web compared to desktop (and a migration from desktop to mobile web)”.
Now to the usual data. (All numbers below are averages for October 12-18,
2015 unless otherwise noted.)
Pageviews
Total: 528 million/day (+0.9% from the previous week)
Context (April 2015-October 2015):
See also the Vital Signs dashboard
<https://vital-signs.wmflabs.org/#projects=all/metrics=Pageviews>
Desktop: 57.3%
Mobile web: 41.5%
Apps: 1.2%
Global North ratio: 76.9% of total pageviews (previous week: 77.0%)
Context (April 2015-October 2015):
Unique app users
Android: 1.16 million /day (+-0.0% from the previous week)
Context (January 2015-October 2015):
Not much news here.
iOS: 280k / day (+0.9% from the previous week)
Context (January 2015-September 2015):
The overall DAU numbers don’t yet show a noticeable impact of the app
getting featured (see below); we’ll see.
New app installations
Android: 37.9k/day (-4.0% from the previous week)
(Daily installs per device, from Google Play)
Context (July-October 2015):
The sustained rise in installs we’ve been seeing since around August 21
(see also the discussion in last week’s report about the possible
connection with the “Back to School” recommendation in the Play store) is
ebbing now, whereas the uninstall rate holds pretty much constant.
iOS: 6.69k/day (+48.0% from the previous week)
(download numbers from App Annie)
Context (July 24-Oct 21, 2015):
Last week, the Wikipedia app became featured on the iOS App Store homepage
(below the fold, as the second item in a list called "Learn Your Facts").
The effect on downloads is already clear - I’m including the last three
days in the above chart too; we’ll see how the impact on user numbers and
pageviews turns out. Josh from the iOS team is following this closely.
----
For reference, the queries and source links used are listed below (access
is needed for each). Most of the above charts are available on Commons, too
<https://commons.wikimedia.org/w/index.php?title=Special:ListFiles&offset=20…>
.
hive (wmf)> SELECT SUM(view_count)/7000000 AS avg_daily_views_millions
              FROM wmf.projectview_hourly
              WHERE agent_type = 'user'
                AND CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0"))
                    BETWEEN "2015-10-12" AND "2015-10-18";

hive (wmf)> SELECT year, month, day,
                   CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) AS date,
                   SUM(IF(access_method <> 'desktop', view_count, null)) AS mobileviews,
                   SUM(view_count) AS allviews
              FROM wmf.projectview_hourly
              WHERE year = 2015 AND agent_type = 'user'
              GROUP BY year, month, day
              ORDER BY year, month, day
              LIMIT 1000;

hive (wmf)> SELECT access_method, SUM(view_count)/7
              FROM wmf.projectview_hourly
              WHERE agent_type = 'user'
                AND CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0"))
                    BETWEEN "2015-10-12" AND "2015-10-18"
              GROUP BY access_method;

hive (wmf)> SELECT SUM(IF(FIND_IN_SET(country_code,
                   'AD,AL,AT,AX,BA,BE,BG,CH,CY,CZ,DE,DK,EE,ES,FI,FO,FR,FX,GB,GG,GI,GL,GR,HR,HU,IE,IL,IM,IS,IT,JE,LI,LU,LV,MC,MD,ME,MK,MT,NL,NO,PL,PT,RO,RS,RU,SE,SI,SJ,SK,SM,TR,VA,AU,CA,HK,MO,NZ,JP,SG,KR,TW,US')
                   > 0, view_count, 0))/SUM(view_count)
              FROM wmf.projectview_hourly
              WHERE agent_type = 'user'
                AND CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0"))
                    BETWEEN "2015-10-12" AND "2015-10-18";

hive (wmf)> SELECT year, month, day,
                   CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")),
                   SUM(view_count) AS all,
                   SUM(IF(FIND_IN_SET(country_code,
                   'AD,AL,AT,AX,BA,BE,BG,CH,CY,CZ,DE,DK,EE,ES,FI,FO,FR,FX,GB,GG,GI,GL,GR,HR,HU,IE,IL,IM,IS,IT,JE,LI,LU,LV,MC,MD,ME,MK,MT,NL,NO,PL,PT,RO,RS,RU,SE,SI,SJ,SK,SM,TR,VA,AU,CA,HK,MO,NZ,JP,SG,KR,TW,US')
                   > 0, view_count, 0)) AS Global_North_views
              FROM wmf.projectview_hourly
              WHERE year = 2015 AND agent_type = 'user'
              GROUP BY year, month, day
              ORDER BY year, month, day
              LIMIT 1000;

hive (wmf)> SELECT SUM(IF(platform = 'Android', unique_count, 0))/7 AS avg_Android_DAU_last_week,
                   SUM(IF(platform = 'iOS', unique_count, 0))/7 AS avg_iOS_DAU_last_week
              FROM wmf.mobile_apps_uniques_daily
              WHERE CONCAT(year,LPAD(month,2,"0"),LPAD(day,2,"0"))
                    BETWEEN 20151012 AND 20151018;

hive (wmf)> SELECT CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) AS date,
                   unique_count AS Android_DAU
              FROM wmf.mobile_apps_uniques_daily
              WHERE platform = 'Android';

hive (wmf)> SELECT CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) AS date,
                   unique_count AS iOS_DAU
              FROM wmf.mobile_apps_uniques_daily
              WHERE platform = 'iOS';
https://play.google.com/apps/publish/?dev_acc=02812522755211381933#StatsPla…
https://www.appannie.com/dashboard/252257/item/324715238/downloads/?breakdo…
(select “Total”)
--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB
Hey all -- I've got a few older ios devices that I no longer test with
much, and am looking to clear them out of my inventory. If nobody working
on the Wikipedia app needs them I'll sell em on gazelle or something, so
speak up quick!
This is still supported & 64-bit, but a couple generations behind the
latest:
* iPhone 5s (retina 4")
These 32-bit devices run current iOS 9.1 but will probably be dropped by
ios 10:
* iPhone 4s (retina 3.5")
* iPod Touch 5th-gen (retina 4")
* iPad mini first-gen (non-retina 7.9")
And these are stuck on ios 6:
* iPhone 3Gs (non-retina 3.5")
* a second one!
I can bring them into the WMF office next week, or ship them to wherever.
-- brion
Hi Everyone,
I’m happy to announce that, as of today, Tilman Bayer has officially joined
the Reading product team as a Senior Analyst. He will increase our capacity
for understanding our users and the impact of our efforts through data
analysis. Tilman is responsible for:
- generating our primary metrics and communicating them
- consulting on, analyzing and communicating the results of
feature-specific instrumentation
- driving best practices within the team with regard to analysis (aided by
his background in mathematics)
Tilman has been shadowing the team for about a quarter, and has ramped up
his involvement dramatically over the last month. During this period,
Tilman has moved quickly and is already owning the analysis of a
significant number of our ongoing projects.
Tilman’s weekly reading metrics report (example
<https://drive.google.com/a/wikimedia.org/file/d/0B-YjaiFnAja3eGdIUGlSV0oxTW…>)
is currently being sent out to the Mobile-l mailing list (which serves as
the Reading team’s primary public mailing list). We are still gathering
feedback from the subscribers of that list and once we have a sustainable
and stable format, we hope to share more widely.
Tilman has been working with us since July 2011, and has been a Wikipedian
since 2003. He was an analyst on the Communications team until Erik Moeller
asked him to join Product & Strategy at the beginning of this year. Tilman
will be moving out of his current role in Terry’s department where he is
right now wrapping up the current round of quarterly reviews and
preparation of the organization-wide quarterly report for Q1.
Best,
Jon
Hi Folks,
TLDR: Directional data suggests that the project-wide drop we see in
pageviews is, in part, caused by shorter sessions on mobile web compared to
desktop (and a migration from desktop to mobile web)
*Context:*
Danny and I took some time last week to try and understand the dramatic
drop in pageviews that we saw globally and in the global south just over
the last quarter. The numbers we quoted in the Q1 quarterly review
<https://commons.wikimedia.org/wiki/File:WMF_Reading_Quarterly_Review_Q1_201…>
last week were (pageviews across all projects, platforms and geographies):
- -12.4% Quarter over quarter
- -7% Year over year*
*The YoY data is sampled, and anomalies may make it inaccurate by up to
+-5 percentage points (i.e. the true YoY change could be anywhere from -2%
to -12%) (source: Tilman, Hive).
Here is the answer to one question we had: Is this drop due to fewer people
or fewer pages per user? I will preface this by saying that I erred on the
side of getting this out sooner rather than making it fully replicable and
shareable. If you are interested in the specific queries or access to the
raw data, I will prep and send them out. Otherwise, I'm curious to hear
your questions/concerns/suggestions.
*Details:*
Is this drop due to fewer people or fewer pages per user?
The answer here is interesting, and the impact is more significant than I
would have expected. On desktop, pageviews per visit (measured as
internally referred views vs. external + unknown) are relatively constant.
On mobile web, however, pageviews per visit are much lower and appear to be
dropping. The following graphs explain:
Daily *desktop* pageviews, by referer, 4/13--present (all Wikipedias, all
geos). There is a strong correlation between pageviews that come from the
outside vs. the inside:
[image: Inline image 3]
[image: Inline image 4]
Daily *Mobile web* pageviews, by referer 4/13--present (all wikipedias, all
geos).
[image: Inline image 2]
Compared to the 60% we have on desktop, you can see that the ratio is 40%
(33% smaller) on mobile, and that this gap has widened (though not in the
last 2 years):
[image: Inline image 5]
I don't know if we can attribute all of our traffic decreases to the drop
in session length, but it is certainly a big factor. Basically, 60% of our
pageviews (internal) shrink by 33% on mobile. So all else being equal, if
we transfer all our traffic to mobile we lose 33% of our pageviews. Right
now we're at 50%. This assumes that there is no change in the number of
sessions...on which we have no data right now.
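The arithmetic behind that estimate can be sketched in a few lines (a
minimal illustration, assuming external + unknown referrals per reader
stay constant and only the internally referred share changes; the 60%/40%
shares are the approximate ratios from the charts above, not exact query
output):

```python
# Back-of-envelope estimate of pageview loss from a desktop -> mobile shift.
# Assumption: external + unknown referrals are held constant; only the
# internally referred share of pageviews differs between platforms.

def total_views(external_views, internal_share):
    """Total pageviews when internally referred views make up
    `internal_share` of the total."""
    return external_views / (1.0 - internal_share)

external = 1.0                           # normalized external/unknown views
desktop = total_views(external, 0.60)    # ~60% internal share on desktop
mobile = total_views(external, 0.40)     # ~40% internal share on mobile web

loss_full_shift = 1.0 - mobile / desktop  # if all traffic moved to mobile
loss_half_shift = 0.5 * loss_full_shift   # at the current ~50/50 split

print(f"full shift: {loss_full_shift:.0%} fewer pageviews")  # 33%
print(f"50% mobile: {loss_half_shift:.0%} fewer pageviews")  # 17%
```

Under this assumption the full-shift loss comes out to exactly one third,
matching the 33% quoted above; at a 50/50 split it is roughly 15-17%.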
*Next Steps:*
Unless I hear otherwise, I think the next steps are to start thinking
through what the implications are.
- Do we try to identify reasons users might be skipping out earlier on
mobile and fixing those?
- Do we try to make it easier for people to find content on mobile?
- Maybe sessions across the internet are just shorter on mobile and we
should focus our efforts on helping people find us?
Regardless, I find this a bit comforting, because having the same number of
users who spend less time would be much better than reaching fewer people:
controlling the experience once they found us is relatively easier than
altering the channels by which people find us in the first place.
Again questions/concerns/suggestions encouraged.
-J
(At 50% mobile, that works out to roughly a 15% drop in pageviews, again
assuming a 1:1 traffic switch.)
Hi Team,
I just wanted to update you on the results of something we internally
referred to as the *browse* prototype.
TLDR: as implemented, the mobile 'browse by category' test did not drive
significant engagement. In fact, it seemed inferior to blue links.
However, we started with a very rough and low-impact prototype, so a few
tweaks would give us more definitive results.
Here is the doc from which I am pasting below:
https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtiz…
Questions/comments welcome!
Best,
J
Browse Prototype Results
Intro
<https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtiz…>
Process
<https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtiz…>
Results
<https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtiz…>
Blue links in general
<https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtiz…>
Category tags
<https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtiz…>
Conclusion and Next Steps
<https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtiz…>
Process
<https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtiz…>
Do people want to browse by categories?
<https://docs.google.com/document/d/1Mqw-awAcp01IcLhHPsHmWsqaAyK1l2-w_LMDtiz…>
Intro
As outlined in this doc
<https://docs.google.com/presentation/d/1ZssE8G0P5WVg8XmkBTi5G3n4OdLHPFGWZDZ…>,
the concept is a tag that allows readers to navigate WP via categories that
are meaningful and populated in order of 'significance' (as determined by
user input). The hypothesis:
- users will want to navigate by category if there are fewer, more
meaningful categories per page and those category pages show the most
‘notable’ members first.
Again, see the full doc
<https://docs.google.com/presentation/d/1ZssE8G0P5WVg8XmkBTi5G3n4OdLHPFGWZDZ…>
to understand the premise.
Process
The first step was to validate: do users want to navigate via category? So
we built a very lightweight prototype on mobile web, en Wikipedia (stable,
not beta) using hardcoded config variables, in the following categories
(~4000 pages). Here we did not look into sub-categories, with one
exception (see T94732 <https://phabricator.wikimedia.org/T94732> for
details). There was also an error, and 2 of the categories did not have
tags implemented (struck through, below).
Category                                  Pagecount
NBA All Stars                             400
American Politicians                      818
Object-Oriented Programming Languages     164
European States                           24
American Female Pop Singers               326
American drama television series          1048
Modern Painters                           983
Landmarks in San Francisco, California    270
Here is how it appeared on the Alcatraz page
When the user clicked the tag, they were taken to a gather-like collection
based on manually estimated relevance
(sorry cropped shot)
The category pages were designed to show the most relevant (as deemed by
me) to the broadest audience, first. Here is the ordering:
https://docs.google.com/spreadsheets/d/12xLXQsH1zcg6E8lDuSonumZNdBvfaBuHOS1…
This was intended to stand in contrast with our current category pages,
which are alphabetical and not really intended for human browsing:
https://en.wikipedia.org/wiki/Category:American_male_film_actors
We primarily measured a few things:
- when a tag was seen by a user
- when a tag was clicked on by a user
- when a page in the new ‘category view’ was clicked on by a user
As a side effort, I looked to see if overall referrals from pages with tags
went up--this was a timed intervention rather than an a/b test and given
the click-thru on the tags, the impact would have been negligible anyway.
This was confirmed by some very noisy results.
Results
Blue links in general
One benefit of the side study mentioned in the previous paragraph is that I
was able to generate a table for the pages in question, from before we
started the test, showing the ratio of pageviews referred by a page to that
page's total pageviews (an estimate of how many links were opened from that
page). Though it is literally just for 0-1 GMT, 6/29/15, now that we have
the pageview hourly table, a more robust analysis can tell us how
categories differ in this regard:
Category                                             links clicked   #pvs    clicks/pvs
Category:20th-century American politicians           761             1243    61%
Category:American drama television series            5981            8844    68%
Category:American female pop singers                 2502            4280    58%
Category:Landmarks in San Francisco, California      104             287     36%
Category:Modern painters                             136             369     37%
Category:National Basketball Association All-Stars   1908            3341    57%
Category:Object-oriented programming languages       48              181     27%
Category:Western Europe                              657             1221    54%
Grand Total                                          12099           19766   50%
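The clicks/pvs column is simply links clicked divided by pageviews; as a
quick sketch, a few rows can be recomputed from the raw counts above:

```python
# Recompute click-to-pageview ratios for a few rows of the table above.
rows = {
    "Landmarks in San Francisco, California": (104, 287),
    "Object-oriented programming languages": (48, 181),
    "American drama television series": (5981, 8844),
}
for category, (links_clicked, pageviews) in rows.items():
    print(f"{category}: {links_clicked / pageviews:.0%}")
```

which reproduces the 36%, 27% and 68% figures in the table.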
You can see here that for pages in the category ‘Landmarks in San
Francisco’, if there are 10 pageviews, 3.6 clicks to other pages are
generated on average.
I do not have the original queries for this handy, but can dig them up if
you’re really interested.
Category tags
Full data and queries here:
https://docs.google.com/a/wikimedia.org/spreadsheets/d/1vD3DopxGyeh9FQsuTQD…
The tags themselves generated an average click-through rate of 0.18%. Given
the overall click-thru rate on the pages estimated above (~50%), this
single tag is not driving anything significant. Furthermore, Leila and
Bob’s paper suggests that this is performing no better than a mid-article
click--though given that mobile web sections are collapsed, I would need to
understand more about their method to know just how to interpret their
results against our mobile-web-only implementation. Our click-through rate
also used the number of times the tag appeared on screen as the
denominator, whereas their research looked at overall pageviews.
This being noted, the tag was implemented to be as obscure as possible to
establish a baseline. Furthermore, any feature like this would probably be
different in the following ways:
- each page would be in 1-4 tag groups (as opposed to just 1)
- each page would be tagged, creating the expectation on the part of the
user that this was something to look for
- presumably the categories could be implemented as a menu item as opposed
to being buried at the bottom of the page (and competing with features like
read more)
- using the learnings from ‘read more’, tags with images or buttons would
likely fare much better
The following graph shows:
- number of impressions on the right axis
- click-thru rate on the left axis
When you look at click-through rates on the ‘category’ pages themselves,
you see that they average 41% (chart below), meaning that for every 10
times a user visited a category page, there were 4.1 clicks to one of those
pages as a result.
Here is the same broken up by category:
Each ‘category’ page here had at least 400 visits, and you can see that the
interest seems to vary dramatically across categories. It is worth noting
that the top three categories here are the ones with the fewest entities.
Each list, however, was capped at ~50 articles, so it is unclear what might
be causing this effect, if it is real.
As mentioned above, the average article page has an overall click rate of
50%. So this page of categories did not have the click-through rate that an
article page has. However, this page had summaries of each of the pages, so
it could be that users were getting value beyond what a blue link would
provide. A live-user test of Gather collections, from which this format was
borrowed, suggested that the format used up too much vertical space per
article and was hard to flip through. Shortening the amount of text or
image space might be something to try to make the page more useful.
Conclusion and Next Steps
Process
- This was the first time I am aware of that we ran a live prototype and
learned something without building a scalable solution. Win.
- Developer time was estimated at 1 FTE for 2 weeks (by pheudx), but the
chronological time for pushing to stable took a quarter. Room for
improvement.
- The time to analysis was almost 2 quarters, due to a lack of data
analysis support (I ran the initial analysis within 2 weeks of launch,
during paternity leave, but was unable to go back and get it ready to
distribute for 3 months). Room for improvement--possibly solved by an
additional Data Analyst.
This experiment was not designed to answer questions definitively in one
round, but with the understanding that multiple iterations would allow us
to fully answer our questions.
The long turn-around time, particularly around analysis and communication,
means that tweaking a variable to test the conclusions, or the new
questions that arose below, will involve a whole lot more work and effort
than if we had been able to explore modifications within a few weeks of the
initial launch.
Do people want to browse by categories?
Category tags at the bottom of the mobile web page in a dull gray
background that lead to manually curated categories are not a killer
feature :)
I would be reluctant to say that this means users are not interested in
browsing by category, however. For instance, it is likely that:
- users did not notice the tag, even if it appeared on screen
- users are accustomed to our current category tags on desktop and not
interested in that experience
- users who did like the tag were unlikely to find another page that had
it--there was no feedback mechanism by which the improved category page
would drive additional tag interactions
- the browse experience created was not ideal
If we decide to pursue what is currently termed “cascade c: update ux”, I
would like to proceed with more tests in this arena, by altering the
appearance and position of the tags, and by improving the flow of the
‘category’ pages. If we choose a different strategy, hopefully other teams
can build off of what was learned here.