Hi all,


This resumes our usual look at our most important readership metrics. This edition of the report introduces a new year-over-year comparison chart to better understand which traffic changes are seasonal, and highlights a recent browser-related issue that has, suddenly and probably artificially, increased our global pageviews by around 11%. We also take a first look at how the ratio of pageviews per device has been developing over time, and point out some other events that had an impact on core metrics.

As laid out earlier, the main purpose is to raise awareness about how these metrics are developing, call out the impact of any unusual events, and facilitate thinking about core metrics in general. As always, feedback and discussion are welcome. Week-over-week and month-over-month changes are now being recorded on the Product page at MediaWiki.org. This edition of the report covers a timespan of 14 weeks.


Some other recent items of interest, in case they didn’t already catch your attention:

Now to the usual data. (All numbers below are averages for April 25-July 31, 2016 unless otherwise noted.)


Pageviews

Total: 511 million/day (-5.0% from the previous report)

Context (April 2015-July 2016):


See also the Vital Signs dashboard

(Small caveat: Android app pageviews in April and early May were undercounted by up to 1-2 million/day due to a bug.)


Overall pageviews developed relatively uneventfully during the timespan of this report until around July 20, when the desktop traffic to the main pages of several large Wikipedias (including English, Russian and Dutch) suddenly increased drastically. The investigation is still ongoing. At this point, though, it seems likely that this is not due to an undetected spider or the like, but caused by browsers of real users behaving in an unusual way, namely an old version of Chrome on Windows. Overall pageviews for all projects appear to have increased by around 11% due to this.


To improve our understanding of which traffic movements are seasonal and which may indicate lasting changes, I have pieced together a chart overlaying the total pageview numbers back to May 2013 (the earliest time for which we have data according to the current pageview definition):

For example, we talked before about how the large year-over-year decrease seen in earlier months (e.g. -11.9% from April 2015 to April 2016) could probably be attributed to two one-time events in May/June 2015: the switch to HTTPS-only connections and the block of the Chinese Wikipedia in China. At the time this was based on more indirect evidence, in particular changes in the ratio of pageviews from the Global North. Now that a year has passed, we can directly compare data from after these events year-over-year. It confirms that conclusion (in the chart, compare the red line against the other three on the left-hand side). In July 2016 there actually was a 1.1% increase compared to July 2015. Of course this is partly due to the aforementioned main pages anomaly, which is also clearly visible in this chart.



Desktop: 54.1% (previous report: 55.2%)

Mobile web: 44.6% (previous report: 43.6%)

Apps: 1.3% (previous report: 1.2%)

(small caveat: app percentage is a bit too low due to the aforementioned Android-related bug in April/May)


Context (April 2015-July 2016):

After largely stagnating earlier this year following the Christmas bump, the mobile percentage started to rise again towards parity around May (reaching 48.6% in the week until July 11). However, the recent anomalously high main pages traffic described above was entirely on desktop and has thus, perhaps temporarily, brought this percentage down again (to 42.7% in the week until July 31). As a reminder, mobile already has a solid majority in terms of unique devices, cf. below.


Global North ratio: 75.5% of total pageviews (previous report: 76.5%)


Context (last seven months):

After a notable decrease (or, increase in the Global South ratio) at the beginning of the year, this number was relatively steady over the timespan of this report.

NB: We are currently rethinking this metric and might replace it with a different country selection constructed as part of the work on the New Readers project.

Unique devices

See the announcement blog post from March for background and details on this recently introduced metric. These estimated numbers are provided for all Wikimedia language projects (separately for the desktop and mobile web version). Because of the instrumentation method, there is no global metric for all projects and all languages, but following some recent discussions it is now planned to extend it to a cross-language global metric per project at least. For now, we track the daily numbers of the English Wikipedia in this report.


Daily unique devices estimate for English Wikipedia:



Context (January-August 2016):


See also the new Vital Signs dashboard

As mentioned in the previous report, there was a drop in early to mid March which hasn’t seen an explanation so far. This metric hasn’t existed long enough yet to get a good sense of what yearly seasonalities may exist, but it looks a bit like a decreasing trend in desktop devices during these seven months. Interestingly though, pageviews did not decrease as much. Or to put it differently, pageviews per device rose during this time (even if one disregards the aforementioned main page increase since the end of July):



New app installations


Android: 19.7k/day (-37.7% from the previous report)

Daily installs per device, from Google Play


Context (last six months):

As described in the previous report, in March a Google bug had accidentally increased the number of visitors to the app’s Play Store listing enormously (spike cut off in this chart). So it’s not surprising that the average install rate over the timespan of this report is much lower. Even disregarding this factor though, the install rates have been decreasing over the last few months and are matched by uninstalls on most days (except for a bump during two weeks in July whose reason is unknown to me). Combined with devices going inactive etc., this has meant that the app’s install base has been shrinking, from 15.5 million on March 31 to 15.1 million today.

On July 28, an updated version of the app with several exciting new features was launched, with a blog post that had very high pageview numbers and was picked up by media outlets in various countries. Unfortunately this public attention has not translated into a lot of downloads: we estimate around 3000 (a few hours’ worth at the normal rate). In comparison, the relaunch of the iOS app in March brought roughly 50k additional installs (see previous report).



iOS: 3.18k/day (-37.2% from the previous report)

Download numbers from App Annie


Context (last six months):

As mentioned, the iOS app’s huge relaunch fell into the timespan of the previous report, so the decrease here is not surprising.

App user retention


Android: 16.7% (previous report: 15.4%)

(Ratio of app installs opened again 7 days after installation, among all installed one week before a date that falls within this report. 1:100 sample)

Context (last six months):

As remarked in earlier reports, this data is a bit too noisy for drawing conclusions about whether retention changed significantly between different releases. But we can at least rule out the existence of major shifts during this timespan.


iOS: N/A


Unique app users


Android: 1.145 million / day (-5.0% from the previous report timespan, reconstructed)


Context (last six months):

Continuing the slight but steady decline that had been interrupted by the aforementioned Google bug windfall in March.


iOS: N/A


While we don’t have total active user numbers available for the iOS app any more, let’s mention an interesting increase that happened in the app’s pageviews back in May, by around 20%. (Because of the app’s small share in overall traffic, it did not affect the global pageviews discussed above in a notable way.) After some investigation, it turned out this coincided with the release of version 5.0.3 of the app, see the yellow slice on the right:

Josh and I are planning to look further into possible causes; at the moment it’s still possible that this is either an artefact or due to real improvements in the app.

----

For reference, the queries and source links used are listed below (access is needed for each). Unless otherwise noted, all content of this report is © Wikimedia Foundation and released under the CC BY-SA 3.0 license. Most of the above charts are also available on Commons, together with PDF versions of this report.



SELECT year, month, day, CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) as date, sum(IF(access_method <> 'desktop', view_count, null)) AS mobileviews, SUM(view_count) AS allviews FROM wmf.projectview_hourly WHERE year>0 AND agent_type = 'user' GROUP BY year, month, day ORDER BY year, month, day LIMIT 1000;


SELECT LEFT(timestamp, 10) AS date, sum(IF(access_method <> 'desktop', pageviews, null)) AS mobileviews, SUM(pageviews) AS allviews FROM staging.pageviews05 WHERE is_spider = FALSE AND is_automata = FALSE GROUP BY date;


SELECT access_method, SUM(view_count)/(7*14) FROM wmf.projectview_hourly WHERE agent_type = 'user' AND CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) BETWEEN "2016-04-25" AND "2016-07-31" GROUP BY access_method;
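The divisor 7*14 in the query above is the number of days in the 14-week report timespan. As a minimal sketch (the helper name days_inclusive is made up for illustration and is not part of the original queries), the following Python snippet checks that this matches the inclusive date range used in the BETWEEN clause:

```python
from datetime import date


def days_inclusive(start: date, end: date) -> int:
    """Number of calendar days from start to end, counting both endpoints
    (SQL BETWEEN is inclusive on both sides)."""
    return (end - start).days + 1


# Dates taken from the BETWEEN clause of the query above.
report_days = days_inclusive(date(2016, 4, 25), date(2016, 7, 31))

# 14 weeks of 7 days each, i.e. the divisor used for the daily average.
assert report_days == 7 * 14
print(report_days)
```

If the report timespan changes, the divisor in the query needs to change with it, otherwise the per-day averages will be off.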


SELECT year, month, day, CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")), SUM(view_count) AS all, SUM(IF (FIND_IN_SET(country_code, 'AD,AL,AT,AX,BA,BE,BG,CH,CY,CZ,DE,DK,EE,ES,FI,FO,FR,FX,GB,GG,GI,GL,GR,HR,HU,IE,IL,IM,IS,IT,JE,LI,LU,LV,MC,MD,ME,MK,MT,NL,NO,PL,PT,RO,RS,RU,SE,SI,SJ,SK,SM,TR,VA,AU,CA,HK,MO,NZ,JP,SG,KR,TW,US') > 0, view_count, 0)) AS Global_North_views FROM wmf.projectview_hourly WHERE year > 0 AND agent_type='user' GROUP BY year, month, day ORDER BY year, month, day LIMIT 1000;


https://console.developers.google.com/storage/browser/pubsite_prod_rev_02812522755211381933/stats/installs/ (“overview”)


https://www.appannie.com/dashboard/252257/item/324715238/downloads/ (select “Total”)


SELECT LEFT(timestamp, 8) AS date, SUM(IF(event_appInstallAgeDays = 0, 1, 0)) AS day0_active, SUM(IF(event_appInstallAgeDays = 7, 1, 0)) AS day7_active FROM log.MobileWikiAppDailyStats_12637385 WHERE userAgent LIKE '%-r-%' AND userAgent NOT LIKE '%Googlebot%' GROUP BY date ORDER BY DATE;

(with the retention rate calculated as day7_active divided by day0_active from seven days earlier, of course)
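To make the calculation described in the note above concrete, here is a rough sketch in Python. It assumes the query result has been loaded into a dict keyed by date (the function name retention_rates and the sample counts are invented for illustration; they are not real data from the report):

```python
from datetime import date, timedelta


def retention_rates(daily):
    """daily maps a date to (day0_active, day7_active).

    Returns, per date, day7_active on that date divided by day0_active
    from seven days earlier. Dates without a usable cohort are skipped.
    """
    rates = {}
    for day, (_, day7_active) in daily.items():
        cohort_day = day - timedelta(days=7)
        if cohort_day not in daily:
            continue  # no install cohort recorded seven days earlier
        day0_active = daily[cohort_day][0]
        if day0_active == 0:
            continue  # avoid division by zero for empty cohorts
        rates[day] = day7_active / day0_active
    return rates


# Invented counts, only to show the shape of the computation.
sample = {
    date(2016, 7, 1): (200, 30),
    date(2016, 7, 8): (180, 34),
}
rates = retention_rates(sample)
print(rates)
```

In this toy input only July 8 gets a rate, because it is the only date with a cohort recorded seven days before it.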


SELECT CONCAT(year,"-",LPAD(month,2,"0"),"-",LPAD(day,2,"0")) as date, unique_count AS Android_DAU FROM wmf.mobile_apps_uniques_daily WHERE platform = 'Android';


SELECT year, month, day, CONCAT(year,'-',LPAD(month,2,'0'),'-',LPAD(day,2,'0')) AS date,
CONCAT(SPLIT(user_agent_map['wmf_app_version'], '\\.')[0],
COALESCE( CONCAT('.', SPLIT(user_agent_map['wmf_app_version'], '\\.')[1]), ''),
COALESCE( CONCAT('.', SPLIT(user_agent_map['wmf_app_version'], '\\.')[2]), ''))
AS app_version,
SUM(view_count) AS views
FROM wmf.pageview_hourly
WHERE year > 0 AND access_method = 'mobile app'
AND user_agent_map['os_family'] = 'iOS' AND agent_type = 'user'
GROUP BY year, month, day, user_agent_map['wmf_app_version']
ORDER BY year, month, day, app_version LIMIT 20000;


--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB