Thanks Max for digging into this :)
I'm no analytics guy, but I am a little concerned about the sample size and
duration of the internal logging that we've done. Sampling 1 in 10,000
requests for only a few days, to measure something whose usage we already
know to be low, seems unlikely to give accurate numbers. Can someone from
the analytics team chime in and let us know whether the approach is sound
and whether we should trust the data Max has come up with? This has big
implications, as it will play a role in determining whether or not we
continue supporting WAP devices and providing WAP access to the sites.
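One rough way to put a number on the sampling concern is a confidence
interval around the sampled proportion. The sketch below is only an
illustration: the Wilson score interval is an assumed choice of method, the
function name is made up, and the inputs are the 68-of-14917 counts from
Max's mail. It covers random sampling error only and says nothing about
whether a few days of traffic are representative.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Counts quoted in Max's mail: 68 WAP-only entries out of 14917 logged requests.
low, high = wilson_interval(68, 14917)
print(f"sampled share {68 / 14917:.2%}, interval {low:.2%} to {high:.2%}")
```

Note that this describes the share within the logged sample (requests
hitting the Apaches), which need not be the same as the share of all mobile
requests.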
On Tue, Sep 3, 2013 at 10:40 AM, Erik Zachte <ezachte(a)wikimedia.org> wrote:
Sadly, you need to take squid-log-based reports with a grain of salt.
Several incomplete maintenance jobs have taken their toll.
Each report starts with a long list of unsolved bugs,
among them https://bugzilla.wikimedia.org/show_bug.cgi?id=46273
So yes, better to trust your own data.
From: analytics-bounces(a)lists.wikimedia.org [mailto:
analytics-bounces(a)lists.wikimedia.org] On Behalf Of Max Semenik
Sent: Tuesday, September 03, 2013 5:33 PM
To: analytics(a)lists.wikimedia.org; Wikimedia developers; mobile-l
Subject: [Analytics] Mobile stats
Hi, I have a few questions regarding mobile stats.
I need to determine a real percentage of WAP browsers. At first glance,
 looks interesting: the ratio of text/vnd.wap.wml to text/html is 92M /
3987M = 2.3% on m.wikipedia.org. However, this contradicts the stats at
 which have different numbers and a different ratio.
I did my own research: because WAPness is detected during browser detection
in Varnish mostly by looking at the Accept header, and because our current
analytics infrastructure doesn't log that header, I quickly whipped up some
code that recorded the User-Agent and Accept headers of every 10,000th
request for mobile page views hitting the Apaches.
According to several days' worth of data, out of 14917 logged requests,
1445 contained vnd.wap.wml in the Accept header in some form. That's more
than what is logged for frontend responses; however, this is expected, as
WAP should have a worse cache hit rate and thus should hit the Apaches more
often.
Next, our WAP detection code is very simple: the user-agent is checked
against a few major browser IDs (all of them are HTML-capable, so this check
is not actually needed anymore and will go away soon) and, if the device is
still not identified, we consider every device whose Accept header contains
"vnd.wap.wml" (but not "application/vnd.wap.xhtml+xml") to be WAP-only. If
we apply these rules, we get only 68 entries that qualify as WAP, which is
0.05% of all mobile requests.
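
For anyone who wants to re-run the classification, the rules just described
can be sketched roughly as follows. This is not the actual Varnish or
MobileFrontend code: the function names are made up, the list of browser IDs
is a placeholder (the real list is not reproduced in this mail), and the
sample records are invented for illustration.

```python
# Placeholder list: the mail only says "a few major browser IDs".
HTML_CAPABLE_UA_MARKERS = ("Android", "iPhone", "Opera Mini")


def is_wap_only(user_agent, accept):
    """Apply the two rules from the mail to one logged request."""
    ua = user_agent.lower()
    # Rule 1: known HTML-capable browsers are never treated as WAP.
    if any(marker.lower() in ua for marker in HTML_CAPABLE_UA_MARKERS):
        return False
    # Rule 2: Accept mentions WML but not XHTML-MP.
    accept = accept.lower()
    return ("vnd.wap.wml" in accept
            and "application/vnd.wap.xhtml+xml" not in accept)


def wap_share(records):
    """Fraction of (user_agent, accept) records classified as WAP-only."""
    records = list(records)
    if not records:
        return 0.0
    return sum(is_wap_only(ua, acc) for ua, acc in records) / len(records)


# Invented sample records, not real log data.
sample = [
    ("Nokia6230/2.0", "text/vnd.wap.wml, image/gif"),
    ("SomePhone/1.0", "application/vnd.wap.xhtml+xml, text/vnd.wap.wml"),
    ("Mozilla/5.0 (Linux; Android 4.1)", "text/html, text/vnd.wap.wml"),
    ("Mozilla/5.0", "text/html"),
]
print(f"WAP-only share of sample: {wap_share(sample):.2%}")
```

The result is a share of the logged sample only; turning it into a share of
all mobile requests would also need the cache hit rates, which are not
logged here.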
The question is, what's wrong: my research or stats.wikimedia.org?
And if it's indeed just 0.05%, we should probably^W definitely kill WAP
support on our mobile site as it's virtually unmaintained.
Max Semenik ([[User:MaxSem]])
Software Engineer, Mobile