Heya, I would suggest running it for at least a 7-day period so you capture the weekly time trends; increasing the sample size would also be advisable. We can help set up a udp-filter for this purpose as long as the data can be extracted from the user-agent string.
D
On Wed, Sep 4, 2013 at 1:50 PM, Arthur Richards arichards@wikimedia.org wrote:
Thanks Max for digging into this :)
I'm no analytics guy, but I am a little concerned about the sample size and duration of the internal logging that we've done. Sampling 1/10000 for only a few days, for something whose usage we already know to be low, seems to me like it might make it difficult to get accurate numbers. Can someone from the analytics team chime in and let us know if the approach is sound and if we should trust the data Max has come up with? This has big implications, as it will play a role in determining whether or not we continue supporting WAP devices and providing WAP access to the sites.
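One way to put a number on the sample-size concern is a confidence interval for a proportion. The following is a minimal sketch in Python, assuming the sampled requests behave like independent draws (which a few days of 1-in-10,000 sampling may not fully satisfy); the function name `wilson_interval` is made up for this illustration, and the counts are the ones quoted further down in this thread.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Return the (low, high) Wilson score interval for a proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    )
    return centre - half_width, centre + half_width


# Counts taken from Max's message below: 68 WAP-only entries
# out of 14917 logged requests.
low, high = wilson_interval(68, 14917)
print(f"WAP-only share: between {low:.2%} and {high:.2%}")
```

The interval only reflects sampling noise. It says nothing about bias from logging on a few particular days, which is why a full week of data would still help.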
Thanks everyone!
On Tue, Sep 3, 2013 at 10:40 AM, Erik Zachte ezachte@wikimedia.org wrote:
Sadly you need to take squid log based reports with a grain of salt. Several incomplete maintenance jobs have taken their toll.
Each report starts with a long list of unsolved bugs. Among those: https://bugzilla.wikimedia.org/show_bug.cgi?id=46273
So yeah, better trust your own data.
Erik
-----Original Message-----
From: analytics-bounces@lists.wikimedia.org [mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Max Semenik
Sent: Tuesday, September 03, 2013 5:33 PM
To: analytics@lists.wikimedia.org; Wikimedia developers; mobile-l
Subject: [Analytics] Mobile stats
Hi, I have a few questions regarding mobile stats.
I need to determine the real percentage of WAP browsers. At first glance, [1] looks interesting: the ratio of text/vnd.wap.wml to text/html is 92M / 3987M = 2.3% on m.wikipedia.org. However, this contradicts the stats at [2], which have different numbers and a different ratio.
I did my own research: because WAPness is detected during browser detection in Varnish mostly by looking at the Accept header, and because our current analytics infrastructure doesn't log it, I quickly whipped up some code that recorded the User-Agent and Accept headers of every 10,000th request for mobile page views hitting the apaches.
According to several days' worth of data, out of 14917 logged requests, 1445 contained vnd.wap.wml in their Accept: headers in any form. That's more than what is logged for frontend responses; however, it is expected, as WAP should have a worse cache hit rate and thus should hit the apaches more often.
Next, our WAP detection code is very simple: the user-agent is checked against a few major browser IDs (all of them are HTML-capable, so this check is not actually needed anymore and will go away soon) and, if the device is still not known, we consider every device that sends "vnd.wap.wml" (but not "application/vnd.wap.xhtml+xml") in its Accept: header to be WAP-only. If we apply these rules, we get only 68 entries that qualify as WAP, which is 0.05% of all mobile requests.
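For illustration, 1-in-10,000 sampling of this kind could look roughly like the sketch below. This is not the code that actually ran on the apaches: it is written in Python, it assumes a simple request counter rather than random sampling, and the names `sample_requests` and `SAMPLE_RATE` as well as the dict-of-headers input are made up for the example.

```python
SAMPLE_RATE = 10_000  # keep one request out of every 10,000


def sample_requests(requests, rate=SAMPLE_RATE):
    """Yield (user_agent, accept) for every `rate`-th request.

    `requests` is any iterable of dicts keyed by HTTP header name;
    that input shape is an assumption made for this sketch.
    """
    for index, headers in enumerate(requests, start=1):
        if index % rate == 0:
            # Missing headers are recorded as empty strings.
            yield headers.get("User-Agent", ""), headers.get("Accept", "")
```

A counter keeps the sketch deterministic; picking each request with probability 1/10,000 would give the same expected volume without depending on request order.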
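The rule described above can be sketched as follows. This is a Python approximation and not the production detection code: the contents of `KNOWN_HTML_BROWSERS` are placeholders (the thread does not list the actual browser IDs), and plain substring matching on the Accept header is an assumption.

```python
# Placeholder IDs for this sketch; the real list is not given in the thread.
KNOWN_HTML_BROWSERS = ("iphone", "android", "opera mini")


def is_wap_only(user_agent: str, accept: str) -> bool:
    """Classify a request as WAP-only using the two-step rule above."""
    ua = user_agent.lower()
    # Step 1: major browsers are HTML-capable, so they are never WAP-only.
    if any(browser in ua for browser in KNOWN_HTML_BROWSERS):
        return False
    # Step 2: WAP-only means the device advertises WML but not XHTML MP.
    accept = accept.lower()
    return (
        "vnd.wap.wml" in accept
        and "application/vnd.wap.xhtml+xml" not in accept
    )
```

Counting the logged (user-agent, accept) pairs for which this returns true and dividing by the total gives the WAP-only share of the sample. For reference, 68 out of 14917 is about 0.46% of the sampled requests, so the 0.05% figure may rest on a different denominator; the thread does not say.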
The question is, what's wrong: my research or stats.wikimedia.org?
And if it's indeed just 0.05%, we should probably^W definitely kill WAP support on our mobile site as it's virtually unmaintained.
[1] http://stats.wikimedia.org/wikimedia/squids/SquidReportRequests.htm
[2] http://stats.wikimedia.org/wikimedia/squids/SquidReportClients.htm
-- Best regards, Max Semenik ([[User:MaxSem]])
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
-- Arthur Richards Software Engineer, Mobile [[User:Awjrichards]] IRC: awjr +1-415-839-6885 x6687