Hey all,
The pageviews stored at stats.wikimedia.org and the Vital Signs dashboards showed a substantial drop in pageviews to Wikimedia Commons, primarily from mobile, beginning on 1 January 2015. I was tasked with investigating and I'm reporting what I found so that we have a note of the problems this brings up.
From an investigation of requests to that site at that time, it appears that this is a perfect storm of known deficiencies in the legacy pageviews definition, fundraising changes, and mobile changes. To summarise:
1. The legacy Pageviews definition contains Special pages, including Special:BannerRandom and Special:HideBanner;
2. The mobile website was historically loading things from Commons in such a way as to trigger calls to Special:HideBanner, which were picked up by the legacy definition as "pageviews to commons";
3. The Mobile team deployed changes to their image loading setup at the end of December that stopped this from happening, and that coincided with the disabling of the Fundraising primary campaign.
4. The result of this was an apparent massive drop in traffic to Commons from the mobile site - when the actual inaccuracy was the inclusion of that traffic in the first place.
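To make the mechanism in the summary concrete, here is a minimal sketch of how counting banner-related Special pages inflates a pageview total. It is not the actual legacy or new definition: the "any /wiki/ title counts" rule, the function names, and the sample URLs (including their query strings) are all assumptions made up for illustration. Only the two Special page names come from the summary above.

```python
from typing import Optional
from urllib.parse import unquote, urlsplit

# Banner-related Special pages named in the summary above.
BANNER_SPECIAL_PAGES = {"Special:BannerRandom", "Special:HideBanner"}


def page_title(url: str) -> Optional[str]:
    """Return the title from a /wiki/<title> URL, or None for any other path."""
    path = unquote(urlsplit(url).path)
    prefix = "/wiki/"
    if not path.startswith(prefix):
        return None
    return path[len(prefix):] or None


def is_legacy_pageview(url: str) -> bool:
    """Assumed stand-in for the legacy rule: every /wiki/ title is a pageview."""
    return page_title(url) is not None


def is_filtered_pageview(url: str) -> bool:
    """Same assumed rule, but banner-related Special pages are excluded."""
    title = page_title(url)
    return title is not None and title not in BANNER_SPECIAL_PAGES


# Hypothetical request URLs, invented for this example only.
requests = [
    "https://commons.wikimedia.org/wiki/Main_Page",
    "https://commons.wikimedia.org/wiki/Special:HideBanner?duration=604800",
    "https://commons.wikimedia.org/wiki/Special:BannerRandom?uselang=en",
    "https://commons.wikimedia.org/wiki/File:Example.jpg",
]

legacy_count = sum(is_legacy_pageview(u) for u in requests)
filtered_count = sum(is_filtered_pageview(u) for u in requests)
print(legacy_count, filtered_count)
```

Under these assumptions, the first count drops as soon as clients stop issuing the banner requests, even though nobody is viewing fewer pages, while the second count is unaffected.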
There are several lessons to be learned from this. First, it is worth reiterating the deficiencies and inaccuracies inherent in the legacy pageview definition, many (but certainly not all) of which centre on how it treats the fundraising banners. We are working as rapidly as we can to completely deprecate this definition, replacing it with a new one which is not subject to this kind of variation. We are currently in the middle of performing final QA testing on the new definition: once it is satisfactory, we will deploy it as soon as humanly possible and deprecate the legacy definition.
Second, let me emphasise how critical it is that the teams building MediaWiki and our instances of it - Platform, Operations, Mobile, you name it - keep us in the loop about changes that they make. This was a very dramatic shift in client logic around requests: it flew under our radar. We should have a process in place for letting Analytics know about these changes before they happen so that we do not end up with inaccurate data and a constant game of catchup.
Thanks,
Thanks for taking the time to write this up, Oliver!
Dario, Dan and I are going to work on how we might use Scrum of Scrums to get changes like this on the radar and find a way to communicate the impact of these changes when they happen under the radar.
On Fri, Feb 6, 2015 at 1:04 PM, Oliver Keyes okeyes@wikimedia.org wrote:
-- Oliver Keyes Research Analyst Wikimedia Foundation
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
This is probably me dropping the ball if people mentioned it at the Scrum of Scrums. I'll try to be more attentive in the future.
On Fri, Feb 6, 2015 at 4:29 PM, Grace Gellerman ggellerman@wikimedia.org wrote:
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Yes, this has been an issue before. Squid-log-based reports have filtered out these banners for years, but that only started after a similar distortion became very apparent and a lot of data needed to be repaired.
-----Original Message-----
From: analytics-bounces@lists.wikimedia.org [mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Oliver Keyes
Sent: Friday, February 06, 2015 22:04
To: A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics.
Subject: [Analytics] Drop in Commons mobile traffic - a diagnosis
_______________________________________________ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
*nods glumly*. Every day we turn over a rock and find a new ants' nest. I'm waiting for the day, probably about 30 seconds after we declare it The Definitive Pageviews Definition, when we start finding flaws in the new one ;p.
I feel like we should append to the end of the current Rules of Analytics something like "Nothing will make you more cynical about a class of metrics than trying to implement them".
On 6 February 2015 at 17:22, Erik Zachte ezachte@wikimedia.org wrote:
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
I think an important thing for the new definition is that it is easy to update, as any definition we come up with will need updating and maintenance.
On Feb 6, 2015, at 2:56 PM, Oliver Keyes okeyes@wikimedia.org wrote:
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Thanks for the detailed analysis, Oliver.
I think the page view definition is always going to be somewhat fungible. I suspect that when people start watching the impact of new features on the numbers, we'll get more attention.
-Toby
On Fri, Feb 6, 2015 at 3:05 PM, Nuria nuria@wikimedia.org wrote:
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
I think the page view definition is always going to be somewhat fungible. I suspect that when people start watching the impact of new features on the numbers, we'll get more attention.
We're digitizing reality; perfect is not an option. Nuria's totally right - agility is what we should strive for.
I raised this at the scrum of scrums today. The thing to realize is that it's very hard to communicate about these kinds of changes cross-functionally. So Ryan, who represents Mobile, was not aware of the change that affected hits to Special:HideBanner. I asked if people could think about this as an example and use it to inform the kinds of things they bring up at the scrum of scrums in the future, but we all have to realize that this is hard to do.
One approach would be to create a "Maybe-Analytics" project in Phabricator and have people use it liberally. We could then take turns going through that queue and untagging tasks if we don't see a connection or have completed our investigation. Kind of a higher-volume, lower-fidelity version of scrum of scrums that we could incorporate into the organization's daily mode of operation [1].
[1] yes I don't like speaking in dead languages :P
On Sat, Feb 7, 2015 at 2:06 AM, Federico Leva (Nemo) nemowiki@gmail.com wrote:
Oliver Keyes, 06/02/2015 22:04:
- Platform, Operations, Mobile, you name it - keep us in the loop about changes that they make.
Or, you know, made their code load stuff from w/index.php.
Nemo
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Thanks for raising this issue at Scrum of Scrums! I am a big fan of cross-functional communication ;)
I like the idea of the Maybe Analytics proj in Phab.
On Wed, Feb 11, 2015 at 11:05 AM, Dan Andreescu dandreescu@wikimedia.org wrote:
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics