According to Dan Garry, the Apps team is now sending sections=all instead of sections=0 on recent iOS app requests. The result is that apps will be underreported, since the existing implementation of the pageview definition does not know this.[0]
I've filed a phabricator ticket,[1] but this is just a note to make sure it's surfaced more widely - pageviews-based requests using the "New" definition are not currently reliable for apps. This is the Nth reminder to !analytics that if you're planning on (a) asking analytics for data and (b) getting useful numbers, it's probably nice to tell them about this sort of change /before/ you make it.
[0] https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-...
[1] https://phabricator.wikimedia.org/T93255
Thanks for sending this, Oliver!
I'll make sure we send quick notes to this list in future for anything we think may affect reports from Analytics.
Dan
On 19 March 2015 at 12:49, Oliver Keyes okeyes@wikimedia.org wrote:
-- Oliver Keyes Research Analyst Wikimedia Foundation
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Thanks! Much appreciated :). A big part of the mea culpa with this one is mine - we had that thread about resolving this for UUIDs, and I was working on pageviews today and went "hmn, I wonder why this isn'-wait. Didn't we have a conversation about this? *pulls up thread* oh %&$^"
(At which point I headdesked at myself ;p)
So: !Analytics, send emails. And Analytics, go "huh, I should....probably...fix this everywhere else too". Or you, too, will headdesk at yourself, and desks are /expensive/ ;p
On 19 March 2015 at 15:54, Dan Garry dgarry@wikimedia.org wrote:
-- Dan Garry Associate Product Manager, Mobile Apps Wikimedia Foundation
On 19 March 2015 at 13:01, Oliver Keyes okeyes@wikimedia.org wrote:
> Thanks! Much appreciated :). A big part of the mea culpa with this one is mine - we had that thread about resolving this for UUIDs, and I was working on pageviews today and went "hmn, I wonder why this isn'-wait. Didn't we have a conversation about this? *pulls up thread* oh %&$^"
I think this is a shared responsibility, actually. The reason we changed from sections=0 to sections=all was an emergency hack to fix the worst bug we've ever encountered. The bug would irrecoverably freeze the app on iOS 7 (the majority of our user base) when you went to commonly viewed pages (e.g. Barack Obama), and the only way to unfreeze it was to totally reinstall the app. We had our heads down in the problem so much that we forgot to reach out about it. Understandable, I think. Also not really excusable.
So, I think the Mobile Apps Team can be more timely about this sort of thing in the future, and I'll try to make sure that we are. :-)
Dan
Note that sections=all should always be considered a pageview no matter what, because that's what it is:)
On 19 March 2015 at 13:11, Max Semenik maxsem.wiki@gmail.com wrote:
> Note that sections=all should always be considered a pageview no matter what, because that's what it is:)
For the apps, not necessarily!
The apps use a sections=all query to refresh saved pages, because that way we've only got the overhead of a single HTTP request per page. Since the user isn't actually viewing the page when it's requested, a decision was made not to count it as a "page view". That's the reason why we stuck to sections=0 for the apps, because it was a unique API signature that corresponded to normal browsing but *not* to saved pages refreshing.
If you have a better suggestion for any of this, let me know and we can talk about that.
Dan
Wait. So, sections=all is only /sometimes/ a pageview? ;) When?
On 19 March 2015 at 13:19, Oliver Keyes okeyes@wikimedia.org wrote:
> Wait. So, sections=all is only /sometimes/ a pageview? ;) When?
If we're talking about Android, it's my understanding that nothing has changed from before. sections=0 is a page view. sections=all is not, because it's used for saved pages updates.
If we're talking about iOS, as of right now, both sections=all and sections=0 are page views. Neither kind of query is used for anything else, because saved pages updates are not in production.
Our next release for iOS, targeted for 30th March, will include a saved pages update feature. I'm unsure of the structure of the query used to refresh saved pages. The iOS tech lead, Adam, should be able to shed light on this.
Dan
Okay. So, we're going to have to add an element of UA detection, then! That...should be doable pretty trivially.
I have to make some changes to just this function for the session analysis (namely, breaking it out of its current scope and writing a UDF around it) so I'll incorporate this work into that.
(What I'd like from the Apps team to make this as efficient as possible:
1. Example URLs/UAs for each possible permutation {{sections=0,sections=all}{iOS/Android}};
2. An idea (as suggested by Dan) of what the auto-update will mean. Is Adam on this list or should I reach out distinctly?
3. Links to the pertinent gerrit patches that instituted the auto-update thing and the iOS bugfix, so I can establish (in the logs of the pageview def update) what the timeline is/was like here.)
I've added Dmitry and Bernd to this thread. Guys, you probably want to get on this list if you're not already on it.
Here's the iOS part for refreshing (patches 1 https://gerrit.wikimedia.org/r/#/c/191359/, 2 https://gerrit.wikimedia.org/r/#/c/192973/, 3 https://gerrit.wikimedia.org/r/#/c/192993/, 4 https://gerrit.wikimedia.org/r/#/c/196080/). Suppose I have "Derek Charke" and "Darya Khan railway station" in my Saved pages list. Then when I tap the refresh button, the following requests are made automatically, with a User-Agent header of the following form:
User-Agent: WikipediaApp/4.0.7 (iPhone OS 8.2; Phone)
Requests...these use the ArticleFetcher class, which is the same class used for fetching articles even outside of a Saved pages refresh (i.e., garden variety article browsing).
Derek Charke
https://en.m.wikipedia.org/w/api.php?action=mobileview&appInstallID=<somevalue>&format=json&noheadings=true&page=Derek%20Charke&prop=sections%7Ctext%7Clastmodified%7Clastmodifiedby%7Clanguagecount%7Cid%7Cprotection%7Ceditable%7Cdisplaytitle%7Cthumb%7Cdescription%7Cimage&sectionprop=toclevel%7Cline%7Canchor%7Clevel%7Cnumber%7Cfromtitle%7Cindex&sections=all&thumbwidth=640
Darya Khan railway station
https://en.m.wikipedia.org/w/api.php?action=mobileview&appInstallID=<somevalue>&format=json&noheadings=true&page=Darya%20Khan%20railway%20station&prop=sections%7Ctext%7Clastmodified%7Clastmodifiedby%7Clanguagecount%7Cid%7Cprotection%7Ceditable%7Cdisplaytitle%7Cthumb%7Cdescription%7Cimage&sectionprop=toclevel%7Cline%7Canchor%7Clevel%7Cnumber%7Cfromtitle%7Cindex&sections=all&thumbwidth=640
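For anyone wanting to poke at requests like these, here is a minimal Python sketch of pulling the relevant parameters out of a mobileview URL. It is only an illustration, not the refinery code: the function name `mobileview_params` is made up, and the example URL is a shortened form of the Derek Charke request above (the appInstallID and most of the prop values are left out for brevity).

```python
from urllib.parse import urlsplit, parse_qs


def mobileview_params(url):
    """Return the query parameters of an API request as a dict of single values.

    Hypothetical helper for illustration; parse_qs percent-decodes the values,
    so %20 becomes a space and %7C becomes '|'.
    """
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


# Shortened version of the "Derek Charke" refresh request from the thread.
example = (
    "https://en.m.wikipedia.org/w/api.php?action=mobileview"
    "&format=json&noheadings=true&page=Derek%20Charke"
    "&prop=sections%7Ctext&sectionprop=toclevel%7Cline"
    "&sections=all&thumbwidth=640"
)

params = mobileview_params(example)
print(params["action"], params["page"], params["sections"])
# prints: mobileview Derek Charke all
```

Note that nothing in the URL itself says whether this was a Saved pages refresh or ordinary browsing, which is the crux of the problem discussed in this thread.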
Oh, and the patch that introduced the sections=all stuff...here's the change from what I recall.
https://gerrit.wikimedia.org/r/#/c/176491/2/wikipedia/Networking/Fetchers/Ar...
+Bernd, Dmitry, in case they weren't added on the earlier email for some reason. Please see below. +Monte
Okay. So, to summarise:
1. Android uses sections=all for refreshes
2. iOS uses sections=all for pageviews, generally
3. iOS is shortly to also use sections=all for refreshes
...1, 2 and 3 will all look the same, minus UA differences between (1) and (2,3)
IOW, there is shortly to be no visible difference, from server-side, between a "refresh" and a "pageview", for iOS users. Do I understand correctly? If so, I have some ideas for how we could mitigate this problem and make pageviews viable again *purses fingers*.
On 19 March 2015 at 19:06, Oliver Keyes okeyes@wikimedia.org wrote:
> Okay. So, to summarise:
> 1. Android uses sections=all for refreshes
> 2. iOS uses sections=all for pageviews, generally
> 3. iOS is shortly to also use sections=all for refreshes
Yes, with the exception that there will always be some people who have not updated their iOS app and will continue to use sections=0 for their page views. Alas, it is an app, so this happens.
> ...1, 2 and 3 will all look the same, minus UA differences between (1) and (2,3)
> IOW, there is shortly to be no visible difference, from server-side, between a "refresh" and a "pageview", for iOS users. Do I understand correctly? If so, I have some ideas for how we could mitigate this problem and make pageviews viable again *purses fingers*.
I believe so.
What are your ideas?
Dan
As I see it, we basically have two possibilities here:
1. Make the URLs distinguishable;
2. Add additional metadata in a non-URL place
1 is undesirable because it ruins caching, and we like caching. So we look at 2, which realistically means the x_analytics field. Why don't we add a parameter there? refresh=1. And then, our app check boils down to (in pseudocode):
    if (other_checks && urlContains("sections=(0|all)") && !xAnalyticsContains("refresh")) {
        return true;
    }
    return false;
Nice and simple and easy. It'll require some coordination with Ottomata because it means modifying the UDF parameters, and we're using said UDF in production so it'll have to be synced to a change in the relevant Oozie job, but it should be totally doable, and I can't see an easier way of doing it.
Thoughts, people?
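To make the proposal concrete, here is a rough Python sketch of that check. It is not the actual UDF (which lives in the Java refinery code); the names `parse_x_analytics` and `is_app_pageview` are invented for illustration, `other_checks` is a stand-in boolean for the rest of the pageview definition, and the sketch assumes x_analytics holds semicolon-separated key=value pairs.

```python
import re

# Matches sections=0 or sections=all as a complete query parameter.
SECTIONS_RE = re.compile(r"(?:^|[?&])sections=(?:0|all)(?:&|$)")


def parse_x_analytics(header):
    """Split an x_analytics value into a dict.

    Assumes a 'key=value;key=value' layout; empty items are skipped.
    """
    fields = {}
    for item in header.split(";"):
        key, _, value = item.strip().partition("=")
        if key:
            fields[key] = value
    return fields


def is_app_pageview(uri_query, x_analytics, other_checks=True):
    """Sketch of the proposed check: a sections=0/all request without a refresh flag."""
    if not other_checks:
        return False
    if not SECTIONS_RE.search(uri_query):
        return False
    # A request flagged with refresh (e.g. refresh=1) is a Saved pages update.
    return "refresh" not in parse_x_analytics(x_analytics)


print(is_app_pageview("?action=mobileview&sections=all", ""))           # True
print(is_app_pageview("?action=mobileview&sections=all", "refresh=1"))  # False
```

Because the flag travels in x_analytics rather than in the URL, a refresh and a normal view would still share one URL, so option 1's caching concern does not arise.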
On Thu, Mar 19, 2015 at 8:06 PM, Oliver Keyes okeyes@wikimedia.org wrote:
> As I see it, we basically have two possibilities here:
> 1. Make the URLs distinguishable;
> 2. Add additional metadata in a non-URL place
> 1 is undesirable because it ruins caching, and we like caching. So we look at 2, which realistically means the x_analytics field. Why don't we add a parameter there? refresh=1. And then, our app check boils down to (in pseudocode):
>     if (other_checks && urlContains("sections=(0|all)") && !xAnalyticsContains("refresh")) {
>         return true;
>     }
>     return false;
> Nice and simple and easy. It'll require some coordination with Ottomata because it means modifying the UDF parameters, and we're using said UDF in production so it'll have to be synced to a change in the relevant Oozie job, but it should be totally doable, and I can't see an easier way of doing it.
> Thoughts, people?
Why use anything other than X-Analytics at all? Source of the pageview is exactly the kind of information it is meant for. Just set source=AndroidAppPageView / source=IosAppSectionView / source=AndroidAppSavedPageRefresh etc. (Or set up a mapping to three-letter acronyms if you want to be nice on the servers.) That puts logging completely in the hands of the app developers, so there are no information flow problems and less organizational overhead; it also makes rules more explicit (and thus harder to mess up / easier to spot errors).
In the long-term we'll just be relying on x_analytics, yes, because the app will be sending through the pageID and namespace in the same way as desktop and the mobile web do; so it'll be pageid=50;ns=0;refresh=1 or pageid=50;ns=0, respectively. That's a different patch. In the short-term I'm not seeing how building out a complex ecosystem for this is a valuable use of time given that we know what we'll be switching to (and that it's standardised not just across apps, but also for desktop and the mobile web, to boot). All we care about distinguishing that we won't get for free as part of already-scheduled work is refreshes from pageviews.
Just a note that I've started on a related patch now, using the (we now know, not-futureproof) logic of:
1. if it's got sections=0 it's a pageview;
2. if it's got sections=all and is from iOS it's a pageview.
Would appreciate some feedback on the idea of sending refresh=1 in the x_analytics header, so we know what to expect there.
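For reference, the interim logic could look roughly like the Python sketch below. This is an illustration only, not the patch itself: the function names are invented, and the iOS detection is an assumption based solely on the one User-Agent quoted earlier in this thread (WikipediaApp/4.0.7 (iPhone OS 8.2; Phone)), so other iOS devices may well need a broader pattern.

```python
import re
from urllib.parse import parse_qs

# Assumption: iOS app requests carry a UA shaped like the example in this
# thread, "WikipediaApp/4.0.7 (iPhone OS 8.2; Phone)".
IOS_APP_UA_RE = re.compile(r"^WikipediaApp/.*\(iPhone OS ")


def is_ios_app(user_agent):
    """Return True if the UA looks like the iOS app example from the thread."""
    return bool(IOS_APP_UA_RE.search(user_agent))


def interim_is_app_pageview(uri_query, user_agent):
    """Interim rule: sections=0 always counts; sections=all counts only on iOS."""
    sections = parse_qs(uri_query.lstrip("?")).get("sections", [""])[0]
    if sections == "0":
        return True
    if sections == "all" and is_ios_app(user_agent):
        return True
    return False


ios_ua = "WikipediaApp/4.0.7 (iPhone OS 8.2; Phone)"
print(interim_is_app_pageview("?action=mobileview&sections=all", ios_ua))  # True
```

As noted, this stops being correct once the iOS release with Saved pages refreshing ships, since those refreshes will also be sections=all requests from an iOS UA.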
On 20 March 2015 at 00:44, Oliver Keyes okeyes@wikimedia.org wrote:
In the long-term we'll just be relying on x_analytics, yes, because the app will be sending through the pageID and namespace in the same way as desktop and the mobile web do; so it'll be pageid=50;ns=0;refresh-1 or pageid=50;ns=0, respectively. That's a different patch. In the short-term I'm not seeing how building out a complex ecosystem for this is a valuable use of time given that we know what we'll be switching to (and that it's standardised not just across apps, but also for desktop and the mobile web, to boot). All we care about distinguishing that we won't get for free as part of already-scheduled work is refreshes from pageviews.
On 19 March 2015 at 23:59, Gergo Tisza gtisza@wikimedia.org wrote:
On Thu, Mar 19, 2015 at 8:06 PM, Oliver Keyes okeyes@wikimedia.org wrote:
As I see it, we basically have two possibilities here:
- Make the URLs distinguishable;
- Add additional metadata in a non-URL place
1 is undesirable because it ruins caching, and we like caching. So we look at 2, which realistically means the x_analytics field. Why don't we add a parameter there? refresh=1. And then, our app check boils down to (in pseudocode):
if(other_checks & urlContains("sections=(0|all)" & !xAnalyticsContains("refresh")){ return true; } return false;
Nice and simple and easy. It'll require some coordination with Ottomata because it means modifying the UDF parameters, and we're using said UDF in production so it'll have to be synced to a change in the relevant Oozie job, but it should be totally doable, and I can't see an easier way of doing it.
Thoughts, people?
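A runnable approximation of that check, for illustration: the function name and the other_checks argument are placeholders for whatever the real UDF does, and the regular expression is slightly stricter than the pseudocode so that values such as sections=01 do not match.

```python
import re

# Match sections=0 or sections=all as a whole query parameter.
SECTIONS = re.compile(r"(?:^|[?&])sections=(?:0|all)(?:&|$)")


def counts_as_pageview(url, x_analytics, other_checks=True):
    """App pageview check: right sections value and no refresh field set."""
    has_refresh = any(
        field.strip().split("=", 1)[0] == "refresh"
        for field in x_analytics.split(";")
    )
    return bool(other_checks and SECTIONS.search(url) and not has_refresh)
```

Because the refresh marker lives in the header rather than the URL, the cached URL stays identical for pageviews and refreshes, which is the point of option 2.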
Why use anything other than X-Analytics at all? Source of the pageview is exactly the kind of information it is meant for. Just set source=AndroidAppPageView / source=IosAppSectionView / source=AndroidAppSavedPageRefresh etc. (Or set up a mapping to three-letter acronyms if you want to be nice on the servers.) That puts logging completely in the hands of the app developers, so there are no information flow problems and less organizational overhead; it also makes rules more explicit (and thus harder to mess up / easier to spot errors).
Either use of a distinct header (e.g., X-WMF-Refresh: 1) or use of a distinct parameter (e.g., wprov=rfsi1 for "refresh from saved pages on iOS, version 1") from the client would work. The use of a distinct parameter is much easier.
Then the VCL code could be updated to look for the field and enrich the X-Analytics header accordingly.
Okay?
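The enrichment itself would be written in VCL; the Python below only models the intended logic under the distinct-header option. The function name is hypothetical, and the refresh=1 field format is taken from the earlier proposal in this thread.

```python
def enrich_x_analytics(request_headers, x_analytics=""):
    """Append refresh=1 to X-Analytics when the client sent X-WMF-Refresh: 1."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    headers = {name.lower(): value.strip() for name, value in request_headers.items()}
    if headers.get("x-wmf-refresh") != "1":
        return x_analytics
    fields = [field for field in x_analytics.split(";") if field]
    if "refresh=1" not in fields:
        fields.append("refresh=1")
    return ";".join(fields)
```

For example, a request carrying X-WMF-Refresh: 1 and an existing value of pageid=50;ns=0 would come out as pageid=50;ns=0;refresh=1, while requests without the header pass through unchanged.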
On Fri, Mar 20, 2015 at 11:37 AM, Oliver Keyes okeyes@wikimedia.org wrote:
Just a note that I've started on a related patch now, using the (we now know, not-futureproof) logic of:
- if it's got sections=0 it's a pageview;
- if it's got sections=all and is from iOS it's a pageview.
Would appreciate some feedback on the idea of sending refresh=1 in the x_analytics header, so we know what to expect there.
Sounds good to me, although I'd just go for the first option. There's nothing we need in the second option that isn't also in the first (that is: yes, it has device version and OS and all of that, which we need for other things, but then so does the user agent, so it's probably extraneous work to include all of that a /second/ time).
On 20 March 2015 at 15:42, Adam Baso abaso@wikimedia.org wrote:
Either use of a distinct header (e.g., X-WMF-Refresh: 1) or use of a distinct parameter (e.g., wprov=rfsi1 for "refresh from saved pages on iOS, version 1") from the client would work. The use of a distinct parameter is much easier.
Then the VCL code could be updated to look for the field and enrich the X-Analytics header accordingly.
Okay?
Adam: Thanks for including us Android devs. I think a distinct header is probably preferable.
Oliver: I think you already got this, so just to confirm: for Android you can use sections=0 for counting pageviews, and sections=all for counting saved page refreshes.
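A minimal sketch of that Android rule, assuming the request has already been identified as coming from the Android app; the function name and the returned labels are made up for the example.

```python
from urllib.parse import urlparse, parse_qs

# Android rule as confirmed above: sections=0 is a pageview, sections=all is a
# saved page refresh. The label strings are illustrative only.
ANDROID_SECTION_KINDS = {"0": "pageview", "all": "saved_page_refresh"}


def classify_android_request(url):
    """Return 'pageview', 'saved_page_refresh', or None for anything else."""
    sections = parse_qs(urlparse(url).query).get("sections", [None])[0]
    return ANDROID_SECTION_KINDS.get(sections)
```

Note that this differs from iOS, where sections=all is being sent for ordinary pageviews, so the platform has to be known before the sections value can be interpreted.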
As a heads-up, we're experimenting with Node.js/RESTBase services. Not sure when and if those will be used in production. We're at an early stage. Just wanted to mention this, since it will change the way we request pages from the server significantly (actually it would be from different servers, too). That's probably another pro for using HTTP request headers.
The downside is that we're not using this special request header yet.
Bernd
On Fri, Mar 20, 2015 at 3:17 PM, Oliver Keyes okeyes@wikimedia.org wrote:
Sounds good to me, although I'd just go for the first option. There's nothing we need in the second option that isn't also in the first (that is: yes, it has device version and OS and all of that, which we need for other things, but then so does the user agent, so it's probably extraneous work to include all of that a /second/ time).
Okay. So, distinct header, registering refresh=1 or [nothing] in the x-analytics field when it hits the varnish layer? Rocking! Let me know what happens/where it goes/when it goes/etc.
The RESTbase services, yeah; they're going to make an impact. Not just in format either (I have literally no idea how the caching setup there works, if there is a caching setup, or which varnish cluster any such caching setup would go through and so if we're even picking up those requests at all).
On 20 March 2015 at 19:11, Bernd Sitzmann bernd@wikimedia.org wrote:
Adam: Thanks for including us Android devs. I think a distinct header is probably preferable.
Oliver: I think you already got this, so just to confirm: for Android you can use sections=0 for counting pageviews, and sections=all for counting saved page refreshes.
As a heads-up, we're experimenting with Node.js/RESTBase services. Not sure when and if those will be used in production. We're at an early stage. Just wanted to mention this, since it will change the way we request pages from the server significantly (actually it would be from different servers, too). That's probably another pro for using HTTP request headers.
The downside is that we're not using this special request header yet.
Bernd
Alrighty; existing implementation updated with https://gerrit.wikimedia.org/r/#/c/198489/ which also exposes the isAppPageview method and associates a UDF with it (Marcel, this means you'll be able to do your stuff. Rockin'!)
Let me know when y'all have the necessary varnish and app changes proposed/committed/merged/etc so I can work on the updates to this method and the documentation in parallel.
On 20 March 2015 at 21:11, Oliver Keyes okeyes@wikimedia.org wrote:
Okay. So, distinct header, registering refresh=1 or [nothing] in the x-analytics field when it hits the varnish layer? Rocking! Let me know what happens/where it goes/when it goes/etc.
The RESTbase services, yeah; they're going to make an impact. Not just in format either (I have literally no idea how the caching setup there works, if there is a caching setup, or which varnish cluster any such caching setup would go through and so if we're even picking up those requests at all).
Awesome Oliver, thanks! Having a look.
On Sat, Mar 21, 2015 at 4:32 PM, Oliver Keyes okeyes@wikimedia.org wrote:
Alrighty; existing implementation updated with https://gerrit.wikimedia.org/r/#/c/198489/ which also exposes the isAppPageview method and associates a UDF with it (Marcel, this means you'll be able to do your stuff. Rockin'!)
Let me know when y'all have the necessary varnish and app changes proposed/committed/merged/etc so I can work on the updates to this method and the documentation in parallel.