It's not that challenging; Aaron and I developed a fairly robust way of doing it that Mikhail and I are refining. It's just not easy to do without, say, a dedicated EL schema that somebody (probably readership?) would own and surface data from.
On 18 September 2015 at 13:14, Gabriel Wicke gwicke@wikimedia.org wrote:
This discussion also reminds me of the idea of tracking time spent on site. Arguably, that's a more relevant measurement for how much of our content people actually consume, and it also neatly side-steps issues like the categorization of link previews. I realize that measuring that accurately can be challenging, but I think it'll become more and more important as we venture into more dynamic content experiences.
On Thu, Sep 17, 2015 at 8:17 AM, Oliver Keyes okeyes@wikimedia.org wrote:
Danke!
On 17 September 2015 at 11:15, Nuria Ruiz nuria@wikimedia.org wrote:
Right! Thanks for pointing that out.
I think I have updated all docs now: https://meta.wikimedia.org/wiki/Research:Page_view#Change_log
https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters
On Thu, Sep 17, 2015 at 7:36 AM, Oliver Keyes okeyes@wikimedia.org wrote:
Have those changes been noted on the main pageview definition page and associated changelog?
On 17 September 2015 at 09:58, Nuria Ruiz nuria@wikimedia.org wrote:
> With more ways of viewing content, it is going to get harder and harder to maintain a pattern based definition.
Indeed, we want to move away from a pattern based definition as much as possible.
This is an FYI to everyone: with our latest changes (which we are in the process of deploying today), a request that comes "tagged" with "preview" in the X-Analytics header will not be counted as a pageview. The Android app should make the corresponding change and add the "preview" tag to its preview requests.
The X-Analytics header is documented here: https://wikitech.wikimedia.org/wiki/X-Analytics
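For illustration only, here is a minimal sketch of the preview exclusion described above. It assumes the header value is a semicolon-separated list of key=value pairs and that the tag shows up as a `preview` key; the function names are hypothetical and this is not the deployed Analytics code.

```python
def parse_x_analytics(header):
    """Parse a value like 'k1=v1;k2=v2' into a dict.

    Assumption: fields are separated by ';' and keys from values by '='.
    A bare key with no '=' maps to an empty string.
    """
    fields = {}
    for part in header.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def is_preview(header):
    """Return True when the request carries the (assumed) 'preview' key."""
    return "preview" in parse_x_analytics(header)
```

A request with `page_id=123;preview=1` would be treated as a preview and left out of the pageview count, while `page_id=123` alone would not. This covers only the preview tag, not the rest of the pageview definition.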
On Wed, Aug 19, 2015 at 7:19 AM, Andrew Otto aotto@wikimedia.org wrote:
> If we /do/ include RESTBase requests we will not only have to rewrite the pageview definition for the apps to recognise the new URL scheme
I really think that apps and APIs should do something proactive to tag or log a pageview. With more ways of viewing content, it is going to get harder and harder to maintain a pattern based definition. A pageview should be an event that is logged, not something that is pattern matched out of a very noisy stream of data.
Most MediaWiki requests do this now, via the page_id field in the X-Analytics header, but we can’t use this for all pageviews because APIs are more complicated (e.g. more than one page can be served in a single request, etc.). In the long term, there should be a pageview event stream just like rcstream! :)
-Ao
> On Aug 18, 2015, at 19:58, Oliver Keyes okeyes@wikimedia.org wrote:
>
> On 18 August 2015 at 19:11, Bernd Sitzmann bernd@wikimedia.org wrote:
>> This discussion is about needed updates of the definition and Analytics implementation for mobile apps page view metrics. There is also an associated Phab task[4]. Please add the proper Analytics project there.
>>
>> Background / Changes
>>
>> As you probably remember, the Android app splits a page view into two requests: one for the lead section and metadata, plus another one for the remainder.
>>
>> The mobile apps are going to change the way they load pages in two different ways:
>>
>> We'll add a link preview when someone clicks on a link from a page.
>> We're planning on switching over to using RESTBase for loading pages and also the link preview (initially just the Android beta, later more)
>>
>
> Woah woah woah woah woah. By RESTBase do you mean Gabriel's RESTful service API?
>
> Last time I checked that wasn't even consumed by HDFS. Is it now being consumed by HDFS?
>
> More importantly the actual URLs are going to look /totally/ different. If we do not include RESTBase requests, we will miss the apps. If we /do/ include RESTBase requests we will not only have to rewrite the pageview definition for the apps to recognise the new URL scheme, we will also potentially have to rewrite every /other/ bit of the definition to /not/ incorporate those requests.
>
> (I use "we" in a collective sense. This isn't my baby any more, although if Joseph et al want help with the refactor here I'm happy to spend my volunteer time on it).
>
> But basically every other bit of your email is important but now secondary: this is a potentially massive change, all on its own, even without the link preview, even if the substance of the requests going to RESTBase were identical.
>
>> This will have implications for the pageviews definition and how we count user engagement.
>>
>> The big question is
>>
>> Should we count link previews as a page view since it's an indication of user engagement? Or should there be a separate metric for link previews?
>>
>> Counting page views
>>
>> IIRC we currently count action=mobileview&sections=0 query parameters of api.php as a page view. When we publish link previews for all Android app users then we would either want to count also the calls to action=query&prop=extracts as a page view or add them to another metric.
>>
>> Once the apps use RESTBase the HTTPS requests will be very different:
>>
>> Page view: Instead of action=mobileview&sections=0 the app would call the RESTBase endpoint for lead request[1] instead of the PHP API mentioned above. Then it would call [2].
>> Link preview: Instead of action=query&prop=extracts it would call the lead request[1], too, since there is a lot of overlap. At least that's our current plan. The advantage of that is that the client doesn't need to execute the lead request a second time if the user clicks on the link preview (either through caching or app logic).
>>
>> So, in the RESTBase case we either want to count the mobile-html-sections-lead requests or the mobile-html-sections-remaining requests depending on what our definition for page views actually is. We could also add a query parameter or extra HTTP header to one of the mobile-html-sections-lead requests if we need to distinguish between previews and page views.
>>
>> Both the current PHP API and the RESTBase based metrics would need to be compatible and be collected in parallel since we cannot control when users update their apps.
>>
>> [1] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
>> [2] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Dil...
>> [3] https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_app...
>> [4] https://phabricator.wikimedia.org/T109383
>>
>> Cheers,
>>
>> Bernd
>>
>> _______________________________________________
>> Analytics mailing list
>> Analytics@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>
> --
> Oliver Keyes
> Count Logula
> Wikimedia Foundation
>
> _______________________________________________
> Analytics mailing list
> Analytics@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
-- Oliver Keyes Count Logula Wikimedia Foundation
-- Gabriel Wicke Principal Engineer, Wikimedia Foundation