Good, so page_id if you have it, page_title if not. We'll work around the foundation (I'll talk about this at Scrum of Scrums in an hour) until everyone respects this convention, and then we'll change the pageview definition to use it. Sounds like the beginning of a beautiful thing :)
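For anyone implementing this on the client side, here is a minimal sketch of the "page_id if you have it, page_title if not" rule. The function name is hypothetical, and the thread does not say how titles containing `;` or `=` should be escaped, so the sketch does not attempt it.

```python
from typing import Optional


def x_analytics_pageview_tag(page_id: Optional[int] = None,
                             page_title: Optional[str] = None) -> str:
    """Build an X-Analytics value that tags a request as a pageview.

    Prefers page_id when it is known and falls back to page_title.
    Escaping of special characters in titles is NOT handled here,
    because the thread does not define it.
    """
    if page_id is not None:
        return f"page_id={page_id}"
    if page_title:
        return f"page_title={page_title}"
    raise ValueError("need page_id or page_title to tag a pageview")


# A client that only knows the title (e.g. it was opened from a URL):
headers = {"X-Analytics": x_analytics_pageview_tag(page_title="Dilbert")}
```

A client that already knows the id would call `x_analytics_pageview_tag(page_id=...)` instead; when both are passed, the id wins.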
On Wed, Aug 19, 2015 at 12:33 PM, Oliver Keyes okeyes@wikimedia.org wrote:
On 19 August 2015 at 12:29, Bernd Sitzmann bernd@wikimedia.org wrote:
Andrew,
Are you saying the apps have the option to skip providing one of page_title or page_id? I hope this is the case since I just came up with a scheme where we could avoid the second request when a page has only a single section, which we already get through the first (lead) request.
Yep; you'd need to provide one or the other, not both. We're actually already looking for sections=0 due precisely to this (that there are two requests for one page) so only including the page_title there should not mess with the continuity of data.
Yes to what Oliver said: The apps don't always know the page_id ahead of time (only sometimes). The best example where we don't know the page_id ahead of time is when someone searches for a term on Google search on an Android device, and gets directed to our Android app. The app only gets the URL of the page, which we then take to derive the wiki and page_title from.
Bernd
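To illustrate the URL case Bernd describes, here is a rough sketch of deriving the wiki and page_title from an incoming link. It assumes the usual `/wiki/<title>` path layout; the function name is made up for this example and this is not the app's actual code.

```python
from typing import Tuple
from urllib.parse import unquote, urlsplit


def wiki_and_title(url: str) -> Tuple[str, str]:
    """Return (wiki host, page title) for a /wiki/<title> style URL."""
    parts = urlsplit(url)
    prefix = "/wiki/"
    if parts.hostname is None or not parts.path.startswith(prefix):
        raise ValueError(f"not a recognised article URL: {url!r}")
    # The title is percent-encoded in the path; decode it for use as page_title.
    title = unquote(parts.path[len(prefix):])
    if not title:
        raise ValueError(f"no page title in URL: {url!r}")
    return parts.hostname, title


wiki, title = wiki_and_title("https://en.wikipedia.org/wiki/Dilbert")
```

Note that no page_id falls out of this: only the title is available at this point, which is why the header needs to accept page_title as an alternative.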
On Wed, Aug 19, 2015 at 10:24 AM, Oliver Keyes okeyes@wikimedia.org wrote:
It'll need to be, some requests don't know pageID in advance, which I think was the reason Apps initially didn't implement this.
On 19 August 2015 at 12:19, Andrew Otto aotto@wikimedia.org wrote:
If your app/site/etc. is creating a request that it wants to count as a pageview, add an X-Analytics header with pageview_id=<page_id> or pageview_title=<page_title>
page_id is the current key, so let’s keep that. page_title would be good to have too. Let’s make it an and/or.
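On the counting side, the "and/or" could look roughly like the sketch below. It assumes the X-Analytics value is a `;`-separated list of `key=value` pairs; the function names and the sample values in the usage lines are invented for illustration.

```python
from typing import Dict


def parse_x_analytics(value: str) -> Dict[str, str]:
    """Split an X-Analytics value into a dict (assumes 'k=v;k=v' format)."""
    fields: Dict[str, str] = {}
    for pair in value.split(";"):
        key, sep, val = pair.partition("=")
        if sep:  # skip fragments that are not key=value
            fields[key.strip()] = val.strip()
    return fields


def is_tagged_pageview(value: str) -> bool:
    """True if the request carries page_id and/or page_title."""
    fields = parse_x_analytics(value)
    return "page_id" in fields or "page_title" in fields


tagged = is_tagged_pageview("page_id=1234;https=1")    # id present
untagged = is_tagged_pageview("https=1")               # neither key present
```

Either key alone is enough to count the request, so clients that only know the title are still covered.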
On Aug 19, 2015, at 12:17, Bernd Sitzmann bernd@wikimedia.org wrote:
If your app/site/etc. is creating a request that it wants to count as a pageview, add an X-Analytics header with pageview_id=<page_id> or pageview_title=<page_title>
Ideally the page id would be the way to go. From a client's perspective I prefer the page title since clients don't always know the page id ahead of time. (We could put that header into the second request of loading the page but I cannot guarantee that we will always have a second request in the future.)
--Cheers, Bernd
On Wed, Aug 19, 2015 at 8:53 AM, Dan Andreescu dandreescu@wikimedia.org wrote:
This (making pageviews proactive) is a great idea, and we should follow through. Here's a simple start:
If your app/site/etc. is creating a request that it wants to count as a pageview, add an X-Analytics header with pageview_id=<page_id> or pageview_title=<page_title>
If we can make this change uniformly, I think we'd be in a very good place.
On Wed, Aug 19, 2015 at 10:23 AM, Oliver Keyes okeyes@wikimedia.org wrote:
On 19 August 2015 at 10:19, Andrew Otto aotto@wikimedia.org wrote:
>> If we /do/ include RESTBase requests we will not only have to rewrite the pageview definition for the apps to recognise the new URL scheme
>
> I really think that apps and APIs should do something proactive to tag or log a pageview. With more ways of viewing content, it is going to get harder and harder to maintain a pattern based definition. A pageview should be an event that is logged, not something that is pattern matched out of a very noisy stream of data.
>
> Most mediawiki requests do this now, via the page_id field in the X-Analytics header, but we can’t use this for all pageviews because APIs are more complicated (e.g. more than one page can be served in a single request, etc.). In the long term, there should be a pageview event stream just like rcstream! :)
This is an excellent point. IIRC we'd been asking Apps to do this for kind of a while, so...
> -Ao
>
>> On Aug 18, 2015, at 19:58, Oliver Keyes okeyes@wikimedia.org wrote:
>>
>> On 18 August 2015 at 19:11, Bernd Sitzmann bernd@wikimedia.org wrote:
>>> This discussion is about needed updates of the definition and Analytics implementation for mobile apps page view metrics. There is also an associated Phab task[4]. Please add the proper Analytics project there.
>>>
>>> Background / Changes
>>>
>>> As you probably remember, the Android app splits a page view into two requests: one for the lead section and metadata, plus another one for the remainder.
>>>
>>> The mobile apps are going to change the way they load pages in two different ways:
>>>
>>> We'll add a link preview when someone clicks on a link from a page.
>>> We're planning on switching over to using RESTBase for loading pages and also the link preview (initially just the Android beta, later more)
>>>
>>
>> Woah woah woah woah woah. By RESTBase do you mean Gabriel's RESTful service API?
>>
>> Last time I checked that wasn't even consumed by HDFS. Is it now being consumed by HDFS?
>>
>> More importantly the actual URLs are going to look /totally/ different. If we do not include RESTBase requests, we will miss the apps. If we /do/ include RESTBase requests we will not only have to rewrite the pageview definition for the apps to recognise the new URL scheme, we will also potentially have to rewrite every /other/ bit of the definition to /not/ incorporate those requests.
>>
>> (I use "we" in a collective sense. This isn't my baby any more, although if Joseph et al want help with the refactor here I'm happy to spend my volunteer time on it).
>>
>> But basically every other bit of your email is important but now secondary: this is a potentially massive change, all on its own, even without the link preview, even if the substance of the requests going to RESTBase were identical.
>>
>>> This will have implications for the pageviews definition and how we count user engagement.
>>>
>>> The big question is
>>>
>>> Should we count link previews as a page view since it's an indication of user engagement? Or should there be a separate metric for link previews?
>>>
>>> Counting page views
>>>
>>> IIRC we currently count action=mobileview&sections=0 query parameters of api.php as a page view. When we publish link previews for all Android app users then we would either want to count also the calls to action=query&prop=extracts as a page view or add them to another metric.
>>>
>>> Once the apps use RESTBase the HTTPS requests will be very different:
>>>
>>> Page view: Instead of action=mobileview&sections=0 the app would call the RESTBase endpoint for lead request[1] instead of the PHP API mentioned above. Then it would call [2].
>>> Link preview: Instead of action=query&prop=extracts it would call the lead request[1], too, since there is a lot of overlap. At least that's our current plan. The advantage of that is that the client doesn't need to execute the lead request a second time if the user clicks on the link preview (-- either through caching or app logic.)
>>>
>>> So, in the RESTBase case we either want to count the mobile-html-sections-lead requests or the mobile-html-sections-remaining requests depending on what our definition for page views actually is. We could also add a query parameter or extra HTTP header to one of the mobile-html-sections-lead requests if we need to distinguish between previews and page views.
>>>
>>> Both the current PHP API and the RESTBase based metrics would need to be compatible and be collected in parallel since we cannot control when users update their apps.
>>>
>>> [1] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
>>> [2] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Dil...
>>> [3] https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_app...
>>> [4] https://phabricator.wikimedia.org/T109383
>>>
>>> Cheers,
>>>
>>> Bernd
>>
>> --
>> Oliver Keyes
>> Count Logula
>> Wikimedia Foundation
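For comparison, here is a sketch of what the pattern-based approach from Bernd's original message amounts to once both URL schemes have to be recognised in parallel. It is only an illustration, not the actual pageview definition: the function name is hypothetical, and counting the lead request (rather than the remaining-sections request) is just one of the two options raised above.

```python
from urllib.parse import parse_qs, urlsplit

# Path prefix taken from the lead-request URL [1] quoted above.
LEAD_PREFIX = "/api/rest_v1/page/mobile-html-sections-lead/"


def looks_like_app_pageview(url: str) -> bool:
    """Guess from the URL alone whether a request is an app page view."""
    parts = urlsplit(url)
    if parts.path.endswith("/api.php"):
        # Current PHP API: action=mobileview with sections=0.
        query = parse_qs(parts.query)
        return (query.get("action") == ["mobileview"]
                and query.get("sections") == ["0"])
    # RESTBase: a lead request for some page title. Link previews would hit
    # the same endpoint, so this pattern alone cannot tell them apart.
    return parts.path.startswith(LEAD_PREFIX) and len(parts.path) > len(LEAD_PREFIX)


old_style = looks_like_app_pageview(
    "https://en.wikipedia.org/w/api.php?action=mobileview&page=Dilbert&sections=0")
new_style = looks_like_app_pageview(
    "https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert")
```

The comment on the RESTBase branch is the crux: without an extra header or parameter, a preview and a real page view look identical to this kind of matcher.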
-- Oliver Keyes Count Logula Wikimedia Foundation
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics