This (making pageviews proactive) is a great idea, and we should follow through.  Here's a simple start:

If your app/site/etc. is making a request that it wants counted as a pageview, add an X-Analytics header with pageview_id=<page_id> or pageview_title=<page_title>.
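
To make that concrete, here's a rough client-side sketch in Python. It's only an illustration: the helper name tag_pageview is made up, and I'm assuming the header value is a plain key=value pair and that titles should be percent-encoded so they can't break the header. Nothing is actually sent.

```python
import urllib.parse
import urllib.request


def tag_pageview(url, page_id=None, page_title=None):
    """Build a request that declares itself a pageview via X-Analytics.

    Hypothetical helper: the pageview_id / pageview_title keys follow the
    proposal above; the key=value layout and the percent-encoding of the
    title are assumptions, not an agreed format.
    """
    if page_id is None and page_title is None:
        raise ValueError("need page_id or page_title to tag a pageview")
    if page_id is not None:
        value = f"pageview_id={page_id}"
    else:
        value = "pageview_title=" + urllib.parse.quote(page_title, safe="")
    request = urllib.request.Request(url)
    request.add_header("X-Analytics", value)
    return request


# Tag the lead-section request mentioned further down the thread.
req = tag_pageview(
    "https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert",
    page_title="Dilbert",
)
# urllib stores header names capitalized as "X-analytics"; HTTP header
# names are case-insensitive, so this is the same header on the wire.
print(req.get_header("X-analytics"))
```

A link preview would simply not call the helper, so the same endpoint can serve both previews and real pageviews without the URL having to differ.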

If we can make this change uniformly, I think we'd be in a very good place.
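
On the counting side, the definition could then shrink to "did the request tag itself?" instead of URL pattern matching. A minimal sketch, again in Python and again with assumptions: I'm treating X-Analytics as semicolon-separated key=value pairs, the function names are hypothetical, and the sample header values (including the page ID) are invented for illustration.

```python
from collections import Counter


def parse_x_analytics(header):
    """Split an X-Analytics value into a dict, assuming a 'k=v;k=v' layout."""
    fields = {}
    for part in header.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key:
            fields[key] = value
    return fields


def count_pageviews(headers):
    """Count requests that tagged themselves as pageviews, keyed by page."""
    counts = Counter()
    for header in headers:
        fields = parse_x_analytics(header)
        # Either proposed key marks the request as a pageview.
        page = fields.get("pageview_id") or fields.get("pageview_title")
        if page:
            counts[page] += 1
    return counts


# Invented sample values, one X-Analytics header per request.
sample = [
    "pageview_title=Dilbert",
    "https=1;pageview_id=12345",
    "https=1",  # untagged request (e.g. a link preview): not counted
    "pageview_title=Dilbert",
]
print(count_pageviews(sample))
```

A real job would need to reconcile IDs and titles for the same page; this just shows that the filter no longer cares which URL scheme the request used.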

On Wed, Aug 19, 2015 at 10:23 AM, Oliver Keyes <okeyes@wikimedia.org> wrote:
On 19 August 2015 at 10:19, Andrew Otto <aotto@wikimedia.org> wrote:
>>  If we /do/ include RESTBase requests we will not only have to
>> rewrite the pageview definition for the apps to recognise the new URL
>> scheme
>
> I really think that apps and APIs should do something proactive to tag or log a pageview.  With more ways of viewing content, it is going to get harder and harder to maintain a pattern-based definition.  A pageview should be an event that is logged, not something that is pattern matched out of a very noisy stream of data.
>
> Most MediaWiki requests do this now, via the page_id field in the X-Analytics header, but we can’t use this for all pageviews because APIs are more complicated (e.g. more than one page can be served in a single request, etc.).  In the long term, there should be a pageview event stream just like rcstream! :)

This is an excellent point. IIRC we'd been asking Apps to do this for
kind of a while, so...

>
> -Ao
>
>
>
>> On Aug 18, 2015, at 19:58, Oliver Keyes <okeyes@wikimedia.org> wrote:
>>
>> On 18 August 2015 at 19:11, Bernd Sitzmann <bernd@wikimedia.org> wrote:
>>> This discussion is about needed updates of the definition and Analytics
>>> implementation for mobile apps page view metrics. There is also an
>>> associated Phab task[4]. Please add the proper Analytics project there.
>>>
>>> Background / Changes
>>>
>>> As you probably remember, the Android app splits a page view into two
>>> requests: one for the lead section and metadata, plus another one for the
>>> remainder.
>>>
>>> The mobile apps are going to change the way they load pages in two different
>>> ways:
>>>
>>> We'll add a link preview when someone clicks on a link from a page.
>>> We're planning on switching over to using RESTBase for loading pages and
>>> also the link preview (initially just the Android beta, later more)
>>>
>>
>> Woah woah woah woah woah. By RESTBase do you mean Gabriel's RESTful service API?
>>
>> Last time I checked that wasn't even consumed by HDFS. Is it now being
>> consumed by HDFS?
>>
>> More importantly the actual URLs are going to look /totally/
>> different. If we do not include RESTBase requests, we will miss the
>> apps. If we /do/ include RESTBase requests we will not only have to
>> rewrite the pageview definition for the apps to recognise the new URL
>> scheme, we will also potentially have to rewrite every /other/ bit of
>> the definition to /not/ incorporate those requests.
>>
>> (I use "we" in a collective sense. This isn't my baby any more,
>> although if Joseph et al want help with the refactor here I'm happy to
>> spend my volunteer time on it).
>>
>> But basically every other bit of your email is important but now
>> secondary: this is a potentially massive change, all on its own, even
>> without the link preview, even if the substance of the requests going
>> to RESTBase were identical.
>>
>>> This will have implications for the pageviews definition and how we count
>>> user engagement.
>>>
>>> The big question is
>>>
>>> Should we count link previews as a page view since it's an indication of
>>> user engagement? Or should there be a separate metric for link previews?
>>>
>>> Counting page views
>>>
>>> IIRC we currently count action=mobileview&sections=0 query parameters of
>>> api.php as a page view. When we publish link previews for all Android app
>>> users then we would either want to count also the calls to
>>> action=query&prop=extracts as a page view or add them to another metric.
>>>
>>> Once the apps use RESTBase the HTTPS requests will be very different:
>>>
>>> Page view: Instead of action=mobileview&sections=0 the app would call the
>>> RESTBase endpoint for lead request[1] instead of the PHP API mentioned
>>> above. Then it would call [2].
>>> Link preview: Instead of action=query&prop=extracts it would call the lead
>>> request[1], too, since there is a lot of overlap. At least that's our current
>>> plan. The advantage of that is that the client doesn't need to execute the
>>> lead request a second time if the user clicks on the link preview (-- either
>>> through caching or app logic.)
>>>
>>> So, in the RESTBase case we either want to count the
>>> mobile-html-sections-lead requests or the mobile-html-sections-remaining
>>> requests depending on what our definition for page views actually is. We
>>> could also add a query parameter or extra HTTP header to one of the
>>> mobile-html-sections-lead requests if we need to distinguish between
>>> previews and page views.
>>>
>>> Both the current PHP API and the RESTBase based metrics would need to be
>>> compatible and be collected in parallel since we cannot control when users
>>> update their apps.
>>>
>>> [1]
>>> https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
>>> [2]
>>> https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Dilbert
>>> [3]
>>> https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_apps
>>>
>>> [4] https://phabricator.wikimedia.org/T109383
>>>
>>>
>>> Cheers,
>>>
>>> Bernd
>>>
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> Analytics@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>>
>>
>>
>> --
>> Oliver Keyes
>> Count Logula
>> Wikimedia Foundation
>>



--
Oliver Keyes
Count Logula
Wikimedia Foundation
