On 19 August 2015 at 12:29, Bernd Sitzmann <bernd@wikimedia.org> wrote:
> Andrew,
>
> Are you saying the apps have the option to skip providing one of page_title
> or page_id?
> I hope this is the case since I just came up with a scheme where we could
> avoid the second request when a page has only a single section, which we
> already get through the first (lead) request.
>
Yep; you'd need to provide one or the other, not both. We're actually
already looking for sections=0 precisely because of this (there are
two requests for one page), so including only the page_title there
should not break the continuity of the data.
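
For illustration, here is a minimal Python sketch of how a client might
build that header value. It assumes X-Analytics carries semicolon-separated
key=value pairs and that the keys are page_id and page_title, as Andrew
suggests further down. build_x_analytics and the percent-encoding of the
title are hypothetical choices of mine, not anything we have agreed on.

```python
from urllib.parse import quote


def build_x_analytics(page_id=None, page_title=None):
    """Return an X-Analytics header value tagging a request as a pageview.

    Hypothetical helper: accepts page_id, page_title, or both, and
    requires at least one of them.
    """
    if page_id is None and page_title is None:
        raise ValueError("need page_id or page_title")
    pairs = []
    if page_id is not None:
        pairs.append(("page_id", str(int(page_id))))
    if page_title is not None:
        # Assumption: percent-encode the title so that ';' and '=' inside
        # it cannot be mistaken for pair separators.
        pairs.append(("page_title", quote(page_title, safe="")))
    return ";".join(f"{key}={value}" for key, value in pairs)


# Title only, e.g. when the app was opened from an external link.
headers = {"X-Analytics": build_x_analytics(page_title="Dilbert")}
```

An app that only knows the title would send just the page_title pair, and
one that already knows the id could send page_id, or both.
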
> Yes to what Oliver said: The apps don't always know the page_id ahead of
> time (only sometimes). The best example where we don't know the page_id
> ahead of time is when someone searches for a term on Google search on an
> Android device, and gets directed to our Android app. The app only gets the
> URL of the page, which we then take to derive the wiki and page_title from.
>
> Bernd
>
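
A rough sketch of that derivation, assuming the usual /wiki/<title> path or
a title= query parameter. wiki_and_title is a hypothetical helper, and a
real app would likely need to handle more URL shapes than these two.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def wiki_and_title(url):
    """Derive (wiki host, page title) from a page URL.

    Hypothetical helper: only /wiki/<title> and ?title=<title> are handled.
    """
    parts = urlsplit(url)
    wiki = parts.hostname
    if parts.path.startswith("/wiki/"):
        # The title is percent-encoded in the path, so decode it.
        title = unquote(parts.path[len("/wiki/"):])
    else:
        # parse_qs already decodes percent-encoding in query values.
        title = parse_qs(parts.query).get("title", [None])[0]
    if not wiki or not title:
        raise ValueError(f"cannot derive wiki and title from {url!r}")
    return wiki, title


wiki, title = wiki_and_title("https://en.wikipedia.org/wiki/Dilbert")
```

Note that no page_id comes out of this, which is why the title has to be
an acceptable key on its own.
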
> On Wed, Aug 19, 2015 at 10:24 AM, Oliver Keyes <okeyes@wikimedia.org> wrote:
>>
>> It'll need to be; some requests don't know the pageID in advance, which I
>> think was the reason Apps initially didn't implement this.
>>
>> On 19 August 2015 at 12:19, Andrew Otto <aotto@wikimedia.org> wrote:
>> > If your app/site/etc. is creating a request that it wants to count as a
>> > pageview, add an X-Analytics header with pageview_id=<page_id> or
>> > pageview_title=<page_title>
>> >
>> >
>> > page_id is the current key, so let’s keep that. page_title would be good to
>> > have too. Let’s make it an and/or.
>> >
>> >
>> > On Aug 19, 2015, at 12:17, Bernd Sitzmann <bernd@wikimedia.org> wrote:
>> >
>> >> If your app/site/etc. is creating a request that it wants to count as a
>> >> pageview, add an X-Analytics header with pageview_id=<page_id> or
>> >> pageview_title=<page_title>
>> >
>> >
>> > Ideally the page id would be the way to go. From a client's perspective I
>> > prefer the page title since clients don't always know the page id ahead of
>> > time. (We could put that header into the second request of loading the page
>> > but I cannot guarantee that we will always have a second request in the
>> > future.)
>> >
>> > --Cheers,
>> > Bernd
>> >
>> > On Wed, Aug 19, 2015 at 8:53 AM, Dan Andreescu <dandreescu@wikimedia.org> wrote:
>> >>
>> >> This (making pageviews proactive) is a great idea, and we should follow
>> >> through. Here's a simple start:
>> >>
>> >> If your app/site/etc. is creating a request that it wants to count as a
>> >> pageview, add an X-Analytics header with pageview_id=<page_id> or
>> >> pageview_title=<page_title>
>> >>
>> >> If we can make this change uniformly, I think we'd be in a very good place.
>> >>
>> >> On Wed, Aug 19, 2015 at 10:23 AM, Oliver Keyes <okeyes@wikimedia.org> wrote:
>> >>>
>> >>> On 19 August 2015 at 10:19, Andrew Otto <aotto@wikimedia.org> wrote:
>> >>> >> If we /do/ include RESTBase requests we will not only have to
>> >>> >> rewrite the pageview definition for the apps to recognise the new
>> >>> >> URL scheme
>> >>> >
>> >>> > I really think that apps and APIs should do something proactive to tag
>> >>> > or log a pageview. With more ways of viewing content, it is going to get
>> >>> > harder and harder to maintain a pattern-based definition. A pageview
>> >>> > should be an event that is logged, not something that is pattern matched
>> >>> > out of a very noisy stream of data.
>> >>> >
>> >>> > Most MediaWiki requests do this now, via the page_id field in the
>> >>> > X-Analytics header, but we can’t use this for all pageviews because APIs
>> >>> > are more complicated (e.g. more than one page can be served in a single
>> >>> > request, etc.). In the long term, there should be a pageview event stream
>> >>> > just like rcstream! :)
>> >>>
>> >>> This is an excellent point. IIRC we'd been asking Apps to do this for
>> >>> kind of a while, so...
>> >>>
>> >>> >
>> >>> > -Ao
>> >>> >
>> >>> >
>> >>> >
>> >>> >> On Aug 18, 2015, at 19:58, Oliver Keyes <okeyes@wikimedia.org> wrote:
>> >>> >>
>> >>> >> On 18 August 2015 at 19:11, Bernd Sitzmann <bernd@wikimedia.org> wrote:
>> >>> >>> This discussion is about needed updates of the definition and Analytics
>> >>> >>> implementation for mobile apps page view metrics. There is also an
>> >>> >>> associated Phab task[4]. Please add the proper Analytics project there.
>> >>> >>>
>> >>> >>> Background / Changes
>> >>> >>>
>> >>> >>> As you probably remember, the Android app splits a page view into two
>> >>> >>> requests: one for the lead section and metadata, plus another one for
>> >>> >>> the remainder.
>> >>> >>>
>> >>> >>> The mobile apps are going to change the way they load pages in two
>> >>> >>> different ways:
>> >>> >>>
>> >>> >>> 1. We'll add a link preview when someone clicks on a link from a page.
>> >>> >>> 2. We're planning on switching over to using RESTBase for loading pages
>> >>> >>>    and also the link preview (initially just the Android beta, later
>> >>> >>>    more).
>> >>> >>>
>> >>> >>
>> >>> >> Woah woah woah woah woah. By RESTBase do you mean Gabriel's RESTful
>> >>> >> service API?
>> >>> >>
>> >>> >> Last time I checked that wasn't even consumed by HDFS. Is it now being
>> >>> >> consumed by HDFS?
>> >>> >>
>> >>> >> More importantly the actual URLs are going to look /totally/
>> >>> >> different. If we do not include RESTBase requests, we will miss the
>> >>> >> apps. If we /do/ include RESTBase requests we will not only have to
>> >>> >> rewrite the pageview definition for the apps to recognise the new URL
>> >>> >> scheme, we will also potentially have to rewrite every /other/ bit of
>> >>> >> the definition to /not/ incorporate those requests.
>> >>> >>
>> >>> >> (I use "we" in a collective sense. This isn't my baby any more,
>> >>> >> although if Joseph et al want help with the refactor here I'm happy to
>> >>> >> spend my volunteer time on it).
>> >>> >>
>> >>> >> But basically every other bit of your email is important but now
>> >>> >> secondary: this is a potentially massive change, all on its own, even
>> >>> >> without the link preview, even if the substance of the requests going
>> >>> >> to RESTBase were identical.
>> >>> >>
>> >>> >>> This will have implications for the pageviews definition and how we
>> >>> >>> count user engagement.
>> >>> >>>
>> >>> >>> The big question is:
>> >>> >>>
>> >>> >>> Should we count link previews as a page view since it's an indication
>> >>> >>> of user engagement? Or should there be a separate metric for link
>> >>> >>> previews?
>> >>> >>>
>> >>> >>> Counting page views
>> >>> >>>
>> >>> >>> IIRC we currently count action=mobileview&sections=0 query parameters
>> >>> >>> of api.php as a page view. When we publish link previews for all
>> >>> >>> Android app users then we would either want to count also the calls to
>> >>> >>> action=query&prop=extracts as a page view or add them to another
>> >>> >>> metric.
>> >>> >>>
>> >>> >>> Once the apps use RESTBase the HTTPS requests will be very different:
>> >>> >>>
>> >>> >>> Page view: Instead of action=mobileview&sections=0 on the PHP API
>> >>> >>> mentioned above, the app would call the RESTBase endpoint for the lead
>> >>> >>> request[1]. Then it would call [2].
>> >>> >>> Link preview: Instead of action=query&prop=extracts it would call the
>> >>> >>> lead request[1], too, since there is a lot of overlap. At least that's
>> >>> >>> our current plan. The advantage of that is that the client doesn't
>> >>> >>> need to execute the lead request a second time if the user clicks on
>> >>> >>> the link preview (either through caching or app logic).
>> >>> >>>
>> >>> >>> So, in the RESTBase case we either want to count the
>> >>> >>> mobile-html-sections-lead requests or the
>> >>> >>> mobile-html-sections-remaining requests depending on what our
>> >>> >>> definition for page views actually is. We could also add a query
>> >>> >>> parameter or extra HTTP header to one of the mobile-html-sections-lead
>> >>> >>> requests if we need to distinguish between previews and page views.
>> >>> >>>
>> >>> >>> Both the current PHP API and the RESTBase based metrics would need to
>> >>> >>> be compatible and be collected in parallel since we cannot control
>> >>> >>> when users update their apps.
>> >>> >>>
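
To make the "in parallel" part concrete, here is a small Python sketch of a
pattern-based classifier that accepts both the api.php and the RESTBase URL
shapes described above. It is not the actual pageview definition:
classify_app_request and the preview_hint flag (standing in for the extra
query parameter or header Bernd mentions) are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

RESTBASE_LEAD_PREFIX = "/api/rest_v1/page/mobile-html-sections-lead/"


def classify_app_request(url, preview_hint=False):
    """Return "pageview", "preview", or None for an app request URL."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    if parts.path.endswith("/api.php"):
        action = query.get("action", [""])[0]
        # Old scheme: the lead-section request counts as the page view.
        if action == "mobileview" and query.get("sections", [""])[0] == "0":
            return "pageview"
        # Old scheme: link previews come from prop=extracts.
        props = query.get("prop", [""])[0].split("|")
        if action == "query" and "extracts" in props:
            return "preview"
        return None
    if parts.path.startswith(RESTBASE_LEAD_PREFIX):
        # New scheme: the lead request serves both cases, so the URL alone
        # cannot tell them apart; some extra signal is needed.
        return "preview" if preview_hint else "pageview"
    # Everything else, including mobile-html-sections-remaining, is ignored.
    return None
```

The RESTBase branch is exactly where URL matching runs out: without a
signal from the client, a preview and a page view look the same.
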
>> >>> >>> [1] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
>> >>> >>> [2] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Dilbert
>> >>> >>> [3] https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_apps
>> >>> >>> [4] https://phabricator.wikimedia.org/T109383
>> >>> >>>
>> >>> >>>
>> >>> >>> Cheers,
>> >>> >>>
>> >>> >>> Bernd
>> >>> >>>
>> >>> >>>
>> >>> >>> _______________________________________________
>> >>> >>> Analytics mailing list
>> >>> >>> Analytics@lists.wikimedia.org
>> >>> >>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >>> >>>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> Oliver Keyes
>> >>> >> Count Logula
>> >>> >> Wikimedia Foundation
>> >>>
>> >>> --
>> >>> Oliver Keyes
>> >>> Count Logula
>> >>> Wikimedia Foundation
>>
>> --
>> Oliver Keyes
>> Count Logula
>> Wikimedia Foundation
>
--
Oliver Keyes
Count Logula
Wikimedia Foundation
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics