If we /do/ include RESTBase requests we will not only have to rewrite the
pageview definition for the apps to recognise the new URL scheme
I really think that apps and APIs should do something proactive to tag or
log a pageview. With more ways of viewing content, it is going to get
harder and harder to maintain a pattern based definition. A pageview should
be an event that is logged, not something that is pattern matched out of a
very noisy stream of data.
Most MediaWiki requests do this now, via the page_id field in the
X-Analytics header, but we can’t use this for all pageviews because APIs
are more complicated (e.g. more than one page can be served in a single
request). In the long term, there should be a pageview event stream
just like rcstream! :)
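
To illustrate the tagging idea, here is a minimal Python sketch of reading the page_id tag from an X-Analytics value. It assumes the header is a semicolon-separated list of key=value pairs; the function names are hypothetical and not taken from any existing pipeline.

```python
def parse_x_analytics(header):
    """Split an X-Analytics value into a dict.

    Assumption: fields look like "key=value" and are separated by ";".
    """
    fields = {}
    for part in header.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key:  # skip empty or malformed fragments
            fields[key] = value
    return fields


def is_tagged_pageview(header):
    """Treat a request as a pageview only if it carries a numeric page_id."""
    return parse_x_analytics(header).get("page_id", "").isdigit()
```

With this approach the server decides what a pageview is at request time, and the log consumer only has to check for the tag instead of matching URL patterns.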
-Ao
On Aug 18, 2015, at 19:58, Oliver Keyes <okeyes(a)wikimedia.org> wrote:
On 18 August 2015 at 19:11, Bernd Sitzmann <bernd(a)wikimedia.org> wrote:
This discussion is about needed updates of the definition and Analytics
implementation for mobile apps page view metrics. There is also an
associated Phab task[4]. Please add the proper Analytics project there.
Background / Changes
As you probably remember, the Android app splits a page view into two
requests: one for the lead section and metadata, plus another one for the
remainder.
The mobile apps are going to change the way they load pages in two
different ways:
We'll add a link preview when someone clicks on a link from a page.
We're planning on switching over to using RESTBase for loading pages and
also the link preview (initially just the Android beta, later more).
Woah woah woah woah woah. By RESTBase do you mean Gabriel's RESTful
service API?
Last time I checked that wasn't even consumed by HDFS. Is it now being
consumed by HDFS?
More importantly the actual URLs are going to look /totally/
different. If we do not include RESTBase requests, we will miss the
apps. If we /do/ include RESTBase requests we will not only have to
rewrite the pageview definition for the apps to recognise the new URL
scheme, we will also potentially have to rewrite every /other/ bit of
the definition to /not/ incorporate those requests.
(I use "we" in a collective sense. This isn't my baby any more,
although if Joseph et al want help with the refactor here I'm happy to
spend my volunteer time on it).
But basically every other bit of your email is important but now
secondary: this is a potentially massive change, all on its own, even
without the link preview, even if the substance of the requests going
to RESTBase were identical.
This will have implications for the pageviews definition and how we count
user engagement.
The big question is:
Should we count link previews as a page view since it's an indication of
user engagement? Or should there be a separate metric for link previews?
Counting page views
IIRC we currently count action=mobileview&sections=0 query parameters of
api.php as a page view. When we publish link previews for all Android app
users then we would either want to count also the calls to
action=query&prop=extracts as a page view or add them to another metric.
Once the apps use RESTBase the HTTPS requests will be very different:
Page view: Instead of action=mobileview&sections=0 on the PHP API mentioned
above, the app would call the RESTBase endpoint for the lead request[1].
Then it would call [2].
Link preview: Instead of action=query&prop=extracts it would call the lead
request[1], too, since there is a lot of overlap. At least that's our
current plan. The advantage of that is that the client doesn't need to
execute the lead request a second time if the user clicks on the link
preview (either through caching or app logic).
So, in the RESTBase case we either want to count the
mobile-html-sections-lead requests or the mobile-html-sections-remaining
requests depending on what our definition for page views actually is. We
could also add a query parameter or extra HTTP header to one of the
mobile-html-sections-lead requests if we need to distinguish between
previews and page views.
Both the current PHP API and the RESTBase-based metrics would need to be
compatible and be collected in parallel since we cannot control when users
update their apps.
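
As a rough illustration of counting both schemes in parallel, the Python sketch below classifies a request URL under either the PHP API or the RESTBase endpoints. It is not the actual pageview definition: the function name, the returned labels, and the is_preview argument (standing in for the possible query parameter or HTTP header mentioned above) are all hypothetical, and it arbitrarily counts the lead request rather than the remaining request.

```python
from urllib.parse import urlsplit, parse_qs

REST_PREFIX = "/api/rest_v1/page/"


def classify(url, is_preview=False):
    """Return "pageview", "preview" or "other" for an app request URL.

    is_preview is a hypothetical flag; RESTBase lead requests look the
    same for previews and page views unless the client marks them.
    """
    parts = urlsplit(url)
    if parts.path.endswith("/api.php"):
        query = parse_qs(parts.query)
        action = query.get("action", [""])[0]
        # Current PHP API: lead section request counts as a page view.
        if action == "mobileview" and query.get("sections", [""])[0] == "0":
            return "pageview"
        # Current PHP API: extracts request is a link preview.
        if action == "query" and "extracts" in query.get("prop", [""])[0].split("|"):
            return "preview"
        return "other"
    if parts.path.startswith(REST_PREFIX):
        endpoint = parts.path[len(REST_PREFIX):].split("/", 1)[0]
        if endpoint == "mobile-html-sections-lead":
            return "preview" if is_preview else "pageview"
    return "other"
```

Under these assumptions the mobile-html-sections-remaining request is never counted, so a page loaded through RESTBase produces exactly one page view.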
[1] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
[2] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Di…
[3] https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_ap…
[4] https://phabricator.wikimedia.org/T109383
Cheers,
Bernd
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Oliver Keyes
Count Logula
Wikimedia Foundation