RESTBase is behind regular text Varnishes, so I suspect that the logs might already end up in HDFS. All entry points start with /api/rest_v1/, which shouldn't overlap with potential page views counted at /w/api.php or /wiki/.
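
To illustrate why the prefixes shouldn't collide, here is a rough Python sketch of a path-prefix check. The function name and the labels are made up for illustration; this is not the actual pageview definition code, only the three prefixes come from the discussion above.

```python
from urllib.parse import urlsplit

# Prefix shared by all RESTBase entry points, as stated above.
REST_PREFIX = "/api/rest_v1/"


def classify_path(url):
    """Return a coarse, illustrative label based on the URL's path prefix."""
    path = urlsplit(url).path
    if path.startswith(REST_PREFIX):
        return "restbase"
    if path == "/w/api.php":
        return "php_api"
    if path.startswith("/wiki/"):
        return "wiki"
    return "other"
```

Since the three prefixes are disjoint, a request can only ever land in one of these buckets.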


On Tue, Aug 18, 2015 at 6:32 PM, Kevin Leduc <kevin@wikimedia.org> wrote:
We briefly considered counting views of Hover Cards as Pageviews, but the idea was quickly dismissed. First, the feature is not widely used enough to justify changing the pageview definition.

I'm still open to counting previews as pageviews, but I think the Readership team and their product managers need to weigh in heavily as Pageviews is a key metric for them.

Finally, counting Pageviews served through RESTBase sounds like a new project and I'd like to hear more about the effort needed from the analytics engineers.


This reminds me... right now we don't allow the Varnishes to cache any content, but we plan to start allowing this soon. At that point, internal RESTBase metrics like http://grafana.wikimedia.org/#/dashboard/db/restbase?panelId=8&fullscreen will only show the cache misses. For our purposes it would be super useful to keep track of total requests matching /api/rest_v1/. This will let us track overall API usage, which is going to be our primary KPI for now. I have created a ticket for this at https://phabricator.wikimedia.org/T109547.
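
As a sketch of the kind of count I have in mind (Python, purely illustrative): the record shape, a (path, cache_status) pair with "hit" / "miss" values, is an assumption for the example and not the actual log schema.

```python
from collections import Counter


def count_rest_requests(records):
    """Count requests under /api/rest_v1/, in total and per cache status.

    `records` is an iterable of (path, cache_status) pairs. This shape is
    hypothetical and only serves the example.
    """
    counts = Counter()
    for path, cache_status in records:
        if path.startswith("/api/rest_v1/"):
            counts["total"] += 1
            counts[cache_status] += 1
    return counts
```

Counting at the Varnish level gives the total including cache hits, whereas the internal RESTBase metrics would correspond to the "miss" bucket only.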

Thanks!

Gabriel

 


On Tue, Aug 18, 2015 at 4:58 PM, Oliver Keyes <okeyes@wikimedia.org> wrote:
On 18 August 2015 at 19:11, Bernd Sitzmann <bernd@wikimedia.org> wrote:
> This discussion is about needed updates of the definition and Analytics
> implementation for mobile apps page view metrics. There is also an
> associated Phab task[4]. Please add the proper Analytics project there.
>
> Background / Changes
>
> As you probably remember, the Android app splits a page view into two
> requests: one for the lead section and metadata, plus another one for the
> remainder.
>
> The mobile apps are going to change the way they load pages in two different
> ways:
>
> We'll add a link preview when someone clicks on a link from a page.
> We're planning on switching over to using RESTBase for loading pages and
> also the link preview (initially just the Android beta, later more)
>

Woah woah woah woah woah. By RESTBase do you mean Gabriel's RESTful service API?

Last time I checked that wasn't even consumed by HDFS. Is it now being
consumed by HDFS?

More importantly the actual URLs are going to look /totally/
different. If we do not include RESTBase requests, we will miss the
apps. If we /do/ include RESTBase requests we will not only have to
rewrite the pageview definition for the apps to recognise the new URL
scheme, we will also potentially have to rewrite every /other/ bit of
the definition to /not/ incorporate those requests.

(I use "we" in a collective sense. This isn't my baby any more,
although if Joseph et al want help with the refactor here I'm happy to
spend my volunteer time on it).

But basically every other bit of your email is important but now
secondary: this is a potentially massive change, all on its own, even
without the link preview, even if the substance of the requests going
to RESTBase were identical.

> This will have implications for the pageviews definition and how we count
> user engagement.
>
> The big question is
>
> Should we count link previews as a page view since it's an indication of
> user engagement? Or should there be a separate metric for link previews?
>
> Counting page views
>
> IIRC we currently count action=mobileview&sections=0 query parameters of
> api.php as a page view. When we publish link previews for all Android app
> users then we would either want to count also the calls to
> action=query&prop=extracts as a page view or add them to another metric.
>
> Once the apps use RESTBase the HTTPS requests will be very different:
>
> Page view: Instead of action=mobileview&sections=0 the app would call the
> RESTBase endpoint for lead request[1] instead of the PHP API mentioned
> above. Then it would call [2].
> Link preview: Instead of action=query&prop=extracts it would call the lead
> request[1], too, since there is a lot of overlap. At least that's our current
> plan. The advantage of that is that the client doesn't need to execute the
> lead request a second time if the user clicks on the link preview (-- either
> through caching or app logic.)
>
> So, in the RESTBase case we either want to count the
> mobile-html-sections-lead requests or the mobile-html-sections-remaining
> requests depending on what our definition for page views actually is. We
> could also add a query parameter or extra HTTP header to one of the
> mobile-html-sections-lead requests if we need to distinguish between
> previews and page views.
>
> Both the current PHP API and the RESTBase based metrics would need to be
> compatible and be collected in parallel since we cannot control when users
> update their apps.
>
> [1]
> https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
> [2]
> https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Dilbert
> [3]
> https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_apps
>
> [4] https://phabricator.wikimedia.org/T109383
>
>
> Cheers,
>
> Bernd
>
>
> _______________________________________________
> Analytics mailing list
> Analytics@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>



--
Oliver Keyes
Count Logula
Wikimedia Foundation





--
Gabriel Wicke
Principal Engineer, Wikimedia Foundation