On Tuesday, February 12, 2013, Diederik van Liere wrote:
It does still seem to me that the data to determine secondary API requests should already be present in the existing log line. If the value of the page param in an action=mobileview API request matches the page in the referrer (perhaps with normalization), it's a secondary request as per case 1 below. Otherwise, it's a pageview as per case 2. Difficult or expensive to reconcile? Not when you're doing distributed log analysis via Hadoop.
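The referrer check proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `/wiki/Title` and `?title=` referrer layouts, the normalization rule (spaces to underscores, first letter uppercased), and the sample URLs are all hypothetical and are not taken from the actual log-processing code. When the referrer carries no usable page title, the sketch returns "unknown" instead of guessing.

```python
from urllib.parse import urlsplit, parse_qs, unquote


def normalize_title(title):
    """Assumed normalization: spaces become underscores, first letter uppercased."""
    title = title.strip().replace(" ", "_")
    return title[:1].upper() + title[1:]


def referrer_title(referrer):
    """Extract a page title from a referrer URL, or return None if none is found."""
    parts = urlsplit(referrer)
    query = parse_qs(parts.query)
    if "title" in query:                     # assumed form: /w/index.php?title=Foo
        return normalize_title(query["title"][0])
    if parts.path.startswith("/wiki/"):      # assumed form: /wiki/Foo
        rest = unquote(parts.path[len("/wiki/"):])
        return normalize_title(rest) if rest else None
    return None


def classify_mobileview(url, referrer):
    """Label one logged request as secondary, pageview, unknown, or not_mobileview."""
    query = parse_qs(urlsplit(url).query)
    if query.get("action", [""])[0] != "mobileview" or "page" not in query:
        return "not_mobileview"
    ref = referrer_title(referrer) if referrer else None
    if ref is None:
        # No title recoverable from the referrer: cannot tell the two cases apart.
        return "unknown"
    requested = normalize_title(query["page"][0])
    # Same page as the referrer: lazy-load of remaining sections (case 1).
    # Different page: article loaded entirely via the API (case 2).
    return "secondary" if requested == ref else "pageview"
```

In a Hadoop job this would presumably run per log line in the map step, with the "unknown" bucket showing how many requests the referrer alone cannot resolve.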
So I did look into this prior to writing the RFC, and the issue is that a lot of API referrers don't contain the query string. I don't know what triggers this, so if we can fix it then we can definitely derive the secondary pageview request from the referrer field.
D
If you can point me to some examples, I'll see if I can find any insights into the behavior.
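One way to collect such examples is to scan parsed log records for mobileview API requests whose referrer has no query string. The sketch below is a rough illustration only: it assumes the log lines have already been parsed into hypothetical `(url, referrer)` pairs, and the function name is made up for this example.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs


def sample_queryless_referrers(records, limit=5):
    """Count mobileview API requests by whether the referrer has a query string.

    `records` is an iterable of (url, referrer) pairs (an assumed input shape).
    Returns the counts and up to `limit` example pairs lacking a query string.
    """
    counts = Counter()
    examples = []
    for url, referrer in records:
        query = parse_qs(urlsplit(url).query)
        if query.get("action", [""])[0] != "mobileview":
            continue  # only API requests for article content are of interest
        if urlsplit(referrer).query:
            counts["with_query"] += 1
        else:
            counts["without_query"] += 1
            if len(examples) < limit:
                examples.append((url, referrer))
    return counts, examples
```

The returned examples could then be compared by user agent or referrer path to look for a pattern in what strips the query string.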
On Mon, Feb 11, 2013 at 7:11 PM, Arthur Richards <arichards@wikimedia.org> wrote:
Thanks, Jon. To try and clarify a bit more about the API requests... they are not made on a per-section basis. As I mentioned earlier, there are two cases in which article content gets loaded by the API:
1) Going directly to a page (e.g. clicking a link from a Google search) will result in the backend serving a page with ONLY summary section content and section headers. The rest of the page is lazily loaded via API request once the JS for the page gets loaded. The idea is to increase responsiveness by reducing the delay for an article to load (further details in the article Jon previously linked to). The API request looks like:
http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&pa...
2) Loading an article entirely via JavaScript, like when a link is clicked in an article to another article, or an article is loaded via search. This will make ONE call to the API to load article content. The API request looks like:
http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&pa...
These API requests are identical, but only #2 should be counted as a 'pageview'. #1 is a secondary API request and should not be counted as a 'pageview'. You could make the argument that we just count all of these API requests as pageviews, but there are cases when we can't load article content from the API (like devices that do not support JS), so we need to be able to count the traditional page request as a pageview. Thus we need a way to differentiate the types of API requests being made when they otherwise share the same URL.
On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson <jdlrobson@gmail.com> wrote:
I'm a bit worried that now we are asking why pages are lazy loaded rather than focusing on the fact that they currently __are doing this__ and how we can log these (if we want to discuss this further let's start another thread as I'm getting extremely confused doing so on this one).
Lazy loading sections
################
The motivation behind moving MobileFrontend in the direction of lazy loading section content and subsequent pages can be found here [1]. I just gave it a refresh as it was a little out of date.
In summary the reason is to
1) make the app feel more responsive by simply loading content rather than reloading the entire interface
2) reduce the payload sent to a device.
Session Tracking
################
Going back to the discussion of tracking mobile page views, it sounds like a header stating whether a page is being viewed in alpha, beta or stable works fine for standard page views.
As for the situations where an entire page is loaded via the api it makes no dif