It does still seem to me that the data to determine secondary API
requests should already be present in the existing log line. If the
value of the page param in an action=mobileview API request matches the
page in the referrer (perhaps with normalization), it's a secondary
request as per case 1 below. Otherwise, it's a pageview as per case 2.
Difficult or expensive to reconcile? Not when you're doing distributed
log analysis via Hadoop.
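A minimal sketch of that comparison, for illustration only. It assumes the referrer is an article URL of the form /wiki/Title or carries a title= query parameter, and that normalization means treating spaces and underscores alike and capitalizing the first letter. The function names and the "unknown" outcome for referrers that carry no usable title are hypothetical, not part of any existing log pipeline.

```python
from urllib.parse import urlparse, parse_qs, unquote


def normalize_title(title):
    # Assumed normalization: spaces become underscores, first letter upper-cased.
    title = title.replace(" ", "_").strip("_")
    return title[:1].upper() + title[1:]


def referrer_title(referrer):
    # Assumes article referrers look like /wiki/Title or ...?title=Title.
    parts = urlparse(referrer)
    if parts.path.startswith("/wiki/"):
        return normalize_title(unquote(parts.path[len("/wiki/"):])) or None
    titles = parse_qs(parts.query).get("title")
    return normalize_title(titles[0]) if titles else None


def classify(request_url, referrer):
    # Returns "secondary", "pageview", "unknown", or "other".
    query = parse_qs(urlparse(request_url).query)
    if query.get("action") != ["mobileview"]:
        return "other"
    page = query.get("page")
    ref = referrer_title(referrer) if referrer else None
    if not page or ref is None:
        # Without a usable title in the referrer the two cases cannot be told apart.
        return "unknown"
    if normalize_title(page[0]) == ref:
        return "secondary"  # case 1: lazy load of the page already being viewed
    return "pageview"       # case 2: a different article loaded via the API
```

The "unknown" branch matters in practice: any log line whose referrer lacks a title cannot be classified this way and would need some other signal.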
So I did look into this prior to writing the RFC and the issue is that a
lot of API referrers don't contain the querystring. I don't know what
triggers this so if we can fix this then we can definitely derive the
secondary pageview request from the referrer field.
D
If you can point me to some examples, I'll see if I can find any insights
into the behavior.
On Mon, Feb 11, 2013 at 7:11 PM, Arthur Richards <arichards(a)wikimedia.org> wrote:
> Thanks, Jon. To try and clarify a bit more about the API requests... they
> are not made on a per-section basis. As I mentioned earlier, there are two
> cases in which article content gets loaded by the API:
>
> 1) Going directly to a page (eg clicking a link from a Google search) will
> result in the backend serving a page with ONLY summary section content and
> section headers. The rest of the page is lazily loaded via API request once
> the JS for the page gets loaded. The idea is to increase responsiveness by
> reducing the delay for an article to load (further details in the article
> Jon previously linked to). The API request looks like:
> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&p…
>
> 2) Loading an article entirely via Javascript - like when a link is clicked
> in an article to another article, or an article is loaded via search. This
> will make ONE call to the API to load article content. API request looks
> like:
> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&p…
>
> These API requests are identical, but only #2 should be counted as a
> 'pageview' - #1 is a secondary API request and should not be counted as a
> 'pageview'. You could make the argument that we just count all of these API
> requests as pageviews, but there are cases when we can't load article
> content from the API (like devices that do not support JS), so we need to
> be able to count the traditional page request as a pageview - thus we need
> a way to differentiate the types of API requests being made when they
> otherwise share the same URL.
>
>
>
> On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson <jdlrobson(a)gmail.com> wrote:
>
> > I'm a bit worried that now we are asking why pages are lazy loaded
> > rather than focusing on the fact that they currently __are doing
> > this__ and how we can log these (if we want to discuss this further
> > let's start another thread as I'm getting extremely confused doing so
> > on this one).
> >
> > Lazy loading sections
> > ################
> > The motivation behind moving MobileFrontend in the direction of lazy
> > loading section content and subsequent pages can be found here [1]; I
> > just gave it a refresh as it was a little out of date.
> >
> > In summary, the reasons are to
> > 1) make the app feel more responsive by simply loading content rather
> > than reloading the entire interface
> > 2) reduce the payload sent to a device.
> >
> > Session Tracking
> > ################
> >
> > Going back to the discussion of tracking mobile page views, it sounds
> > like a header stating whether a page is being viewed in alpha, beta or
> > stable works fine for standard page views.
> > >
> > > As for the situations where an entire page is loaded via the api it
> > > makes no dif