It does still seem to me that the data to determine secondary API requests
should already be present in the existing log line. If the value of the
page param in an action=mobileview API request matches the page in the
referrer (perhaps with normalization), it's a secondary request as per case
1 below. Otherwise, it's a pageview as per case 2. Difficult or expensive
to reconcile? Not when you're doing distributed log analysis via Hadoop.
So I did look into this prior to writing the RFC, and the issue is that a
lot of API referrers don't contain the querystring. I don't know what
triggers this, but if we can fix it then we can definitely derive the
secondary pageview request from the referrer field.
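
For what it's worth, the referrer comparison could be sketched roughly as
below. This is only an illustration, not what our log pipeline does: the
function names, the assumption that referrers look like /wiki/Title (or
index.php?title=...), and the normalization rule (spaces to underscores,
first letter uppercased) are all assumptions. Requests with a missing or
unusable referrer come back as 'unknown', which is the problem case
described above.

```python
from urllib.parse import urlsplit, parse_qs, unquote


def normalize_title(title):
    """Assumed normalization: spaces -> underscores, first letter uppercased."""
    title = title.strip().replace(" ", "_")
    return title[:1].upper() + title[1:]


def title_from_referrer(referrer):
    """Return the page title named by the referrer, or None if it has none."""
    if not referrer or referrer == "-":
        return None
    parts = urlsplit(referrer)
    # Assumed pretty-URL form: /wiki/Title
    if parts.path.startswith("/wiki/"):
        return normalize_title(unquote(parts.path[len("/wiki/"):]))
    # Assumed fallback form: /w/index.php?title=Title
    titles = parse_qs(parts.query).get("title")
    return normalize_title(titles[0]) if titles else None


def classify(request_url, referrer):
    """Classify one log line as 'secondary', 'pageview', 'unknown' or 'other'."""
    query = parse_qs(urlsplit(request_url).query)
    if query.get("action") != ["mobileview"] or "page" not in query:
        return "other"  # not a mobileview API request at all
    requested = normalize_title(query["page"][0])
    referring = title_from_referrer(referrer)
    if referring is None:
        return "unknown"  # referrer missing or carries no page title
    # Same page as the referrer: lazy load of the remaining sections (case 1).
    # Different page: article loaded entirely via JS (case 2).
    return "secondary" if requested == referring else "pageview"
```

The 'unknown' bucket is the number to watch: if it is large, the referrer
alone is not enough and we need an explicit marker on the request.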
D
On Mon, Feb 11, 2013 at 7:11 PM, Arthur Richards <arichards(a)wikimedia.org> wrote:
Thanks, Jon. To try and clarify a bit more about the API requests... they
are not made on a per-section basis. As I mentioned earlier, there are two
cases in which article content gets loaded by the API:

1) Going directly to a page (eg clicking a link from a Google search) will
result in the backend serving a page with ONLY summary section content and
section headers. The rest of the page is lazily loaded via API request once
the JS for the page gets loaded. The idea is to increase responsiveness by
reducing the delay for an article to load (further details in the article
Jon previously linked to). The API request looks like:
http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&p…

2) Loading an article entirely via Javascript - like when a link is clicked
in an article to another article, or an article is loaded via search. This
will make ONE call to the API to load article content. API request looks
like:
http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&p…

These API requests are identical, but only #2 should be counted as a
'pageview' - #1 is a secondary API request and should not be counted as a
'pageview'. You could make the argument that we just count all of these API
requests as pageviews, but there are cases when we can't load article
content from the API (like devices that do not support JS), so we need to
be able to count the traditional page request as a pageview - thus we need
a way to differentiate the types of API requests being made when they
otherwise share the same URL.
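
To make the requirement concrete, here is a rough sketch of how a counter
could tell the two cases apart if the JS added a marker to full-page loads.
The parameter name `loadtype` and its value `full` are hypothetical (nothing
has been agreed on), the sample URLs are made up for illustration, and the
assumption that traditional page requests live under /wiki/ is also an
assumption of this sketch.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical marker: the name and value are placeholders, not an agreed API.
MARKER_PARAM = "loadtype"
FULL_PAGE_LOAD = "full"


def is_pageview(url):
    """Decide whether one request URL should be counted as a pageview."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    if parts.path.endswith("/api.php"):
        # Only mobileview calls explicitly marked as full-page loads count
        # (case 2); unmarked calls are secondary lazy loads (case 1).
        return (query.get("action") == ["mobileview"]
                and query.get(MARKER_PARAM) == [FULL_PAGE_LOAD])
    # Traditional page request, e.g. from a device without JS support.
    return parts.path.startswith("/wiki/")


def count_pageviews(urls):
    """Count the pageviews in an iterable of request URLs."""
    return sum(1 for url in urls if is_pageview(url))
```

With this scheme the initial /wiki/ request is the pageview in case 1, and
the marked API call is the pageview in case 2, so nothing is double counted.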
On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson <jdlrobson(a)gmail.com> wrote:
> I'm a bit worried that now we are asking why pages are lazy loaded
> rather than focusing on the fact that they currently _are_ doing
> this and how we can log these requests (if we want to discuss this
> further let's start another thread, as I'm getting extremely confused
> doing so on this one).
>
> Lazy loading sections
> ################
> The motivation behind moving MobileFrontend in the direction of lazy
> loading section content and subsequent pages can be found here [1]; I
> just gave it a refresh as it was a little out of date.
>
> In summary, the reasons are to
> 1) make the app feel more responsive by simply loading content rather
> than reloading the entire interface
> 2) reduce the payload sent to a device.
>
> Session Tracking
> ################
>
> Going back to the discussion of tracking mobile page views, it sounds
> like a header stating whether a page is being viewed in alpha, beta or
> stable works fine for standard page views.
>
> As for the situations where an entire page is loaded via the API, it
> makes no difference to us whether we
> 1) send the same header (set via JavaScript) or
> 2) add a query string parameter.
>
> The only advantage I can see of using a header is that an initial page
> load of the article San Francisco currently uses the same api url as a
> page load of the article San Francisco via javascript (e.g. I click a
> link to 'San Francisco' on the California article).
>
> In this new method they would use different URLs (as the data sent is
> different). I'm not sure how that would affect caching.
>
> Let us know which method is preferred. From my perspective
> implementation of either is easy.
>
> [1] http://www.mediawiki.org/wiki/MobileFrontend/Dynamic_Sections
>
> On Mon, Feb 11, 2013 at 12:50 PM, Asher Feldman <afeldman(a)wikimedia.org> wrote:
> > Max - good answers re: caching concerns. That leaves studying if the bytes
> > transferred on average mobile article view increases or decreases with lazy
> > section loading. If it increases, I'd say this isn't a positive direction
> > to go in and stop there. If it decreases, then we should look at the
> > effect on total latency, number of requests required per pageview, and the
> > impact on backend apache utilization which I'd expect to be > 0.
> >
> > Does the mobile team have specific goals that this project aims to
> > accomplish? If so, we can use those as the measure against which to
> > compare an impact analysis.
>
> > On Mon, Feb 11, 2013 at 12:21 PM, Max Semenik <maxsem.wiki(a)gmail.com> wrote:
> >
> >> On 11.02.2013, 22:11 Asher wrote:
> >>
> >> > And then I'd wonder about the server side implementation. How will
> >> > frontend cache invalidation work? Are we going to need to purge every
> >> > individual article section relative to /w/api.php on edit?
> >>
> >> Since the API doesn't require pretty URLs, we could simply append the
> >> current revision ID to the mobileview URLs.
> >>
> >> > Article HTML in memcached (parser cache), mobile processed HTML in
> >> > memcached.. Now individual sections in memcached? If so, should we
> >> > calculate memcached space needs for article text as 3x the current
> >> > parser cache utilization? More memcached usage is great, not asking
> >> > to dissuade its use but because it's better to capacity plan than to
> >> > react.
> >>
> >> action=mobileview caches pages only in full and serves only sections
> >> requested, so no changes in request patterns will result in increased
> >> memcached usage.
>
> --
> Best regards,
> Max Semenik ([[User:MaxSem]])
>
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
--
Jon Robson
http://jonrobson.me.uk
@rakugojon
--
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687