Page Previews is now fully deployed to all but 2 of the Wikipedias. In
deploying it, we've created a new way to interact with pages without
navigating to them. This impacts the overall and per-page pageviews metrics
that are used in myriad reports, e.g. to editors about the readership of
their articles and in monthly reports to the board. Consequently, we need
to be able to report a user reading the preview of a page just like we do
them navigating to it.
Readers Web are planning to instrument Page Previews such that when a
preview is available and open for longer than X ms, a "page interaction" is
recorded. We're aware of a couple of mechanisms for recording something
like this from the client:
1. All files viewed with the media viewer are recorded by the client
requesting the /beacon/media?duration=X&uri=Y URL at some point  – as
Nuria points out in that thread, requests to /beacon/... are already
filtered and a canned response is sent immediately by Varnish .
2. Requesting a URL with the X-Analytics header  set to "preview". In
this context, we'd make a HEAD request to the URL of the page with the
IMO #1 is preferable from the operations and performance perspectives as
the response is always served from the edge and includes very few headers,
whereas the request in #2 may be served by the application servers if the
user is logged in (or in the mobile site's beta cohort). However, the
requests in #2 are already
We're currently considering recording page interactions when previews are
open for longer than 1000 ms. We estimate that this would increase overall
web requests by 0.3% .
Are there other ways of recording this information? We're fairly confident
that #1 seems like the best choice here but it's referred to as the
"virtual file view hack". Is this really the case? Moreover, should we
request a distinct URL, e.g. /beacon/preview?duration=X&uri=Y, or should we
consolidate the URLs as both represent the same thing essentially?
IRC (Freenode): phuedx