(Moving ops list to bcc)

>Are there other ways of recording this information? We're fairly confident that #1 seems like the best choice here but it's referred to as the "virtual file view hack". Is this really the case?
Yes, there are; please use EventLogging.

Recording "preview_events" is really no different than recording any other kind of UI event. The difference, if anything, will come from scale, as there are probably tens of thousands of those per second. (I think your team has already estimated the volume; if so, please send those estimates along.)

We discourage you from sending events directly to beacon. Rather, use the EL client to send a page-preview event defined in a given schema. This is similar to how we will be measuring impressions for fundraising banners in the future.
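
To make the suggestion concrete, here is a rough sketch of a schema-defined preview event sent through a logging client. Everything named here is an assumption for illustration: the `EventLogger` interface is a stand-in for the EL client and not its actual API, and the schema name `PagePreview` and its fields are hypothetical.

```typescript
// Hypothetical event shape; the real schema would define the actual fields.
interface PagePreviewEvent {
  pageTitle: string;
  durationMs: number;
}

// Stand-in for the EL client (assumption, not the real client API).
interface EventLogger {
  logEvent(schema: string, event: PagePreviewEvent): void;
}

// Hypothetical schema name.
const PREVIEW_SCHEMA = "PagePreview";

// Reject events that could not satisfy the hypothetical schema.
function isValidPreviewEvent(event: PagePreviewEvent): boolean {
  return (
    event.pageTitle.length > 0 &&
    isFinite(event.durationMs) &&
    event.durationMs >= 0
  );
}

// Send the event through the client; returns false if it was dropped.
function logPreview(logger: EventLogger, event: PagePreviewEvent): boolean {
  if (!isValidPreviewEvent(event)) {
    return false;
  }
  logger.logEvent(PREVIEW_SCHEMA, event);
  return true;
}
```

The point of going through a client like this is that the event is validated against a named schema before it leaves the browser, rather than being an ad hoc URL.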



On Wed, Jan 17, 2018 at 1:51 AM, Sam Smith <samsmith@wikimedia.org> wrote:

Page Previews is now fully deployed to all but 2 of the Wikipedias. In deploying it, we've created a new way to interact with pages without navigating to them. This impacts the overall and per-page pageviews metrics that are used in myriad reports, e.g. to editors about the readership of their articles and in monthly reports to the board. Consequently, we need to be able to report a user reading the preview of a page just as we report them navigating to it.

Readers Web are planning to instrument Page Previews such that when a preview is available and open for longer than X ms, a "page interaction" is recorded. We're aware of a couple of mechanisms for recording something like this from the client:
  1. All files viewed with the media viewer are recorded by the client requesting the /beacon/media?duration=X&uri=Y URL at some point [0] – as Nuria points out in that thread, requests to /beacon/... are already filtered and a canned response is sent immediately by Varnish [1].
  2. Requesting a URL with the X-Analytics header [2] set to "preview". In this context, we'd make a HEAD request to the URL of the page with the header set.
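
For comparison, the two mechanisms above might look roughly like this from the client. This is only a sketch: the function names and the `RequestSpec` type are made up for illustration, and the code just describes the requests rather than sending them.

```typescript
// Option 1: build the beacon URL in the /beacon/media?duration=X&uri=Y form.
function mediaBeaconUrl(durationMs: number, uri: string): string {
  const duration = Math.round(durationMs);
  return `/beacon/media?duration=${duration}&uri=${encodeURIComponent(uri)}`;
}

// Hypothetical description of a request, independent of any HTTP library.
interface RequestSpec {
  method: string;
  url: string;
  headers: Record<string, string>;
}

// Option 2: a HEAD request to the page URL with X-Analytics set to "preview".
function previewHeadRequest(pageUrl: string): RequestSpec {
  return {
    method: "HEAD",
    url: pageUrl,
    headers: { "X-Analytics": "preview" },
  };
}
```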
IMO #1 is preferable from the operations and performance perspectives as the response is always served from the edge and includes very few headers, whereas the request in #2 may be served by the application servers if the user is logged in (or in the mobile site's beta cohort). However, the requests in #2 are already 

We're currently considering recording page interactions when previews are open for longer than 1000 ms. We estimate that this would increase overall web requests by 0.3% [3].
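
The threshold rule could be sketched as below. The class and method names are hypothetical, and timestamps are passed in explicitly so the logic can be exercised without a real clock; this is not the actual Page Previews instrumentation.

```typescript
// Threshold under consideration: previews open for longer than 1000 ms.
const PREVIEW_THRESHOLD_MS = 1000;

// Tracks how long a single preview stays open (hypothetical helper).
class PreviewDwell {
  private openedAt: number | null = null;

  // Call when the preview becomes visible.
  open(nowMs: number): void {
    this.openedAt = nowMs;
  }

  // Call when the preview is dismissed. Returns the open duration if it
  // counts as a page interaction, or null if it should not be recorded.
  close(nowMs: number): number | null {
    if (this.openedAt === null) {
      return null;
    }
    const duration = nowMs - this.openedAt;
    this.openedAt = null;
    return duration > PREVIEW_THRESHOLD_MS ? duration : null;
  }
}
```

A preview closed at or before the threshold returns null and would not be recorded.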

Are there other ways of recording this information? We're fairly confident that #1 seems like the best choice here but it's referred to as the "virtual file view hack". Is this really the case? Moreover, should we request a distinct URL, e.g. /beacon/preview?duration=X&uri=Y, or should we consolidate the URLs as both represent the same thing essentially?



Timezone: GMT
IRC (Freenode): phuedx

[0] https://lists.wikimedia.org/pipermail/analytics/2015-March/003633.html
[1] https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/varnish/templates/vcl/wikimedia-frontend.vcl.erb;1bce79d58e03bd02888beef986c41989e8345037$269
