>So maybe it's worth considering which approach takes us closer to that? AIUI the beacon puts the record into the webrequest table and from there it would only take some trivial preprocessing to replace the beacon URL with the virtual URL and add the beacon type as a "virtual_type" field or something, making it very easy to expose it everywhere where views are tracked, while EventLogging data gets stored in a different, unrelated way.
Anything that involves combing through 1 terabyte of data a day and 150,000 requests per second at peak cannot be considered "simple" or "trivial".
Rather than looking for a needle in the haystack, let's please rely on the client to send you preselected data (events). That data can be aggregated later in different ways, and the fact that the data comes from EventLogging does not dictate how aggregation needs to happen.




On Wed, Jan 17, 2018 at 6:09 PM, Gergo Tisza <gtisza@wikimedia.org> wrote:
On Wed, Jan 17, 2018 at 10:54 AM, Nuria Ruiz <nuria@wikimedia.org> wrote:
Recording "preview_events" is really no different than recording any other kind of UI event. The difference is going to come from scale, if anything, as there are probably tens of thousands of those per second (I think your team already estimated volume; if so, please send those estimates along).

Conceptually I think a virtual pageview is a different thing from a UI event (which is how e.g. Google Analytics handles it: there is a method to send an event for the current page and a different method to send a virtual pageview for a different page), and the ideal way it is exposed in an analytics system should be very different. (I would want to see virtual pageviews together with normal pageviews, with some filtering option. If I deploy code that shows previews and converts users from making real pageviews to making virtual pageviews, I want to see how the total pageviews changed in the normal pageview stats; I don't want to have to create that chart and export one dataset from pageviews and one dataset from EventLogging to do that. As a user, I want to see in the fileview API how many people looked at the photo I uploaded; I don't particularly care if they used MediaViewer or not. Etc.)

So maybe it's worth considering which approach takes us closer to that? AIUI the beacon puts the record into the webrequest table and from there it would only take some trivial preprocessing to replace the beacon URL with the virtual URL and add the beacon type as a "virtual_type" field or something, making it very easy to expose it everywhere where views are tracked, while EventLogging data gets stored in a different, unrelated way.
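
For illustration only, the per-record rewrite described above might look roughly like the sketch below. Everything specific in it is an assumption made for the example and not taken from this thread or from the real webrequest schema: the beacon path `/beacon/virtual`, the query parameters `url` and `type`, and the record being a plain dict with a `url` key.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical beacon endpoint; the real path is not given in this thread.
BEACON_PATH = "/beacon/virtual"


def rewrite_beacon(record):
    """Return the record with the beacon URL replaced by the virtual URL.

    `record` is assumed to be a dict with a "url" key (hypothetical shape).
    Records that are not beacon hits are returned unchanged.
    """
    parts = urlsplit(record["url"])
    if parts.path != BEACON_PATH:
        return record  # ordinary request, leave untouched

    query = parse_qs(parts.query)
    # "url" and "type" are assumed parameter names for this sketch.
    virtual_url = query.get("url", [None])[0]
    virtual_type = query.get("type", [None])[0]
    if virtual_url is None:
        return record  # malformed beacon, nothing to rewrite

    rewritten = dict(record)  # copy so the input record is not mutated
    rewritten["url"] = virtual_url
    rewritten["virtual_type"] = virtual_type
    return rewritten
```

The per-record logic is small; whether applying it across the full request stream is cheap is a separate question, and that is the scale concern raised in the reply at the top of this thread.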