>I don't see how this addresses Gergo's larger point about the difference
>between consistently tallying content consumption (pageviews, previews,
>mediaviewer image views) and analyzing UI interactions (which is the main
>use case that EventLogging has been developed and used for).
EventLogging's use cases are events. As we move to a thicker client (more
JavaScript-heavy), you will need to measure events for nearly everything;
whether those are to be considered "content consumption" or "UI
interaction" is not that relevant. Example: video plays are content
consumption and are also "UI interactions".
We are the only major website that does not have a thick client, so this
notion of joining UI interactions and consumption is new to us, but really
it is not that new at all.
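
To make the video-play example concrete, here is a minimal Python sketch of one event stream feeding both a consumption tally and a UI-interaction breakdown. Everything in it is an assumption for illustration: the field names (`action`, `title`, `ts`, `ui_element`), the sample records, and the helper functions are hypothetical and do not reflect an actual EventLogging schema or refinery job.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical event records; field names are illustrative only,
# not a real EventLogging schema.
events = [
    {"action": "play", "title": "File:Example.webm",
     "ts": "2018-01-18T15:03:12Z", "ui_element": "thumbnail"},
    {"action": "play", "title": "File:Example.webm",
     "ts": "2018-01-18T15:47:55Z", "ui_element": "play_button"},
    {"action": "pause", "title": "File:Example.webm",
     "ts": "2018-01-18T15:48:10Z", "ui_element": "play_button"},
    {"action": "play", "title": "File:Other.webm",
     "ts": "2018-01-18T16:01:00Z", "ui_element": "thumbnail"},
]


def hour_bucket(ts):
    """Truncate an ISO 8601 UTC timestamp string to the hour."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:00Z")


def tally_consumption(records):
    """Consumption view: count plays per (title, hour), dropping per-event detail."""
    return Counter(
        (r["title"], hour_bucket(r["ts"]))
        for r in records
        if r["action"] == "play"
    )


def tally_interactions(records):
    """Interaction view: count every action per UI element."""
    return Counter((r["ui_element"], r["action"]) for r in records)


if __name__ == "__main__":
    print(tally_consumption(events))
    print(tally_interactions(events))
```

The same records answer both kinds of question; what differs is which fields are kept and how coarsely they are grouped.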
On Thu, Jan 18, 2018 at 3:17 PM, Tilman Bayer <tbayer(a)wikimedia.org> wrote:
>
> On Thu, Jan 18, 2018 at 8:16 AM, Nuria Ruiz <nuria(a)wikimedia.org> wrote:
>
>> Gergo,
>>
>> >while EventLogging data gets stored in a different, unrelated way
>> Not really. This has changed quite a bit over the last two quarters.
>> EventLogging data now gets preprocessed and refined similarly to how
>> webrequest data is preprocessed and refined. You can have a dashboard on
>> top of some EventLogging schemas on Superset in the same way you have a
>> dashboard that displays pageview data on Superset.
>>
>
> I don't see how this addresses Gergo's larger point about the difference
> between consistently tallying content consumption (pageviews, previews,
> mediaviewer image views) and analyzing UI interactions (which is the main
> use case that EventLogging has been developed and used for). There are
> really quite a few differences between these two. For example, UI
> instrumentations on the web are almost always sampled, because that yields
> enough data to answer UI questions - but on the other hand tend to record
> much more detail about the individual interaction. In contrast, we register
> all pageviews unsampled, but don't keep a permanent record of every single
> one of them with precise timestamps - rather, we have aggregated tables
> (pageview_hourly in particular). Our EventLogging backend is not tailored
> to that.
>
>
>
>>
>> See dashboards on superset (user required).
>>
>>
>> https://superset.wikimedia.org/superset/dashboard/7/?preselect_filters=%7B%7D
>>
>> And (again, user required) EL data on druid, this very same data we are
>> talking about, page previews:
>>
>>
>> https://pivot.wikimedia.org/#tbayer_popups
>>
>
> That's actually not the "very same data we are talking about". You can
> rest assured that the web team (and Sam in particular) has already been
> aware of the existence of the Popups instrumentation for page previews. The
> team spent considerable effort building it in order to understand how users
> interact with the feature's UI. Now comes the separate effort of
> systematically tallying content consumption from this new channel. Superset
> and Pivot are great, but are nowhere near providing all the ways that WMF
> analysts and community members currently have to study pageview data.
> Storing data about seen previews in the same way as we do for pageviews,
> for example in the pageview_hourly table (suitably tagged, and perhaps
> with the table given a more general name), would facilitate that a lot by
> allowing us to largely reuse the work that went into getting pageview
> aggregation right over the past few years.
>
>
>>
>> >I was going to make the point that #2 already has a processing pipeline
>> established whereas #1 doesn't.
>> This is incorrect, we mark as "preview" data that we want to exclude
>> from processing, see:
>>
>> https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/PageviewDefinition.java#L144
>> Naming is unfortunate, but previews are really "preloads", as in requests
>> we make (and cache locally) that may or may not be shown to users.
>>
>>
>> But again, tracking of events is better done on an event based system and
>> EL is such a system.
>>
>>
> Again, tracking of individual events is not the ultimate goal here.
>
>
> --
> Tilman Bayer
> Senior Analyst
> Wikimedia Foundation
> IRC (Freenode): HaeB