> the beacon puts the record into the webrequest table and from there it would only take some trivial preprocessing
‘Trivial’ preprocessing that has to look through 150K requests per second! This is a lot of work!
> tracking of events is better done on an event based system and EL is such a system.
I agree with this too. We really want to discourage people from trying to measure things by searching through the huge haystack of all webrequests. If you want to measure something, emit an event if you can. If it were practical, I’d prefer that we did this for pageviews as well. Currently, we need a complicated definition of what a pageview is, and that definition really only exists in the Java implementation in the Hadoop cluster. It’d be much clearer if app developers had a way to define for themselves what counts as a pageview, and emit that as an event.
This should be the approach that people take when they want to measure something new. Emit an event! This event will get its own Kafka topic, which you can consume to do whatever you like with it, and it will be refined into its own Hive table.
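To make the "emit an event" idea concrete, here is a rough sketch in Python. Everything in it is made up for illustration: `emit_event`, `count_events`, the `AppPageview` and `ButtonClick` schema names, and the in-memory `topics` dict, which only stands in for per-schema Kafka topics. It is not how EventLogging is actually implemented, just the shape of the idea.

```python
import json
import time
from collections import defaultdict

# Hypothetical in-memory stand-in for per-schema topics.
# A real deployment would publish to Kafka instead of a dict of lists.
topics = defaultdict(list)


def emit_event(schema, payload, now=None):
    """Serialize one event and append it to the topic named after its schema."""
    event = {
        "schema": schema,
        "timestamp": time.time() if now is None else now,
        "event": payload,
    }
    topics[schema].append(json.dumps(event, sort_keys=True))
    return event


def count_events(schema):
    """Count events by reading only this schema's topic, not every request."""
    return len(topics[schema])


# The app decides for itself what counts as a pageview and says so explicitly.
emit_event("AppPageview", {"page": "Main_Page", "platform": "android"})
emit_event("AppPageview", {"page": "Kafka", "platform": "ios"})
emit_event("ButtonClick", {"button": "save"})
```

The point of the sketch is that counting `AppPageview` events only touches the events the app chose to emit, instead of filtering every webrequest after the fact.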
> I don’t want to have to create that chart and export one dataset from pageviews and one dataset from eventlogging to do that.