On Thu, Dec 11, 2014 at 4:11 PM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
> is there a way to inspect invalid events in near real time without having access to vanadium?

There's this graph: https://graphite.wikimedia.org/render/?width=586&height=308&_salt=1418343627.977&from=-1weeks&target=movingMedian(diffSeries(eventlogging.overall.raw.rate%2Ceventlogging.overall.valid.rate)%2C20)

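The key part is the target 'diffSeries(eventlogging.overall.raw.rate,eventlogging.overall.valid.rate)', which subtracts the rate of valid events from the rate of all raw events and so gives you the rate of invalid events per second. The surrounding movingMedian(..., 20) only smooths the line.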
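If you would rather have the numbers than the picture, the same target can be requested as JSON. Here is a rough sketch, assuming the Graphite render endpoint above is reachable from wherever you run it and accepts the standard format=json parameter. The helper names are made up for illustration.

```python
import json
from urllib.parse import urlencode

GRAPHITE_RENDER = "https://graphite.wikimedia.org/render/"
INVALID_RATE = (
    "diffSeries(eventlogging.overall.raw.rate,"
    "eventlogging.overall.valid.rate)"
)


def render_url(target, since="-1weeks"):
    """Build a render URL that returns JSON instead of a PNG."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return GRAPHITE_RENDER + "?" + query


def latest_value(payload):
    """Return the newest non-null (value, timestamp) pair, or None.

    `payload` is the JSON text of a Graphite reply: a list of series,
    each with a "datapoints" list of [value, timestamp] pairs.
    """
    series = json.loads(payload)
    if not series:
        return None
    for value, timestamp in reversed(series[0]["datapoints"]):
        if value is not None:
            return value, timestamp
    return None
```

Fetching is then a matter of passing render_url(INVALID_RATE) to urllib.request.urlopen and handing the response body to latest_value.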

It is not broken down by schema, though.

We can't write invalid events to a database -- at least not the same way we write well-formed events. The table schema is derived from the event schema, so an invalid event would violate the constraints of the table as well.
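To make the constraint problem concrete, here is a toy illustration using SQLite. It is not how EventLogging actually builds its tables: the schema, the table name and both helpers are invented for the example, and only the "required field" constraint is modelled.

```python
import sqlite3

# Hypothetical event schema: field name -> (SQL type, required?)
SCHEMA = {
    "pageId": ("INTEGER", True),
    "action": ("TEXT", True),
    "comment": ("TEXT", False),
}


def create_table_sql(name, schema):
    """Derive a CREATE TABLE statement from the event schema."""
    cols = ", ".join(
        f"{field} {sqltype}{' NOT NULL' if required else ''}"
        for field, (sqltype, required) in schema.items()
    )
    return f"CREATE TABLE {name} ({cols})"


def insert_event(conn, name, schema, event):
    """Insert one event; fields missing from the event become NULL."""
    fields = list(schema)
    placeholders = ", ".join("?" for _ in fields)
    conn.execute(
        f"INSERT INTO {name} ({', '.join(fields)}) VALUES ({placeholders})",
        [event.get(field) for field in fields],
    )


conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("ToyEvent", SCHEMA))

# A well-formed event fits the derived table.
insert_event(conn, "ToyEvent", SCHEMA, {"pageId": 42, "action": "save"})

# An event missing the required "action" field violates NOT NULL.
try:
    insert_event(conn, "ToyEvent", SCHEMA, {"pageId": 42})
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Storing invalid events would therefore need a separate, looser table (for example one that keeps the raw payload as text), which is a different design from the per-schema tables.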

It's possible (and easy) to set something up that watches invalid events in real time and does something with them. The question is: what? E-mail an alert? Produce a daily report? Generate a graph?
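As one possible shape for such a watcher, here is a minimal sketch that tallies invalid events per schema, which would also give the per-schema breakdown the graph lacks. It assumes the raw stream can be read as one JSON object per line and that each event names its schema in a "schema" field. Both are assumptions for the sketch, and is_valid stands in for whatever validator is actually used.

```python
import json
from collections import Counter


def tally_invalid(raw_lines, is_valid):
    """Count invalid events per schema name.

    raw_lines: iterable of strings, one serialized event each.
    is_valid:  callable taking the decoded event dict, returning a bool.
    Lines that are not JSON objects are counted under "<unparseable>".
    """
    counts = Counter()
    for line in raw_lines:
        try:
            event = json.loads(line)
        except ValueError:
            counts["<unparseable>"] += 1
            continue
        if not isinstance(event, dict):
            counts["<unparseable>"] += 1
            continue
        if not is_valid(event):
            # str() keeps the key hashable even if the field is malformed.
            counts[str(event.get("schema", "<unknown>"))] += 1
    return counts
```

The resulting Counter could feed any of the options above: compared against a threshold for an alert, dumped once a day for a report, or pushed to Graphite as per-schema metrics.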

If you describe how you'd like to consume the data, I can try to hash out an implementation with Nuria and Christian.