On Fri, Dec 6, 2013 at 10:17 AM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
some thoughts:

• All data in the log DB strictly follows https://meta.wikimedia.org/wiki/Schema:EventCapsule. This includes fields such as seqId and uuid that allow recovery of data from the raw JSON dumps. Should something catastrophic happen, we could restore the entire DB by re-importing raw JSON data, which is guaranteed to match the EventCapsule specs. This wouldn’t apply to any custom table created in the DB with data from a different source.

• For the same reason, should a global change be made to EventCapsule (for example https://bugzilla.wikimedia.org/show_bug.cgi?id=52295), all tables would need to have their schema updated. Hosting custom tables with arbitrary schemas that don't match the EventCapsule specs would make global updates unnecessarily complicated.

• Writing data into the log DB is intentionally restricted to the eventlog user, which was created for the sole purpose of autogenerating tables and writing data into SQL when new schemas are deployed in production. Making an exception to this principle sets a precedent whereby humans and other scripts can arbitrarily manipulate data or create tables in the DB, which is a first step towards turning the log DB into the same zoo that the staging DB is.
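
To make the recovery argument in the first point concrete, here is a rough sketch of re-importing raw JSON events while using uuid to skip duplicates. It is only an illustration under stated assumptions: the one-JSON-object-per-line dump format, the restored_events table and the restore_from_dump helper are all hypothetical, and SQLite stands in for the real log DB.

```python
import json
import sqlite3


def restore_from_dump(lines, conn):
    """Re-import raw JSON events, one object per line (assumed format).

    The uuid field is used as the primary key so that replaying the same
    dump twice does not create duplicate rows. Returns the number of rows
    actually inserted.
    """
    # Hypothetical table; the real log DB has one table per schema.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS restored_events ("
        "uuid TEXT PRIMARY KEY, seqId INTEGER, raw TEXT)"
    )
    inserted = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the dump
        capsule = json.loads(line)
        cur = conn.execute(
            "INSERT OR IGNORE INTO restored_events (uuid, seqId, raw) "
            "VALUES (?, ?, ?)",
            (capsule["uuid"], capsule["seqId"], line),
        )
        inserted += cur.rowcount  # 0 when the uuid was already present
    conn.commit()
    return inserted
```

Because every row carries a uuid, the import is safe to repeat. A custom table filled from another source has no such key to replay against, which is the point made above.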


/me quickly and quietly creeps back into the corner /me came from

That makes sense, Dario. Guys, feel free to set this up however you best see fit, and let me know if you need any help.