@otto: I believe that the high-throughput bulk input use case will be difficult for a relational DB (e.g. PostgreSQL) to handle. It will be interesting to see how well Cassandra can handle the queries that people want to run. Tradeoffs...

@dario: RESTBase and Cassandra are definitely different things; we should take care not to conflate them.

-Toby

On Tue, Jun 9, 2015 at 9:02 AM, Andrew Otto <aotto@wikimedia.org> wrote:
p.s. I will never drink Bud Lite Lime.  Like, never.

You have the wrong attitude. Pretend it is beer-soda, not beer. Beer + Sprite is yummy!

(I’ll let someone else figure out how this advice also applies to the database analogy.)

On Jun 8, 2015, at 19:52, Dan Andreescu <dandreescu@wikimedia.org> wrote:

As always, I'd recommend that we go with tech we are familiar with -- MySQL or Cassandra. We have a Cassandra committer on staff who would be able to answer these questions in detail.

WMF uses PostgreSQL for some things, no? Or is that just in Labs?

Since this data is meant to be fully public and queryable in any way, we could put it in the PostgreSQL instance on Labs. We should check with the Labs folks, perhaps horse-trade some hardware, but I think that would be a splendid solution.

However, and I'm trying to understate this in case people are not familiar with my hyperbolic style, I'd rather drink Bud Lite Lime than use MySQL for this.  MySQL is suited for a lot of things, but analytics is not one of them.

p.s. I will never drink Bud Lite Lime.  Like, never.
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics

