On Sat, Feb 23, 2008 at 2:23 AM, Tim Starling tstarling@wikimedia.org wrote:
David Gerard wrote:
Previous efforts in this direction have worked well in testing on Postgres, but foundered on MySQL not being up to the task. Since the chance of Wikimedia moving off MySQL is about zero, a solution that works on MySQL would be *wonderful*.
There's no need to move off MySQL in order to move on to PostgreSQL. We already have data stored in Lucene, Berkeley DB and Memcached. Although it would be easier for system administration if we could use an existing installation of something, I certainly don't see it as a killer for a PostgreSQL solution.
If, at some point, we have the number of articles per category readily available, can't we just exclude large categories? The rest should work nicely with joins, subqueries, etc. It might even be fastest to get the two full lists of category members (which should cache well) and then do the join in PHP.
We'd lose the ability to do intersections with "living people" and the maintenance and license categories, but compared to having no intersections at all, as we do now, I could live with that for a while :-)
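Something along these lines is roughly what I have in mind (untested sketch, assuming plain mysqli access; the helper functions, the size cut-off and the "category_count" table are made up for illustration, the categorylinks.cl_to / cl_from columns are the real MediaWiki ones, and the per-category count is the hypothetical one mentioned above):

<?php
// Untested sketch: intersect two categories by fetching each member list
// with its own simple query, then doing the "join" in PHP.
// Intersections involving very large categories are refused outright.

define( 'MAX_CATEGORY_SIZE', 5000 ); // arbitrary cut-off for "large"

// Assumes some precomputed per-category article count is queryable;
// "category_count" / cc_pages are placeholders, not the real schema.
function getCategorySize( mysqli $db, $catTitle ) {
	$res = $db->query( sprintf(
		"SELECT cc_pages FROM category_count WHERE cc_title = '%s'",
		$db->real_escape_string( $catTitle ) ) );
	$row = $res->fetch_row();
	return $row ? (int)$row[0] : 0;
}

// categorylinks.cl_to (category name) and cl_from (page id) are the
// actual MediaWiki columns.
function getCategoryMembers( mysqli $db, $catTitle ) {
	$res = $db->query( sprintf(
		"SELECT cl_from FROM categorylinks WHERE cl_to = '%s'",
		$db->real_escape_string( $catTitle ) ) );
	return array_column( $res->fetch_all(), 0 ); // flat list of page ids
}

function intersectCategories( mysqli $db, $catA, $catB ) {
	// Exclude large categories entirely.
	if ( getCategorySize( $db, $catA ) > MAX_CATEGORY_SIZE
		|| getCategorySize( $db, $catB ) > MAX_CATEGORY_SIZE ) {
		return null;
	}
	// Two cheap, cache-friendly queries; the join happens here in PHP.
	return array_intersect( getCategoryMembers( $db, $catA ),
		getCategoryMembers( $db, $catB ) );
}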
Magnus