On Fri, Nov 20, 2009 at 10:47 PM, Ryan Chan <ryanchan404(a)gmail.com> wrote:
> Any reason I would like to ask is why not use
> PostgreSQL?
> Seems MySQL is not suitable for handling large table (e.g. over few
> GB), I just wonder why wikipedia don't use PostgreSQL?
> It should provide better performance.

MySQL is easily capable of handling very large tables if used
properly, and certainly tables the size of Wikipedia's (which aren't
very big by DB standards). Selecting a list of all titles that are not
redirects will take a long time on any database, unless you have
everything in memory, because it requires a table scan -- there's no
index that covers the relevant columns (IIRC). Of course, if you
don't configure MySQL properly, or don't give it a reasonable amount
of hardware, it will perform poorly, but the database is not much
overtaxed on Wikipedia right now.
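
To make the table-scan point concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL, purely so it runs anywhere. The simplified page table, the sample rows, and the index name page_redirect_title are assumptions for illustration, not Wikipedia's real schema or configuration. It asks the query planner how it would fetch all non-redirect titles, first without and then with an index covering both columns.

```python
import sqlite3

# In-memory database with a simplified, hypothetical "page" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE page (
        page_id INTEGER PRIMARY KEY,
        page_title TEXT NOT NULL,
        page_is_redirect INTEGER NOT NULL DEFAULT 0,
        page_text TEXT
    )
    """
)

# Made-up sample data: every third row is flagged as a redirect.
conn.executemany(
    "INSERT INTO page (page_title, page_is_redirect) VALUES (?, ?)",
    [("Title_%d" % i, int(i % 3 == 0)) for i in range(1000)],
)

QUERY = "SELECT page_title FROM page WHERE page_is_redirect = 0"


def query_plan(connection):
    """Return the planner's description of how QUERY would be executed."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " ".join(row[-1] for row in rows)


# Without a suitable index the planner has to scan the whole table.
plan_before = query_plan(conn)

# An index containing both the filter column and the selected column
# lets the query be answered from the index alone (a "covering" index).
conn.execute(
    "CREATE INDEX page_redirect_title ON page (page_is_redirect, page_title)"
)
plan_after = query_plan(conn)

titles = [row[0] for row in conn.execute(QUERY)]

print("before:", plan_before)
print("after: ", plan_after)
print("non-redirect titles:", len(titles))
```

The exact plan wording differs between SQLite versions, and MySQL's EXPLAIN output looks different again, so treat the printed text as indicative only. Even with such an index, returning most of the rows of a large table still takes time; the index just avoids reading the full rows.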

It's also worth pointing out that Wikipedia uses a version of MySQL
with substantial modifications, and Wikimedia sysadmins are very
familiar with its behavior. Switching to a new technology might
theoretically be better in the long term (although I wouldn't take
that for granted in this case), but the transition cost would be
substantial. Heck, Wikipedia hasn't even upgraded to MySQL 4.1, let
alone a whole different DBMS.