On Fri, Nov 20, 2009 at 10:47 PM, Ryan Chan <ryanchan404@gmail.com> wrote:
> What I would like to ask is: why not use PostgreSQL?
> It seems MySQL is not suitable for handling large tables (e.g. over a few GB), so I just wonder why Wikipedia doesn't use PostgreSQL.
> It should provide better performance.
MySQL is easily capable of handling very large tables if used properly, and certainly tables the size of Wikipedia's, which aren't very big by database standards. Selecting a list of all titles that are not redirects will take a long time on any database, unless you have everything in memory, because it requires a table scan -- there's no index that covers the relevant columns (IIRC). Of course, if you don't configure MySQL properly, or don't give it a reasonable amount of hardware, it will perform poorly, but the database is not particularly overtaxed on Wikipedia right now.
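To make the table-scan point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. It is not MySQL and not Wikipedia's actual schema: the three-column page table, the sample rows, and the index name idx_redirect_title are all assumptions made up for illustration. It only shows the general pattern, namely that the same query is planned as a full scan until an index covering both the filter column and the selected column exists.

```python
import sqlite3


def query_plan(conn, sql):
    """Return SQLite's query-plan description for `sql` as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")

# Hypothetical, heavily simplified stand-in for a page table.
conn.execute(
    "CREATE TABLE page ("
    " page_id INTEGER PRIMARY KEY,"
    " page_title TEXT NOT NULL,"
    " page_is_redirect INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO page (page_title, page_is_redirect) VALUES (?, ?)",
    [("Title_%d" % i, int(i % 5 == 0)) for i in range(1000)],
)

QUERY = "SELECT page_title FROM page WHERE page_is_redirect = 0"

# No index mentions page_is_redirect, so every row has to be examined.
plan_before = query_plan(conn, QUERY)

# An index holding both the filter column and the selected column
# lets the query be answered from the index alone.
conn.execute(
    "CREATE INDEX idx_redirect_title ON page (page_is_redirect, page_title)"
)
plan_after = query_plan(conn, QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should report a scan of the table and the second a search using the covering index. Even then, a query that returns most of the rows still has to read a lot of data; the index only avoids touching the full rows.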
It's also worth pointing out that Wikipedia uses a version of MySQL with substantial modifications, and Wikimedia sysadmins are very familiar with its behavior. Switching to a new technology might theoretically be better in the long term (although I wouldn't take that for granted in this case), but the transition cost would be substantial. Heck, Wikipedia hasn't even upgraded to MySQL 4.1, let alone a whole different DBMS.