On Thu, Feb 21, 2008 at 12:56 PM, Alex Powell <alexp@exscien.com> wrote:
> Yes, live updates could go to both tables, or you can just do batch updates in the scripts that catch up with the live table. The main idea is not to lock the table to writes. I presumed that it would affect the site: how do edits proceed whilst the table is locked?
The magic of replication. :) The one doing the update goes through each slave in turn: taking it out of rotation, applying the change, re-adding it to rotation, and letting it catch up to the master, then repeating with the next slave. Finally the master is switched, so that a former slave (with the updated schema) is the new master, and the process is repeated one last time for the old master. The ALTER still locks the tables, but it does so while the slave is offline, so it doesn't block anything, and there are enough slaves that the site doesn't suffer too much from being short one for a while.
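Roughly, that procedure might look like the sketch below (Python, using the MySQLdb driver; the Pool class and its add/remove methods are hypothetical stand-ins for whatever load-balancer hooks are actually in use, and the table, host, and credential names are made up for illustration):

    import time
    import MySQLdb  # any DB-API driver would do
    from MySQLdb.cursors import DictCursor

    # Example change; the real ALTER depends on the migration at hand.
    ALTER = "ALTER TABLE page ADD page_len INT UNSIGNED NOT NULL DEFAULT 0"

    class Pool:
        """Hypothetical stand-in for the real load balancer."""
        def remove(self, host):
            print("taking %s out of rotation" % host)
        def add(self, host):
            print("returning %s to rotation" % host)

    def wait_for_catchup(cur):
        # Poll until the slave has replayed everything the master logged
        # while the ALTER was running.
        while True:
            cur.execute("SHOW SLAVE STATUS")
            status = cur.fetchone()
            if status["Seconds_Behind_Master"] == 0:
                return
            time.sleep(5)

    def migrate_slave(host, pool):
        pool.remove(host)  # stop routing reads to this slave
        conn = MySQLdb.connect(host=host, user="dba", passwd="secret",
                               db="wikidb", cursorclass=DictCursor)
        cur = conn.cursor()
        cur.execute(ALTER)     # locks the table, but only on this offline slave
        wait_for_catchup(cur)  # let replication catch up before rejoining
        pool.add(host)         # put the slave back into rotation
        conn.close()

    pool = Pool()
    for slave in ["db2", "db3", "db4"]:
        migrate_slave(slave, pool)
    # Then promote an already-altered slave to master and run
    # migrate_slave() one last time against the old master.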
Of course, this depends on the schemas being similar enough that replicated statements can still be executed on the new database. But since, helpfully, Wikimedia develops its own software, it can ensure that that's always the case. It's mostly a matter of ordering everything right: add any new fields, then update the software so it no longer needs the old fields, then remove the old fields.
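For a concrete (and entirely hypothetical, column names included) example of that ordering, with each statement rolled out slave-by-slave using something like migrate_slave() above:

    PHASES = [
        # 1. Add the new field first. Code that doesn't know about it keeps
        #    working, and statements replicated from an un-upgraded master
        #    still apply cleanly.
        "ALTER TABLE page ADD page_len INT UNSIGNED NOT NULL DEFAULT 0",
        # 2. (Deploy software that uses page_len and no longer touches
        #    page_old_len.)
        # 3. Only then drop the old field, once nothing reads or writes it.
        "ALTER TABLE page DROP page_old_len",
    ]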
> Re MySQL, I expect that it's a hard bit of code for a vanishingly small number of users (cf. most MediaWiki updates are only targeted at the core audience). After all, ALTER TABLE on a table of a few hundred thousand records is still quite quick. Maybe the later versions are better at this?
They aren't. Falcon, for MySQL 6.0, is supposed to support online ALTER TABLE, but I don't think implementation of that feature has started yet, and the feature list is subject to change this early on. It's a real pain for anyone who has a decent-sized database, so I figure that if it weren't hard to do, it would be in the software already.