On Thu, Feb 21, 2008 at 12:56 PM, Alex Powell <alexp(a)exscien.com> wrote:
> Yes live updates could go to both tables, or you can just do batch
> updates in the scripts that catch up with the live table. The main
> idea is not to lock the table to writes - I presumed that it would
> affect the site, as how do edits proceed whilst the table is locked??
The magic of replication. :) Whoever is doing the update goes through
each slave in turn: taking it out of rotation, applying the change,
re-adding it to rotation and letting it catch up to the master, then
repeating with the next slave. Finally the master is switched, so that
a former slave (with the updated schema) becomes the new master, and
the process is repeated one last time for the old master. The ALTER
still locks the tables, but it does so while the slave is offline, so
it doesn't block anything, and there are enough slaves that the site
doesn't suffer too much from being short one for a while.
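
Roughly, the loop looks like this (just a sketch -- the host names,
the depool/repool calls, the example DDL and the catch-up check are
hypothetical stand-ins for whatever tooling actually manages the pool
and replication state):

import time

SLAVES = ["db2", "db3", "db4"]   # hypothetical slave hosts
ALTER = "ALTER TABLE revision ADD rev_len INT UNSIGNED"  # example DDL

def depool(host):
    # take the server out of read rotation (stub)
    print("depooling", host)

def repool(host):
    # put it back into rotation (stub)
    print("repooling", host)

def run_sql(host, sql):
    # run the statement directly on that one server (stub)
    print(host + ":", sql)

def seconds_behind_master(host):
    # in reality this would come from SHOW SLAVE STATUS
    return 0

for slave in SLAVES:
    depool(slave)                 # the table lock now can't block the site
    run_sql(slave, ALTER)
    while seconds_behind_master(slave) > 0:
        time.sleep(5)             # let it catch up to the master
    repool(slave)

# then promote one of the altered slaves to master and repeat the
# depool / alter / repool step once more for the old master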
Of course, this depends on the schemas being similar enough that
replicated statements can still be executed on the new schema. But
since, helpfully, Wikimedia develops its own software, it can ensure
that that's always the case. It's mostly a matter of ordering
everything right: add any new fields, then update the software so it
no longer needs the old fields, then remove the old fields.
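
Concretely, a field rename might be staged like this (the table and
column names below are invented for the example):

# step 1: rolling ALTER that old code simply ignores
STEP_1 = "ALTER TABLE page ADD page_len_new INT UNSIGNED"

# step 2: deploy code that reads and writes only page_len_new, so no
#         replicated statement references the old column any more

# step 3: another rolling ALTER once nothing uses the old column
STEP_3 = "ALTER TABLE page DROP COLUMN page_len_old"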
> RE MySQL I expect that it's a hard bit of code for a diminishingly
> small number of users (qv most mediawiki updates only targeted at
> core audience). After all, ALTER TABLE on a few hundred thousand
> record table is still quite quick. Maybe the later versions are
> better at this?
They aren't. Falcon, for MySQL 6.0, is supposed to support online
ALTER TABLE, but I don't think implementation of that feature has
started yet, and the feature list is subject to change this early on.
It's really a big pain for anyone who has a decent-sized database, so
I figure if it weren't hard to do, it would be in the software
already.