On Sun, Sep 26, 2004 at 01:35:25PM -0700, Brion Vibber wrote:
> A possible alternative which has been brought up before is to avoid
> changing the structure of the old table, instead pulling its
> non-textual data out to a separate table and continuing to use the old
> table (unaltered) as a store for old_text. This avoids copying most of
> the data around and should in theory be faster.
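Under that scheme a revision would presumably be read by joining the two tables on old_id: metadata from the new table, text left behind in the unaltered old one. A rough sketch (the join itself and the example id are assumptions, not code from either proposal):

```sql
-- Hypothetical read path after the split:
-- metadata comes from new_old, text stays in the old table.
SELECT n.old_title, n.old_timestamp, o.old_text
FROM new_old AS n
JOIN old AS o ON o.old_id = n.old_id
WHERE n.old_id = 12345;   -- 12345 is a placeholder revision id
```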
Quick benchmark on Diderot, using dewiki (one third the size of enwiki):
INSERT INTO new_old (old_id, old_namespace, old_title, old_comment,
old_user, old_user_text, old_timestamp, old_minor_edit, old_flags)
SELECT old_id, old_namespace, old_title, old_comment,
old_user, old_user_text, old_timestamp, old_minor_edit, old_flags
FROM old;
Query OK, 2498816 rows affected (24 min 52.30 sec)
The table was created by:

CREATE TABLE new_old (
  old_id int(8) unsigned NOT NULL PRIMARY KEY,
  old_namespace tinyint(2) unsigned,
  old_title varchar(255) binary,
  old_comment tinyblob,
  old_user int(5) unsigned,
  old_user_text varchar(255) binary,
  old_timestamp varchar(14) binary,
  old_minor_edit tinyint(1),
  old_flags tinyblob
);
The copy query can probably be run while the wiki is online; the transition
would then need only a short downtime to catch up on the latest changes and
to migrate to the new software release.
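Catching up during that downtime could be the same INSERT ... SELECT restricted to rows added since the bulk copy. A sketch, assuming old_id is assigned monotonically (which is how the primary key behaves here, but the WHERE clause is my assumption, not part of the benchmark):

```sql
-- Copy only the rows created after the initial bulk copy.
INSERT INTO new_old (old_id, old_namespace, old_title, old_comment,
                     old_user, old_user_text, old_timestamp,
                     old_minor_edit, old_flags)
SELECT old_id, old_namespace, old_title, old_comment,
       old_user, old_user_text, old_timestamp, old_minor_edit, old_flags
FROM old
WHERE old_id > (SELECT MAX(old_id) FROM new_old);
```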
Regards,
JeLuF
PS: Lies, damn lies, and benchmarks. The test setup was not representative:
Ariel performs better than Diderot, but also carries additional workload
at the same time.