Brion Vibber wrote:
I'm currently doing a test run of pulling migration data from the user tables; the slow part seems to be getting the edit counts.
A couple quick stats:
Migration pass 0 (collecting user data for each wiki) takes about 2 hours. On the live run this may require locking out new user registrations temporarily.
Migration pass 1 (zipping through combined account data, trying to automatically merge) takes about 5 hours. It should also be possible to run these merge checks on demand, so new registrations and logins can continue during the batch process.
The pass-1 checks run without user interaction, so they cannot do same-password verification; the number of remaining conflicts therefore stays artificially high until interactive logins have a chance to merge more accounts.
| NULL    |  134242 |
| primary | 4457075 |
| empty   |  373044 |
| mail    |  205084 |
4,457,075 primary accounts (most edits for that username)
  373,044 non-primary accounts auto-merged as unused
  205,084 non-primary accounts auto-merged due to matching e-mail
  134,242 non-primary accounts left unmerged at this stage
71,741 unique usernames have accounts left unmerged.
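For anyone trying to picture how the four buckets above come about, here is a rough Python sketch of the pass-1 classification for one username. It is an illustration only, not the actual migration code: the LocalAccount type and classify() function are made-up names, "unused" is assumed to mean zero edits, the e-mail is assumed to be compared against the primary account's address, and the order of the checks (empty before mail) is a guess. The tally at the end just adds up the counts quoted above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LocalAccount:
    """One per-wiki account row (hypothetical shape, for illustration)."""
    wiki: str
    name: str
    edit_count: int
    email: Optional[str]


def classify(accounts: List[LocalAccount]) -> Dict[str, Optional[str]]:
    """Classify all local accounts that share one username.

    Returns wiki -> 'primary', 'empty', 'mail', or None (left unmerged).
    Expects a non-empty list; ties for most edits go to the first account.
    """
    # Primary account: the one with the most edits for that username.
    primary = max(accounts, key=lambda a: a.edit_count)
    result: Dict[str, Optional[str]] = {}
    for acct in accounts:
        if acct is primary:
            result[acct.wiki] = "primary"
        elif acct.edit_count == 0:
            # Assumption: "unused" means the account has no edits.
            result[acct.wiki] = "empty"
        elif acct.email and acct.email == primary.email:
            # Assumption: match against the primary account's e-mail.
            result[acct.wiki] = "mail"
        else:
            # No password check is possible without a login, so it stays.
            result[acct.wiki] = None
    return result


example = [
    LocalAccount("enwiki", "Example", 120, "a@example.org"),
    LocalAccount("dewiki", "Example", 0, None),
    LocalAccount("frwiki", "Example", 3, "a@example.org"),
    LocalAccount("jawiki", "Example", 7, "other@example.org"),
]
print(classify(example))
# {'enwiki': 'primary', 'dewiki': 'empty', 'frwiki': 'mail', 'jawiki': None}

# Tally of the counts reported in this message.
counts = {"primary": 4457075, "empty": 373044, "mail": 205084, None: 134242}
total = sum(counts.values())            # 5169445 accounts in all
unmerged_share = counts[None] / total   # roughly 2.6% left for pass 2
```

On these numbers, about 2.6% of all accounts are still unmerged after pass 1, which is the pool the interactive password checks would work through.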
Migration pass 2 involves the interactive checks when you log in, so it can also merge accounts with matching passwords.
In a day or two I'll set up test.wikipedia.org for interactive migration testing. This will allow people to try out the UI, and give a chance to see how it looks so we can make sure the UI translations get done soon...
Will try fiddling with this tonight...
-- brion vibber (brion @ pobox.com)