InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 1289395.
InnoDB: You may have to recover from a backup.
031205 7:15:02 InnoDB: Page dump in ascii and hex (16384 bytes):
len 16384; hex d40044c50013acb30013acb10013acb400000006f5bb4a7745bf00000000000000000000 00000002380500070
[snip some dumped data showing various snippets of wiki pages]
93.170.250.70]] 11:06,.t.e.0....Jw;InnoDB: End of page dump
031205 7:15:02 InnoDB: Page checksum 1585856272, prior-to-4.0.14-form checksum 2519793357
InnoDB: stored checksum 3556787397, prior-to-4.0.14-form stored checksum 2519793357
InnoDB: Page lsn 6 4122692215, low 4 bytes of lsn at page end 4122692215
InnoDB: Page may be an update undo log page
InnoDB: Page may be an index page where index id is 0 47
InnoDB: and table enwiki/cur index cur_id
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 1289395.
InnoDB: You may have to recover from a backup.
InnoDB: It is also possible that your operating
InnoDB: system has corrupted its own file cache
InnoDB: and rebooting your computer removes the
InnoDB: error.
InnoDB: If the corrupt page is an index page
InnoDB: you can also try to fix the corruption
InnoDB: by dumping, dropping, and reimporting
InnoDB: the corrupt table. You can use CHECK
InnoDB: TABLE to scan your table for corruption.
InnoDB: Look also at section 6.1 of
InnoDB: http://www.innodb.com/ibman.html about
InnoDB: forcing recovery.
InnoDB: Ending processing because of a corrupt database page.
It tried to restart itself several times with the same error, then gave up. A manual restart of mysql got it back on its feet, at least mostly...?
-- brion vibber (brion @ pobox.com)
On Dec 4, 2003, at 23:55, Brion Vibber wrote:
InnoDB: Database page corruption on disk or a failed InnoDB: file read of page 1289395. InnoDB: You may have to recover from a backup.
I'm still fighting with this. The docs recommend rebooting the machine to see whether the operating system's buffer cache is corrupted while the file itself is fine. I haven't tried that: I don't even know for sure that the machine will come back if I do, and we have no way to power cycle it remotely yet. I did try reading a huge portion of the InnoDB data file through memory to see if that would flush the particular page out of the cache, and this had no effect.
I tried the 'max' and 'standard' AMD64 binary builds of MySQL 4.0.16, and then compiled it from source. Switching between them has no effect.
The DB will croak spontaneously from time to time, and I can consistently provoke it by running "check table cur" on enwiki.
I tried to see if I could pull together a fresh copy of everything by rebuilding the databases from the binlog, but the binlog doesn't seem to include the statements that create the databases (or something like that), and I'm not sure what exactly is required to make such a rebuild happen cleanly.
However, every query that updated any data in the last three days _is_ in the binlog, so in the worst case it should be possible to rebuild it...
I'm going to try to bang on this for a while longer...
-- brion vibber (brion @ pobox.com)
On Dec 5, 2003, at 02:21, Brion Vibber wrote:
I'm going to try to bang on this for a while longer...
The corrupted page -- at least the one that I was always seeing ref'd in the log -- seems to have included only bits of enwiki.cur and dewiki.old. I restarted the server with innodb_force_recovery=1 set so it would skip over any corrupted rows, dumped both tables, then restarted again and read them back in. So far, so good.
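For anyone repeating the dump-and-reload procedure described above, here is a rough Python sketch that only builds the shell commands for it. It is not the script that was actually run: the dump_dir path and the file naming are hypothetical, mysqld is assumed to be restarted by hand (with innodb_force_recovery = 1 under [mysqld] in my.cnf for the dump phase, and without it for the reload), and dropping the damaged tables before reloading is left to the operator.

```python
import shlex


def dump_and_reload_plan(tables, dump_dir="/tmp/recovery"):
    """Build shell commands to dump and later reload (database, table) pairs.

    dump_dir and the file names are hypothetical. Nothing is executed here;
    the caller decides when to run each phase.
    """
    dumps, reloads = [], []
    for db, table in tables:
        path = f"{dump_dir}/{db}.{table}.sql"
        # Phase 1: run while mysqld is up with innodb_force_recovery = 1.
        dumps.append(
            f"mysqldump {shlex.quote(db)} {shlex.quote(table)} > {shlex.quote(path)}"
        )
        # Phase 2: run after restarting mysqld without forced recovery
        # (and after dropping the damaged table, which is not shown).
        reloads.append(f"mysql {shlex.quote(db)} < {shlex.quote(path)}")
    return dumps, reloads


if __name__ == "__main__":
    dumps, reloads = dump_and_reload_plan([("enwiki", "cur"), ("dewiki", "old")])
    for cmd in dumps + reloads:
        print(cmd)
```

Keeping the two phases as separate lists makes it harder to run a reload while the server is still in forced-recovery mode.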
I'm checking all the tables and taking wikis out of read-only lock as they pass.
-- brion vibber (brion @ pobox.com)
On Dec 5, 2003, at 03:51, Brion Vibber wrote:
I'm checking all the tables and taking wikis out of read-only lock as they pass.
There've been no further events noted in mysql's error log so far. The linkscc table on jawiki got left corrupted because 'check table' only _pointed out_ the error and didn't correct it; it needed a 'repair table' to actually fix the problem.
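Since 'check table' only reports problems, a pass over many tables needs a second step that collects the ones still needing attention. A minimal Python sketch of that bookkeeping follows. It assumes the result rows arrive as (Table, Op, Msg_type, Msg_text) tuples, which is the column layout CHECK TABLE reports; the function name and the sample rows are made up for illustration. Note also that REPAIR TABLE does not apply to InnoDB tables, so the follow-up differs by table type.

```python
def tables_needing_repair(check_rows):
    """Return the tables that CHECK TABLE reported an error for, in first-seen order.

    check_rows: iterable of (Table, Op, Msg_type, Msg_text) tuples.
    This sketch only looks for rows whose Msg_type is "error".
    """
    flagged = []
    for table, _op, msg_type, _msg_text in check_rows:
        if msg_type == "error" and table not in flagged:
            flagged.append(table)
    return flagged


if __name__ == "__main__":
    # Illustrative rows only; the message text is invented.
    rows = [
        ("jawiki.linkscc", "check", "error", "example error text"),
        ("jawiki.cur", "check", "status", "OK"),
    ]
    for name in tables_needing_repair(rows):
        print(f"needs follow-up: {name}")
```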
The backup dumps I made last night look to be of reasonable size and nothing barfed while creating them. So, so far so good.
Nonetheless it's rather worrying that there was any data corruption to begin with!
-- brion vibber (brion @ pobox.com)
wikitech-l@lists.wikimedia.org