Hello,
As some of you might have noticed, s7 is badly corrupted.
Before I go into the details: we will most likely have to set up s7 again due to an InnoDB failure, and I need to find a workaround until we have the data in place, which can take days or even weeks.
Now the details. Some of you with database knowledge might have an idea, or something to say about my idea for a workaround.
The replication failed several times in the past days and I did not know why. I simply skipped the slave query when it failed, and then replication ran again. Today the database process restarted repeatedly, without a slave query having failed.
I had a close look and came to the conclusion that a broken transaction in the transaction log made it break. So I stopped the transaction from being played into the mysql db when the database restarts by setting innodb_force_recovery = 3.
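For reference, the innodb_force_recovery levels are cumulative: each level includes the restrictions of the lower ones. The sketch below lists them under the names used in the MySQL reference manual. The helper name active_restrictions and the one-line descriptions are illustrative paraphrases, not manual text. As far as the manual describes it, level 3 keeps InnoDB from rolling back incomplete transactions after crash recovery.

```python
# innodb_force_recovery levels, named as in the MySQL reference manual.
# The descriptions are short paraphrases written for this sketch.
FORCE_RECOVERY_LEVELS = {
    1: ("SRV_FORCE_IGNORE_CORRUPT", "keep the server running even if a corrupt page is detected"),
    2: ("SRV_FORCE_NO_BACKGROUND", "do not run the main background thread (no purge)"),
    3: ("SRV_FORCE_NO_TRX_UNDO", "do not roll back incomplete transactions after recovery"),
    4: ("SRV_FORCE_NO_IBUF_MERGE", "do not merge the insert buffer"),
    5: ("SRV_FORCE_NO_UNDO_LOG_SCAN", "do not look at the undo logs on startup"),
    6: ("SRV_FORCE_NO_LOG_REDO", "do not roll the redo log forward"),
}


def active_restrictions(level):
    """Return every (name, description) pair in effect at the given level."""
    if not 0 <= level <= 6:
        raise ValueError("innodb_force_recovery must be between 0 and 6")
    # Each level includes all lower levels, so collect 1..level.
    return [FORCE_RECOVERY_LEVELS[n] for n in range(1, level + 1)]


if __name__ == "__main__":
    for name, meaning in active_restrictions(3):
        print(f"{name}: {meaning}")
```

Calling active_restrictions(0) returns an empty list, which corresponds to a normal start.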
Ok, fine. MySQL then starts. Cool. But the slave process won't run in this mode, so we don't get the new data. Hm. So I tried to throw away the broken transaction. I moved the InnoDB log files (ib_logfile*) away and started mysql again.
MySQL then failed to come up, telling me:
121116 11:40:28 InnoDB: Error: page 7 log sequence number 270 492619208
InnoDB: is in the future! Current system log sequence number 268 2967383564.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/forcing-innodb-recovery.html
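To make the two numbers in this message comparable, here is a minimal sketch. It assumes that this MySQL version prints a log sequence number as two 32-bit halves (high word, then low word), which matches the "270 492619208" form above. The function names lsn and lsn_gap are made up for illustration.

```python
import re

# The error text from the log above, joined into one string.
LOG = (
    "121116 11:40:28 InnoDB: Error: page 7 log sequence number 270 492619208 "
    "InnoDB: is in the future! Current system log sequence number 268 2967383564."
)


def lsn(high, low):
    """Combine the two printed 32-bit halves into one 64-bit LSN (assumed format)."""
    return (int(high) << 32) | int(low)


def lsn_gap(message):
    """Return (page_lsn, system_lsn, page_lsn - system_lsn) parsed from the error text."""
    page = re.search(r"page \d+ log sequence number (\d+) (\d+)", message)
    system = re.search(r"Current system log sequence number (\d+) (\d+)", message)
    if page is None or system is None:
        raise ValueError("message does not look like an 'is in the future' error")
    page_lsn = lsn(*page.groups())
    system_lsn = lsn(*system.groups())
    return page_lsn, system_lsn, page_lsn - system_lsn


if __name__ == "__main__":
    page_lsn, system_lsn, gap = lsn_gap(LOG)
    print(f"page LSN {page_lsn} is {gap} ahead of system LSN {system_lsn}")
```

Under that assumption the page is stamped roughly 6.1 billion LSN units ahead of what the log files know about, which fits the hint in the message itself: the tablespace is newer than the log files it was started with.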
Ok. MySQL does not come up then and repeatedly restarts. No luck. Copied the log files back. Fine. Works again.
Now I tried several things to check which table might be corrupted. innochecksum reported everything fine. mysqlcheck crashed the mysql daemon when accessing centralauth.localnames. Oh? Why? Checking the table again crashes mysql. Hm. Tried a repair table - "storage engine does not support this"...hm.
The log says
InnoDB: Page lsn 268 3672100478, low 4 bytes of lsn at page end 3672100478
InnoDB: Page number (if stored to page already) 192520,
InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 428
InnoDB: Page may be an index page where index id is 0 1174
InnoDB: (index "PRIMARY" of table "centralauth"."localnames")
InnoDB: Error in page 192520 of index "PRIMARY" of table "centralauth"."localnames"
121116 13:02:46 - mysqld got signal 11 ;
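A small sketch of how the damaged page could be located from this log output. It pulls the page number, index and table out of the "Error in page" line and converts the page number into a byte offset inside the tablespace file. The 16 KiB page size is an assumption (the InnoDB default), and parse_corruption is a name invented for this example.

```python
import re

PAGE_SIZE = 16 * 1024  # assumption: default InnoDB page size of 16 KiB

# The relevant part of the log output above, joined into one string.
LOG = (
    'InnoDB: Page number (if stored to page already) 192520, '
    'InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 428 '
    'InnoDB: Error in page 192520 of index "PRIMARY" of table "centralauth"."localnames"'
)


def parse_corruption(message):
    """Extract page, index, schema and table from an 'Error in page' log line."""
    match = re.search(
        r'Error in page (\d+) of index "([^"]+)" of table "([^"]+)"\."([^"]+)"',
        message,
    )
    if match is None:
        raise ValueError("no 'Error in page' line found")
    page_no = int(match.group(1))
    return {
        "page": page_no,
        "index": match.group(2),
        "schema": match.group(3),
        "table": match.group(4),
        # Pages are laid out back to back, so page N starts at N * page size.
        "byte_offset": page_no * PAGE_SIZE,
    }


if __name__ == "__main__":
    info = parse_corruption(LOG)
    print(info)
```

With the assumed page size, page 192520 starts about 3.15 GB into the tablespace file, which gives an idea of where to look if someone wants to inspect the file directly.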
I tried to remove the index to rebuild it but this does not work due to innodb_force_recovery = 3.
Mysqldump fails - crashes the mysql daemon too.
So I don't have any more ideas on how to fix this error.
Now I thought that if we have to set s7 up again anyway, I could drop the table completely and start mysql in normal mode so that replication works again. This would only mean that s7 lacks this table until it is set up again.
What do you think about this? Any more ideas?
Cheers Marlen/nosy
On 16/11/12 21:44, Marlen Caemmerer wrote:
> Now I tried several things to check which table might be corrupted. innochecksum reported everything fine. mysqlcheck crashed the mysql daemon when accessing centralauth.localnames. Oh? Why? Checking the table again crashes mysql. Hm. Tried a repair table - "storage engine does not support this"...hm.
> The log says
> InnoDB: Page lsn 268 3672100478, low 4 bytes of lsn at page end 3672100478
> InnoDB: Page number (if stored to page already) 192520,
> InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 428
> InnoDB: Page may be an index page where index id is 0 1174
> InnoDB: (index "PRIMARY" of table "centralauth"."localnames")
> InnoDB: Error in page 192520 of index "PRIMARY" of table "centralauth"."localnames"
> 121116 13:02:46 - mysqld got signal 11 ;
> I tried to remove the index to rebuild it but this does not work due to innodb_force_recovery = 3.
You may need to set innodb_force_recovery back to 0 before being able to drop the index (we got lucky there if it's just the index).
http://dev.mysql.com/doc/refman/5.1/en/forcing-innodb-recovery.html:
"The database must not otherwise be used with any nonzero value of innodb_force_recovery. As a safety measure, InnoDB prevents users from performing INSERT, UPDATE, or DELETE operations when innodb_force_recovery is greater than 0."
(I would have expected DROP INDEX to work, as DROP or CREATE of tables is allowed under innodb_force_recovery, but it seems it is not.)
> Mysqldump fails - crashes the mysql daemon too.
I expect it only crashes when dumping that table :)
You may be able to get the values out of the table by forcing usage of the other index (ln_name, ln_wiki). Although it could be safer to get a new dump for that table from WMF, or a mixture of them (such as getting a dump at both sides, then rsyncing).
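A rough sketch of what forcing the other index could look like, written as a small Python helper that only builds the SQL text. The index name ln_name is a guess (check it with SHOW CREATE TABLE first), and force_index_select and quote_ident are names made up for this example. The idea relies on InnoDB secondary index entries also carrying the primary key columns, so a query that selects only ln_name and ln_wiki might be answered without touching the damaged PRIMARY index. This has not been tried against the broken server.

```python
def quote_ident(name):
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def force_index_select(schema, table, index_name, columns):
    """Build a SELECT that reads only `columns` and hints the given index."""
    if not columns:
        raise ValueError("at least one column is required")
    cols = ", ".join(quote_ident(c) for c in columns)
    return (
        f"SELECT {cols} FROM {quote_ident(schema)}.{quote_ident(table)} "
        f"FORCE INDEX ({quote_ident(index_name)}) ORDER BY {cols}"
    )


if __name__ == "__main__":
    # "ln_name" as the index name is hypothetical; the columns come from the mail above.
    print(force_index_select("centralauth", "localnames", "ln_name", ["ln_name", "ln_wiki"]))
```

The ORDER BY on the index columns is only there to nudge the optimizer toward scanning that index in order; whether the server really stays away from the primary key pages would have to be checked with EXPLAIN.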
Hello,
On Fri, 16 Nov 2012, Marlen Caemmerer wrote:
> as some of you might have noticed s7 is badly corrupted.
I finally redumped the s7 instance into a new s7. Unfortunately it still has no centralauth.localnames. We are still waiting for the WMF to send us a copy of s7, but for now I could start the replication again.
This will catch up in the next few days, and your dbs are at least writable again.
If anything is still missing, please tell me; I still have all the data.
DaB, please be so kind to restart the replication of wikidata.
Cheers nosy
s7 is still missing most of the databases (I see only the information schema).
When is the s7 database planned to be back with its data?
Eran
On Wed, Dec 5, 2012 at 2:26 PM, Marlen Caemmerer <marlen.caemmerer@wikimedia.de> wrote:
> I finally redumped the s7 instance into a new s7. [...]
_______________________________________________
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette
Oh no... I forgot to create the views that make you see the databases. Everything should be back to normal now.
On Thu, 6 Dec 2012, Eran Rosenthal wrote:
> Date: Thu, 6 Dec 2012 18:52:33
> From: Eran Rosenthal <eranroz89@gmail.com>
> To: Wikimedia Toolserver <toolserver-l@lists.wikimedia.org>
> Cc: dab@dabpunkt.eu, marlen.caemmerer@wikimedia.de
> Subject: Re: [Toolserver-l] Defective database s7
>
> s7 is still missing most of the databases (I see only the information schema).
> When is the s7 database planned to be back with its data?
> Eran
The tables/views are not yet accessible, and I can see only the information schema. (maybe because of missing permissions?)
On Thu, Dec 6, 2012 at 8:06 PM, Marlen Caemmerer <marlen.caemmerer@wikimedia.de> wrote:
> Oh no... I forgot to create the views that make you see the databases. Everything should be back to normal now.
Hello,
At Thursday 06 December 2012 22:08:57 DaB. wrote:
> DaB, please be so kind to restart the replication of wikidata.
the virtual MySQL instances (z-dat*) have no wikidata because of space constraints. However, z7 looks like it has enough space, and huwiki will be one of the first wikis with wikidata integration AFAIK, so I imported wikidata there and started replication.
Sincerely, DaB.