Hi,
I'm trying to go through some issues on JIRA and it keeps logging me out every few minutes.
At some point I even logged in, clicked an issue, clicked Edit (which uses AJAX), and then the Edit screen wouldn't load because I was not authenticated (even though I still saw my nickname at the top right).
-- Krinkle
Hello all,
Dispenser messaged me because the query
explain select * from enwiki_p.revision limit 1;
isn't working anymore. As far as I can see, that's caused by the recent MySQL
update. We need to patch this, but it may take a few days and another MySQL
restart (will be announced separately).
You can follow the progress at [1].
Just to let you know.
Sincerely,
DaB.
[1] https://jira.toolserver.org/browse/TS-1585
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Hello,
we will update the Solaris systems between 8pm and 10pm UTC on Sunday.
The servers (ortelius, willow, clematis, hawthorn, wolfsbane, damiana, turnera, ptolemy) may also be rebooted if recommended by the update.
Kind regards
Marlen / nosy
Hello,
MySQL needs a security update, and we have not applied any update for a while anyway.
This means we will update MySQL on Friday between 8pm and 10pm UTC.
During this time the databases will restart at least once.
Kind regards
Marlen/nosy
Hello,
I will move the s6 database instance to a SAN volume on
9th Dec 9pm UTC - 11pm UTC
It is quite possible that the move will take an hour or more.
Kind regards
Marlen
Hello,
as some of you might have noticed, s7 is badly corrupted.
Before I go into the details: we will most likely have to set up s7 again due to an InnoDB failure, and I need to find a workaround until we have the data in place, which can take days or even weeks.
Now the details. Perhaps some of you with database knowledge have an idea, or can comment on my idea for a workaround.
The replication failed several times in the past days and I did not know why.
I simply skipped the slave query when it failed and then replication ran again.
Today the database process restarted repeatedly, without any slave query failing.
I had a close look and came to suspect that a broken transaction in the transaction log made it break.
So I stopped the transaction from being replayed into the database on restart by setting innodb_force_recovery = 3.
Ok, fine. MySQL then starts. Cool. But the slave process won't run in this mode, so we don't get the new data. Hm.
So I tried to throw away the broken transaction.
I moved the InnoDB log files (ib_logfile*) away and started MySQL again.
MySQL then failed to come up, telling me:
121116 11:40:28 InnoDB: Error: page 7 log sequence number 270 492619208
InnoDB: is in the future! Current system log sequence number 268 2967383564.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/forcing-innodb-recovery.html
Ok. MySQL does not come up then and repeatedly restarts. No luck. Copied the log files back. Fine. Works again.
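As an aside on the message above: the two numbers printed for each log sequence number can be read as the high and low 32-bit halves of one 64-bit LSN. That reading is an assumption about this MySQL 5.1-era log format, and the helper names below are hypothetical. Under that assumption, a minimal Python sketch shows how far the page is ahead of the log:

```python
import re

# Assumption: this log format prints an LSN as "<high 32 bits> <low 32 bits>".
LSN_PATTERN = re.compile(r"log sequence number (\d+) (\d+)")


def lsn_to_int(high: int, low: int) -> int:
    """Combine the two printed 32-bit halves into one 64-bit LSN."""
    return (high << 32) | low


def parse_lsns(log_text: str) -> list:
    """Return every LSN found in the log text, in order of appearance."""
    return [lsn_to_int(int(h), int(l)) for h, l in LSN_PATTERN.findall(log_text)]


# The two relevant lines from the error log quoted above.
log_text = (
    "121116 11:40:28 InnoDB: Error: page 7 log sequence number 270 492619208\n"
    "InnoDB: is in the future! Current system log sequence number 268 2967383564.\n"
)

page_lsn, system_lsn = parse_lsns(log_text)
gap = page_lsn - system_lsn
print(f"page LSN   = {page_lsn}")
print(f"system LSN = {system_lsn}")
print(f"page is ahead of the log by {gap}")
```

With the values from the log the gap comes out at 6115170236, i.e. the data page claims to be far ahead of where the freshly created log files start. That would be consistent with MySQL only starting again once the old log files were copied back.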
Now I tried several things to check which table might be corrupted.
innochecksum reported everything as fine.
Mysqlcheck crashed the mysql daemon when accessing centralauth.localnames.
Oh? Why? Checking the table again crashes mysql. Hm.
Tried a repair table - "storage engine does not support this"...hm.
The log says
InnoDB: Page lsn 268 3672100478, low 4 bytes of lsn at page end 3672100478
InnoDB: Page number (if stored to page already) 192520,
InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 428
InnoDB: Page may be an index page where index id is 0 1174
InnoDB: (index "PRIMARY" of table "centralauth"."localnames")
InnoDB: Error in page 192520 of index "PRIMARY" of table "centralauth"."localnames"
121116 13:02:46 - mysqld got signal 11 ;
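The "Page lsn" line in this log prints the LSN from the page header next to the low 4 bytes of the LSN stored at the page end. A small Python sketch checks whether the two copies agree. It assumes the LSN is printed as a high and a low 32-bit half, and the function name is hypothetical. A mismatch would usually point to a torn or half-written page:

```python
import re

# Assumption: the log line has the form
# "Page lsn <high> <low>, low 4 bytes of lsn at page end <low_at_end>".
PAGE_LSN_PATTERN = re.compile(
    r"Page lsn (\d+) (\d+), low 4 bytes of lsn at page end (\d+)"
)


def header_and_trailer_agree(line: str) -> bool:
    """Compare the low 32 bits of the header LSN with the copy at the page end."""
    match = PAGE_LSN_PATTERN.search(line)
    if match is None:
        raise ValueError("not a 'Page lsn' log line")
    _high, low, low_at_end = (int(group) for group in match.groups())
    return (low & 0xFFFFFFFF) == low_at_end


line = "InnoDB: Page lsn 268 3672100478, low 4 bytes of lsn at page end 3672100478"
print(header_and_trailer_agree(line))
```

Here both values are 3672100478, so the page does not look half-written. That fits with the checksum tool finding nothing wrong, and it may mean the damage is in the index contents rather than in the page framing, but this is only a guess from the log.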
I tried to remove the index to rebuild it but this does not work due to innodb_force_recovery = 3.
Mysqldump fails - crashes the mysql daemon too.
So I don't have any more ideas how to fix this error.
Now I thought: if we have to set up s7 again anyway, I could drop the table completely and start MySQL in normal mode so replication works again.
This would only mean s7 would lack this table until it is set up again.
What do you think about this?
Any more ideas?
Cheers
Marlen/nosy
I've noticed that one of SuggestBot's hourly jobs has stalled for the past
7 hours, stuck in the "qw" state. Usually it runs like clockwork. Is there
a problem with the SGE queues?
Regards,
Morten
Hello all,
I just got back from the general member meeting of Wikimedia Deutschland. As
you know I requested a decision about the future of the toolserver there. To
make it short: It didn't go as well as I had hoped. While the request itself
was accepted, it was changed in some important parts.
The main fear was that WMF could stop providing us with fresh dumps and/or
replication in the near future, making the toolserver more or less useless.
However, I learned from a participating WMF board member that no such board
decision exists.
My request was changed in the following way: The WMF has to tell WMDE within 6
months how Wikilabs can replace the toolserver completely, as promised.
If the answer is not satisfying, WMDE will develop a "Governance-Model" to
ensure the continuation of the toolserver. Different groups are invited into
this "Governance-Model" and it should be done by the end of 2013.
That sounds good at first glance, but there are 2 loopholes: Nobody defined
what "complete" or "satisfying" means. In my eyes Wikilabs cannot replace the
toolserver completely (in the sense that all tools can move there) and so the
answer can only be unsatisfying, but that's just a question of definition I
guess.
A second change was that the investment for the toolserver will be restricted
to the "necessary". While that is of course a matter of definition again I'm
sure that means "no new hardware if it can be avoided in any way".
To summarize this: In the best case we have to wait for 6 months until WMDE
officially learns that Wikilabs cannot replace us, then wait for another 6
months until they create their "Governance-Model", and in 2014 we get new
hardware.
In the worst case we wait for 6 months and then WMDE and WMF agree that
everything is ok and we will never get any new hardware and at some point the
TS will shut down (of course with the remaining tools that cannot be migrated
to Wikilabs).
I cannot imagine outcomes between these two cases, but I'm sure they exist. In
any case we will get no (or nearly no) new hardware in 2013 – so we have to
live with that.
One piece of good news is that the toolserver will get 3 new database-servers soon.
I have not decided yet if I will remain as root under these circumstances for
2013 – I will tell you my decision by next Sunday.
For now I will head to bed because I'm exhausted and disappointed. See you
tomorrow.
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885