Toolserver-l February 2013

toolserver-l@lists.wikimedia.org

36 participants
34 discussions

by Dr. Trigon

Hello!

There were 2 messages here during January reporting problems with cron. I am now noticing issues with my cronjobs too. Looking at [1], you can see that the strange behaviour started sometime between weeks 2 and 3 (mid-January). Is cron (the server) running out of memory again, or what is the issue here?

DaB, can you maybe give some hints here? Or someone else?

[1] http://munin.toolserver.org/Login/hawthorn/cron_jobs_sh.html

Thanks a lot and greetings!
DrTrigon

11 years, 2 months

Postmortem: Partial Toolserver-outage

by DaB.

Hello all,

large parts of the toolserver cluster were down or very slow over the last few hours. AFAIS it was a problem with the user-store or with rosemary (to which the user-store is physically connected). I rebooted rosemary, but the reboot revealed problems with its IPv6 address. I tried to fix that, which caused several more reboots.

Rosemary is now up and running, but the user-store is not available (it looks like Nosy just mounted it without updating the fstab file). So I was forced to remove the user-store everywhere (except on willow, because it needs a reboot to do that and a reboot is already scheduled for later today). I will try to find the partition for the user-store and mount it, but I do not have much hope (there are way too many devices to try). Just to be clear: no data is lost.

Munin will also be away, because its data is also mounted on that host. I fear that we have to wait for Nosy to recover before we get the user-store back.

tl;dr: TS had problems, user-store is away.

Sincerely,
DaB.

--
Userpage: [[:w:de:User:DaB.]] - PGP: 0x2d3ee2d42b255885

11 years, 2 months

Reboot of willow Monday

by DaB.

Hello all,

while I was killing some bot processes on willow to reduce the high load, I accidentally pressed return too early and killed a random number of processes. I restarted the system processes, but I am not sure that everything is completely right. Just to be sure, I hereby announce a reboot for tomorrow, Monday, 19:05 UTC. Willow will be away for a few minutes.

Please note that past experience shows that cron on Solaris does not start all processes after a reboot, so you should check after the reboot whether everything works.

Please also note that in a few minutes the new "no bots without SGE" rule ([1]) becomes active, so please make sure that your bot uses SGE, or I might disable it.

I have no idea how many user processes were killed, but I am sorry that it happened.

Sincerely,
DaB.

[1] http://lists.wikimedia.org/pipermail/toolserver-announce/2013-January/000557.html

--
Userpage: [[:w:de:User:DaB.]] - PGP: 0x2d3ee2d42b255885

11 years, 2 months

Save the date: Amsterdam Hackathon 2013 (May 24-26)

by Maarten Dammers

Hi everyone,

Wikimedia Nederland invites all developers to the Amsterdam Hackathon 2013. The hackathon is an opportunity for all Wikimedia community developers and sysadmins to come together, squash bugs and write great new features & tools. Unlike in previous years (2012, 2011, etc.), this hackathon won't be in Berlin, but in Amsterdam.

The event is open to a wide range of developers. We welcome both seasoned and new developers, as well as people working on MediaWiki, tools, pywikipedia, gadgets, extensions, templates, and so on.

It takes place from 24 to 26 May. If you'd like to attend, please save the date! There will be no entrance fee for the event itself, but registration is mandatory. A limited number of scholarships will be available; details will be provided ASAP.

We're currently finalizing the arrangements for the venue. When we're done with that, we'll open registration. Keep an eye on https://www.mediawiki.org/wiki/Amsterdam_Hackathon_2013 for updates!

Maarten

PS: Please spread the word.

11 years, 2 months

Split of s2/s5 (cassia)

by DaB.

Hello all,

for historical reasons s2 and s5 are together on one host (cassia). Because cassia is quite overloaded, the sharing will end soon and I will move s2 away.

For this I need your help, because s2 and s5 also share the user-databases and there is no hint as to which user-database is needed where. So if you use user-databases for joining with s2 (two!), please add the name of the user-database to [1] by Friday, 8 February, 18:00 UTC.

It will take only a few minutes to add your user-databases there, so please do it. If you do not add your user-databases there, your tools will break after the split, but of course that can be fixed later.

Sincerely,
DaB.

[1] https://wiki.toolserver.org/view/User:Dab/s2-userdatabaes

--
Userpage: [[:w:de:User:DaB.]] - PGP: 2B255885

11 years, 2 months

Replication stopped

by DaB.

Hello all,

around 3 o'clock UTC we lost the connection to amaranth, our server in Tampa which handles the connection to the WMF database servers. So far it is unclear whether it is a server problem or a connection problem. I have tried to reach the WMF techs, but there has been no response yet.

I will keep you updated by mail, because JIRA is also hosted on amaranth and is therefore down as well.

Sincerely,
DaB.

--
Userpage: [[:w:de:User:DaB.]] - PGP: 2B255885

11 years, 2 months

Adding more roots

by Tim Landscheidt

Hi,

https://jira.toolserver.org/browse/TS-1599 says, inter alia:

| This is not the first time I try to open this ticket at
| Jira. The last two times I tried, the message "You are not
| authorized to perform this operation. Please try to log in
| or sign up for an account. Close this dialog and press
| refresh in your browser" appeared and the whole ticket was
| gone and I got logged out automatically. wtf? It's really
| disappointing that there seems to be no way to restore the
| whole story I'd written. :-((

This is still the same issue as reported by Krinkle at http://permalink.gmane.org/gmane.org.wikimedia.toolserver/5506 in November. The backlog in JIRA for various other issues in the Toolserver project is quite impressive as well.

So we should add more roots: ideally, of course, Solaris/Linux bilinguals with 20+ years of HA and MySQL replication experience and lots of spare time on their hands; practically, any bright mind who can track down some bug, update the puppet configuration and care for all the other tidbits while documenting their work meticulously, so that the roots can focus on the more complicated stuff.

Silke, what are the requirements WMDE imposes on toolserver admins? Being of legal age in their country of citizenship, residence and Germany? Disclosing their identity to WMDE? Anything else?

Tim

11 years, 2 months

Sick

by Marlen Caemmerer

Hello,

I have become sick, as has my baby, with a high fever, and it seems the rest of the family will join. Yesterday I was already in hospital with the baby, but the treatments won't work unless I think them through myself. She has a resistant bacterium that the doctors wanted to treat with an antibiotic that will not work against it; fortunately I pushed the outpatient doctor to test for the bacterium some days ago. I feel feverish too, and we will now see whether there is a second kind of bacterium or whether it is "just" the one resistant kind.

This means I am offline now, or only seldom online. If toolserver is offline, please contact me by phone (a short message will work too); you can find the number in the wiki.

Please keep in mind that willow and nightshade are quite overloaded, and try to eliminate as much load as you can from your tools. Afais we have a lot of load from Iran; guys, please keep an eye on that.

What Tim wrote about another admin is exactly right.

Cheers
Marlen/nosy

11 years, 3 months

Mail forwarding not working

by Tim Landscheidt

Hi,

for those of you who have not seen TS-1553: mail forwarding seems to have stopped working. So if you haven't received the usual job reports that you were expecting, you might want to log in to all servers and check whether there is mail for you. You can query all servers with:

| for SERVER in clematis hawthorn nightshade ortelius willow wolfsbane yarrow; do
|   ssh $USER@$SERVER.toolserver.org ls -l /var/mail/$USER
| done

replacing $USER with your username.

Tim
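A possible variant of the loop in the message above, as a minimal sketch only: instead of listing each mailbox, it builds a command that tests whether the mailbox exists and is non-empty (`test -s`). The function names `mail_check_cmd` and `mail_check_all` are hypothetical helpers introduced here for illustration; the host list and the `/var/mail/$USER` path are taken from the message. The functions only print the commands, so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Hypothetical helper (not from the original message): print the ssh command
# that tests whether a user's mailbox on one Toolserver host is non-empty.
mail_check_cmd() {
    # $1 = login name, $2 = short host name (e.g. willow)
    printf 'ssh %s@%s.toolserver.org test -s /var/mail/%s\n' "$1" "$2" "$1"
}

# Print one check command per host, using the host list from the message.
# This is a dry run: the commands are only printed, never executed.
mail_check_all() {
    for server in clematis hawthorn nightshade ortelius willow wolfsbane yarrow; do
        mail_check_cmd "$1" "$server"
    done
}
```

Calling `mail_check_all yourname` prints seven commands. When one of them is run by hand, an exit status of 0 means the mailbox on that host exists and contains mail.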

11 years, 3 months

FYI: TS wikivoyage is missing a few pages

by Magnus Manske

https://jira.toolserver.org/browse/TS-1598

11 years, 3 months
