Hi all,
Does anyone know when we will be upgrading the Cloud replica DBs to
MariaDB 10.2? I am asking mainly because we are on 10.1.33, 10.3 is now
out, and 10.2 added support for CTEs (WITH statements), which is very
handy.
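For readers who have not used CTEs, here is a minimal sketch of what a WITH statement looks like. It runs against an in-memory SQLite database as a stand-in, since SQLite also supports CTEs; the `edits` table and its rows are made up for illustration and are not the replica schema.

```python
import sqlite3

# In-memory SQLite database standing in for a replica; table and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edits (user_name TEXT, page TEXT)")
conn.executemany(
    "INSERT INTO edits VALUES (?, ?)",
    [("A", "Foo"), ("A", "Bar"), ("B", "Foo"), ("A", "Baz")],
)

# The WITH clause names an intermediate result (per_user) that the main
# query then filters, instead of nesting a subquery in the FROM clause.
query = """
WITH per_user AS (
    SELECT user_name, COUNT(*) AS edit_count
    FROM edits
    GROUP BY user_name
)
SELECT user_name, edit_count
FROM per_user
WHERE edit_count >= 2
ORDER BY user_name
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The same WITH syntax should be accepted by MariaDB 10.2 and later, but not by 10.1.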
Thanks,
Huji
During the flurry of activity we had recently in diagnosing and fixing
problems with the shared ToolsDB MariaDB service [0], we made a
configuration change to place a hard limit on the maximum number of
simultaneous connections permitted for each user account [1][2].
The current limit is set at 20 concurrent connections. This should not
cause any problems for a typical webservice or single script using
ToolsDB, but tools making heavy use of ToolsDB may need to make some
adjustments.
As always, tool maintainers can seek advice on dealing with this limit
or other issues in Toolforge from the Toolforge administration team
and others in the community via our Freenode IRC channel
(#wikimedia-cloud), Phabricator tasks, and the
cloud(a)lists.wikimedia.org mailing list.
[0]: https://phabricator.wikimedia.org/T216208
[1]: https://phabricator.wikimedia.org/T216170
[2]: https://mariadb.com/kb/en/library/server-system-variables/#max_user_connect…
Bryan, on behalf of the Toolforge administration team
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Technical Engagement Boise, ID USA
irc: bd808 v:415.839.6885 x6855
_______________________________________________
Wikimedia Cloud Services announce mailing list
Cloud-announce(a)lists.wikimedia.org (formerly labs-announce(a)lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud-announce
Hello,
Unfortunately, we had some disk space issues in our NFS server that
required the NFS process to be restarted. This created all sorts of
issues in Toolforge and Cloud VPS which we are still recovering from.
If your tool is not working and wasn't restarted automatically, please
try to restart it manually (`webservice restart`).
We're truly sorry for the inconvenience. If you need help with a
particular tool, please contact us on #wikimedia-cloud.
Regards,
--
Giovanni Tirloni
Operations Engineer
Wikimedia Foundation
Reminder: Technical Advice IRC meeting this week **(Wednesday) 4-5 pm UTC**
on #wikimedia-tech.
Questions can be asked in English, Bulgarian, and Hungarian.
The Technical Advice IRC Meeting is a weekly support event for volunteer
developers. Every Wednesday, two full-time developers are available to help
you with all your questions about MediaWiki, gadgets, tools and more! This
can be anything from "how to get started" or "who would be the best
contact for X" to specific questions about your project.
If you know already what you would like to discuss or ask, please add your
topic to the next meeting:
https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
Hope to see you there!
Michi (for the Technical Advice IRC Meeting crew)
--
Michael F. Schönitzer
Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Tel. (030) 219 158 26-0
https://wikimedia.de
Our vision is a world in which all people can share in, use, and add to
humanity's knowledge. Help us make it happen!
https://spenden.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
(Society for the Promotion of Free Knowledge).
Registered in the register of associations of the district court
(Amtsgericht) Berlin-Charlottenburg under number 23855 B. Recognized as a
charitable organization by the Finanzamt für Körperschaften I Berlin, tax
number 27/029/42207.
This is an update on the ongoing problems with the toolsdb service. We are preparing to move to a new server, which is now a functioning replica of the toolsdb server. The first step is to restart the service in read-only mode, and then we will move the DNS. Expect writes to stop working and connections to drop. Once the DNS points to the new server, services that use this database will need to be restarted.
This will happen within the next hour unless unexpected issues or extra caution slow us down.
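Tools that want to cope with dropped connections more gracefully could wrap their database calls in a retry with backoff. The sketch below is an assumption about how a tool might do that, not something the switchover requires: `with_retries` and `flaky_query` are hypothetical names, and a real driver raises its own exception type rather than the built-in `ConnectionError` used here. The wrapped operation should open a fresh connection on each attempt so that the hostname is resolved again.

```python
import time


def with_retries(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on ConnectionError wait and retry with doubling delays."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the failure
            sleep(base_delay * 2 ** attempt)


# Hypothetical stand-in for "open a new connection and run a query":
# it fails twice, as if the server were restarting, then succeeds.
calls = {"count": 0}


def flaky_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("server went away")
    return "ok"


# Record the delays instead of really sleeping, to keep the demo fast.
delays = []
result = with_retries(flaky_query, sleep=delays.append)
print(result, delays)
```

Retrying does not help with writes while the server is in read-only mode, and a restart may still be needed afterwards.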
Brooke Storm
Operations Engineer
Wikimedia Cloud Services
bstorm(a)wikimedia.org
IRC: bstorm_
Hi,
Here is just a brief update on the status of Toolforge and CloudVPS as of today,
2019-02-16, along with some rough estimates and what to expect in the following
days. Keeping track of all the events we had this week may be difficult, because
there were several of them, heavily intermixed.
* CloudVPS suffered severe hardware issues this week [0]. We solved most of the
problems and added spare hardware [1] because our server capacity had been
severely reduced. This service should be mostly stable right now.
* Toolsdb (tools.db.svc.eqiad.wmflabs) is currently overloaded and suffering
from hardware errors. We are already working on a replacement for this service
[2]. Services depending on this database aren't working properly (like PAWS) and
Toolforge tools that use it are also affected.
An honest estimate is that services (especially Toolsdb) won't be fully
recovered until at least next Tuesday (2019-02-26).
Our current plans involve replacing the Toolsdb hardware with virtual machines
inside CloudVPS [3]. We are trying to be extra cautious to prevent data loss and
other problems usually associated with doing things in a rush.
Finally, I would like to mention that we are all well aware of the importance of
these services for the community and we are doing our best to get things fixed.
Thanks for your understanding and patience.
regards
[0] https://wikitech.wikimedia.org/wiki/Incident_documentation/20190213-cloudvps
[1] CloudVPS: drain and rebuild labvirt1009 as cloudvirt1009
https://phabricator.wikimedia.org/T216239
[2] ToolsDB overload and cleanup https://phabricator.wikimedia.org/T216208
[3] Replace labsdb100[4567] with instances on cloudvirt1019 and cloudvirt1020
https://phabricator.wikimedia.org/T193264
--
Arturo Borrero Gonzalez
Operations Engineer / Wikimedia Cloud Services
Wikimedia Foundation
Because bad things come in threes (I'm hoping it's threes and not
sevens) the server that hosts toolsdb is now also misbehaving. Brooke
just now disabled a troubled drive which may have resolved things, but
if the last few hours are any indication then the vast majority of
connection or query attempts are likely to fail until we have a better
solution in place.
We're working on multiple fronts, trying to diagnose and fix the primary
issue while also working to get new hardware online as a possible
replacement server. Neither of those things is likely to get done
until tomorrow, though, so Toolforge will be in pretty bad shape in the
meantime.
It has been a rough couple of days, but rest assured we're taking notes
about how to prevent outages like these in the future. Thank you for
your patience in the meantime!
-Andrew + the cloud team
Hi all,
I have long been managing some of my tools using non-interactive
provisioning scripts − historically using Shell [1], and increasingly
moving towards Ansible playbooks [2] [3].
Both methods boil down to:
* SSH onto bastion host
* `become tool`
* Execute steps: git pull, install dependencies, etc.
I have not always been able to fulfill my 'non-interactive' requirement.
For my projects which require Node dependencies, I did have to manually
drop into a shell (webservice --backend=kubernetes nodejs shell) in order
to run npm.
(I also had a bit of a hard time when trying to manage crontabs using
Ansible, as the `crontab` executable override seems to be doing all kinds
of magic − I kind of tinkered until I reached a "looks like it works!"
point ^_^)
As I was reading through the Trusty migration docs [4], I noticed a hint
that virtualenvs (for Python dependencies) should also be created in an
interactive container shell, and not from the bastion host [5].
Can someone help me with the following questions?
* Is it appropriate to create python virtual envs from the Bastion host?
* Is there a recommended way to execute inside a kubernetes container
remotely / in a non-interactive fashion (eg using a tool like Ansible)?
* In general, am I doing something fundamentally at odds with the Toolforge
environment with such configuration management?
Thanks!
[1] https://phabricator.wikimedia.org/diffusion/THER/browse/master/bin/build-py…
[2] https://phabricator.wikimedia.org/source/tool-wikiloves/browse/master/deplo…
[3] https://github.com/JeanFred/universalviewer-toolforge/blob/master/deploy/ma…
[4] https://wikitech.wikimedia.org/wiki/News/Toolforge_Trusty_deprecation#Rebui…
[5] https://phabricator.wikimedia.org/T214086#4890276
--
Jean-Fred
Today we have deployed an updated version of the webservicemonitor
service that we use to help ensure that `webservice
--backend=gridengine ...` processes are actively running on the job
grid. The main change in this new version is that we have implemented
tracking of the timestamp of past restart attempts for each tool and a
restart rate limit. The initial limit we have set for this is 3
restarts per 60-minute sliding window.
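To illustrate what a sliding-window limit means in practice, here is a minimal sketch of the idea. It is not the webservicemonitor code: the `RestartLimiter` class, its method names, and the in-memory storage are assumptions made for the example, and only the numbers (3 restarts per 3600 seconds) come from the announcement.

```python
from collections import deque


class RestartLimiter:
    """Allow at most max_restarts per tool within the last window_seconds."""

    def __init__(self, max_restarts=3, window_seconds=3600):
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self._history = {}  # tool name -> deque of past restart timestamps

    def allow_restart(self, tool, now):
        stamps = self._history.setdefault(tool, deque())
        # Drop attempts that have slid out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_restarts:
            return False  # limit reached, skip this restart
        stamps.append(now)
        return True


limiter = RestartLimiter()
# Timestamps in seconds: three restarts are allowed, the fourth (at 30 min)
# is refused, and one is allowed again once the first attempt is an hour old.
decisions = [
    limiter.allow_restart("example-tool", t) for t in (0, 600, 1200, 1800, 3600)
]
print(decisions)
```

Because the window slides, a tool regains one restart each time an old attempt ages past 60 minutes, rather than all three at once at a fixed reset time.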
This change will not stop a tool maintainer from running `webservice
restart` manually. You can read more of the reasoning behind the
change at <https://phabricator.wikimedia.org/T107878>.
Bryan
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Technical Engagement Boise, ID USA
irc: bd808 v:415.839.6885 x6855