👏 👏 👏
Sounds good for things like OCR scans and book generation, the latter
being pushed to external WMF cloud resources.
Thanks for your work in this space. Sounds as if this will give
extensions a lot more scope for interesting things.
-- billinghurst
------ Original Message ------
From: "Kunal Mehta" <legoktm(a)debian.org>
To: "wikitech-l" <wikitech-l(a)lists.wikimedia.org>
Sent: 6/10/2021 9:46:13 AM
Subject: [Wikitech-l] Score, Kubernetes and switching to Shellbox
Hi everyone,
tl;dr: External shell outs are now run via Shellbox. Any deployed code needs to use
Shellbox/BoxedCommand, and documentation is available to help migrate.
To safely re-enable Score (LilyPond) on Wikimedia wikis, we developed Shellbox, a way to
run shell commands in a remote, isolated container. This is (hopefully) a stronger level
of isolation than we previously had with firejail, since it's relying on Linux
containers and Kubernetes to do the isolation. At the same time, this helps us in moving
towards running MediaWiki on Kubernetes, as we don't want to include all these
external commands inside the MediaWiki container. For the most part, any new shelling out
to external commands needs to be done via Shellbox.
A lot of the design and rationale behind Shellbox is captured in the RfC:
<https://phabricator.wikimedia.org/T260330>.
In Wikimedia production, so far Score, Timeline, SyntaxHighlight and Wikidata constraint
regex checking are all using Shellbox. Details about that and links to dashboards are
available at <https://wikitech.wikimedia.org/wiki/Shellbox>. The main things that
are left are media-handling code that extracts metadata (DjVu, PdfHandler and
PagedTiffHandler), which is tracked at <https://phabricator.wikimedia.org/T289228>,
and videoscaling (TimedMediaHandler).
Some work has to be done in MediaWiki to make code compatible with Shellbox, specifically
switching to "BoxedCommand", which now has its own documentation page:
<https://www.mediawiki.org/wiki/Manual:BoxedCommand>. BoxedCommand works
transparently whether you have a separate Shellbox service set up or not. This is the
preferred way to write new shellouts going forward, though Shell::command() isn't
officially deprecated yet. So far all shellouts that are used in Wikimedia production have
already been converted except for TimedMediaHandler.
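To illustrate the "works transparently" point, here is a minimal sketch in Python of the
general idea. It is not MediaWiki's PHP API: the class name, method names and the
remote_runner hook are all hypothetical, chosen only for illustration. The caller
describes the command, its input files and the output files it wants back, and the same
description is either run locally in a temporary directory or handed to a remote service.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_locally(argv, input_files, output_names):
    """Run argv in a fresh temporary directory and collect the requested output files."""
    with tempfile.TemporaryDirectory() as workdir:
        for name, content in input_files.items():
            Path(workdir, name).write_text(content)
        proc = subprocess.run(argv, cwd=workdir, capture_output=True, text=True)
        outputs = {}
        for name in output_names:
            path = Path(workdir, name)
            if path.exists():
                outputs[name] = path.read_text()
        return {"exit_code": proc.returncode, "stdout": proc.stdout, "outputs": outputs}


class BoxedCommandSketch:
    """Hypothetical illustration only; not the real BoxedCommand interface."""

    def __init__(self, argv, remote_runner=None):
        self.argv = list(argv)
        # None stands for "no separate service configured": fall back to local execution.
        self.remote_runner = remote_runner
        self.input_files = {}
        self.output_names = []

    def input_file_from_string(self, name, content):
        self.input_files[name] = content
        return self

    def output_file_to_string(self, name):
        self.output_names.append(name)
        return self

    def execute(self):
        # The caller's code is identical in both cases; only the backend differs.
        runner = self.remote_runner or run_locally
        return runner(self.argv, self.input_files, self.output_names)


if __name__ == "__main__":
    # Toy command: upper-case in.txt into out.txt using the current Python interpreter.
    script = "open('out.txt', 'w').write(open('in.txt').read().upper())"
    result = (
        BoxedCommandSketch([sys.executable, "-c", script])
        .input_file_from_string("in.txt", "hello")
        .output_file_to_string("out.txt")
        .execute()
    )
    print(result["exit_code"], result["outputs"])
```

Because the command only ever sees files that were explicitly passed in, the same
description can be shipped to another machine or container without the caller changing.
For the real interface, see the Manual:BoxedCommand page linked above.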
Looking forward, I think this also gives us a lot of flexibility in using more external
commands in the future. First, we're less tied to whatever OS version MediaWiki is
running on: as long as a command can be built and shipped in a container, we can use it.
Second, it's probably OK if external commands aren't super well behaved (e.g.
use too much memory) since they're no longer sharing the same resources as an
appserver (this shouldn't be interpreted as a free pass for super inefficient stuff of
course).
I tried to keep this summary short, and am intending to write a longer blog post that
explains some more history in detail. But if you have any questions or something isn't
clear, please ask!
-- Kunal