Hi Taavi, thanks for your reply, it’s super helpful! I’ll give Cloud VPS a try.

> Documentation about the Ceph cluster powering Cloud VPS is on a separate Wikitech page:
> <https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Ceph>.

Curious, if Cloud VPS already has a working Ceph cluster, might it be possible to run Ceph’s built-in object store?

— Sascha

On Wed, 22 Dec 2021 at 18:41, Taavi Väänänen <hi@taavi.wtf> wrote:

On 12/22/21 18:29, Sascha Brawer wrote:
> What storage options does the Wikimedia cloud have? Can external
> developers (i.e. people not employed by the Wikimedia Foundation) write
> to Cinder and/or Swift? Either from Toolforge or from Cloud VPS?

I've left more detailed replies inline. tl;dr: Currently Toolforge
doesn't really have any other options than NFS. Cloud VPS additionally
gives you the option to use Cinder (extra volumes you can attach to a VM
and move from a VM to another).

> See below for context. (Actually, is this the right list, or should I
> ask elsewhere?)
> For Wikidata QRank <https://qrank.toolforge.org/>, I run a cronjob on the Toolforge
> Kubernetes cluster. The cronjob mainly works on Wikidata dumps and
> anonymized Wikimedia access logs, which it reads from the NFS-mounted
> /public/dumps/public directory. Currently, the job produces 40
> internal files with a total size of 21G; these files need to be
> preserved between individual cronjob runs. (In a forthcoming version of
> the cronjob, this will grow to ~200 files with a total size of ~40G).
> For storing these intermediate files, Cinder might be a good solution.
> However, afaik Cinder isn’t available on Toolforge. Therefore, I’m
> currently storing the intermediate files in the account’s home directory
> on NFS. I suspect (though I’m only speculating, having seen NFS
> crumble elsewhere) that Wikimedia’s NFS server is easily overloaded;
> in any case, Wikimedia’s NFS server seems to protect itself by
> throttling access. Because of the throttling, the cronjob is slow when
> working with its intermediate files.
> * Will Cinder be made available to Toolforge users? When?

We're interested in it, but no-one has time or interest to work on
making it a reality yet. This is tracked on Phabricator:

As a reminder: if anyone is interested in working on this or other parts
of the WMCS infrastructure, please talk to us!

> * Or should I move from Toolforge to Cloud-VPS, so I can store my
> intermediate files on Cinder?

~40G is in the range where Cinder/Cloud VPS might indeed be a better
solution than NFS. While we don't currently have any official numbers on
what is acceptable on NFS and what's not, for context the Toolforge
project NFS cluster currently has about 8T of storage for about 3,000 tools.
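
For a rough sense of scale, the two figures above can be turned into an
average share per tool. This is only a back-of-the-envelope sketch: it
assumes "8T" means decimal terabytes, and the result is an average, not
a quota or an official limit.

```python
# Approximate figures quoted above; not official quotas.
TOTAL_NFS_GB = 8 * 1000   # "about 8T", assumed to be decimal terabytes
TOOL_COUNT = 3000         # "about 3,000 tools"
JOB_GB = 40               # planned size of the intermediate files

# Average NFS space per tool if storage were split evenly.
average_share_gb = TOTAL_NFS_GB / TOOL_COUNT

# How many "average tools" a 40G working set corresponds to.
ratio = JOB_GB / average_share_gb

print(f"average share per tool: {average_share_gb:.1f}G")
print(f"a {JOB_GB}G tool uses about {ratio:.0f}x the average share")
```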

> * Or should I store my intermediate files in some object storage? Swift?
> Ceph? Something else?

WMCS currently doesn't offer direct access to any object storage
service. This is something we're likely to work on in the mid-term (next
6-12 months is the last estimate I've heard). This project is currently
stalled on some network design work:

> * Is access to Cinder and Swift subject to the same throttling as
> NFS? Or will moving away from NFS increase the available I/O throughput?

No, NFS is subject to completely separate throttling, and Ceph-backed
storage methods (local VM disks and Cinder volumes) have much more
bandwidth available.

> The final output of the QRank system is a single file, currently ~100M
> in size but eventually growing to ~1G. When the cronjob has computed a
> fresh version of its output, it deletes any old outputs from previous
> runs (with the exception of the two most recent previous versions, which are
> kept around internally for debugging). Typical users are other bots or
> external pipelines who need a signal for prioritizing Wikidata entities,
> not end users on the web. Users typically check for updates with HTTP
> HEAD, or with conditional HTTP GET requests (using the standard
> If-Modified-Since and If-None-Match headers). Currently, I’m serving the
> output file with a custom-written HTTP server that runs as a web service
> on Toolforge behind Toolforge’s nginx instance. My server reads its
> content from the NFS-mounted home directory that’s getting populated by
> the cronjob. Now, it’s not exactly a great idea to serve large data
> files from NFS, but afaik it’s the only option available in the
> Wikimedia cloud, at least for Toolforge users. Of course I might be wrong.
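
The conditional requests described above (If-None-Match and
If-Modified-Since) can be evaluated on the server side roughly as
follows. This is a minimal sketch, not the code of the QRank server:
the function name, the lower-case header dict, and the example ETag
values are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def _strip_weak(tag):
    """Drop a leading weak-validator prefix (W/) from an entity tag."""
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers, etag, last_modified):
    """Return True if the request may be answered with 304 Not Modified.

    request_headers: dict with lower-case header names (hypothetical shape).
    etag: current entity tag of the file, including the double quotes.
    last_modified: timezone-aware datetime of the current file version.
    """
    if_none_match = request_headers.get("if-none-match")
    if if_none_match is not None:
        # When If-None-Match is present, If-Modified-Since is ignored.
        candidates = [_strip_weak(t.strip()) for t in if_none_match.split(",")]
        return "*" in candidates or _strip_weak(etag) in candidates

    if_modified_since = request_headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparseable date: treat the header as absent
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates only have one-second resolution.
        return last_modified.replace(microsecond=0) <= since

    return False


# Example: a client that already holds the current version.
current = datetime(2021, 12, 22, 18, 41, tzinfo=timezone.utc)
headers = {"if-modified-since": format_datetime(current, usegmt=True)}
print(is_not_modified(headers, '"v2"', current))
```

A client polling for updates would send the ETag or Last-Modified value
from its previous response, and only download the file again when the
server does not answer 304.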

> * Should I move from Toolforge to Cloud-VPS, so I can serve my final
> output files from Cinder instead of NFS?
> * Or should I rather store my final output files in some object storage?
> Swift? Ceph? Something else?
> * Or is NFS just fine, even if the size of my data grows from 100M to 1G+?

When we offer object storage, yes, storing your files in it is a good
idea. I think you should be fine with NFS for now (please don't quote me on
that). Cloud VPS is an option too if you prefer it.

> The cronjob also uses ~5G of temporary files in /tmp, which it deletes
> towards the end of each run. The temp files are used for external
> sorting, so all access is sequential. I’m not sure where these temporary
> files currently sit when running on Toolforge Kubernetes. Given their
> volume, I presume that the tmpfs of the Kubernetes nodes will eventually
> run out of memory and then fall back to disk, but I wouldn’t know how to
> find this out. _If_ the backing store disk for tmpfs eventually ends up
> being mounted on NFS, it sounds wasteful for the poor NFS
> server, especially since the files get deleted at job completion. In
> that case, I’d love to save common resources by using a local disk. (It
> doesn’t have to be an SSD; a spinning hard drive would be fine, given
> the sequential access pattern). But I’m not sure how to set this up on
> Toolforge Kubernetes, and I couldn’t find docs on wikitech. Actually,
> this might be a micro-optimization, so perhaps not worth the trouble.
> But then, I’d like to be nice with the precious shared resources in the
> Wikimedia cloud.

Good question, I'm not sure either if tmpfs for Kubernetes containers is
on Ceph (SSDs) or on RAM. At least it's not on NFS.
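
One way to find out from inside the container is to look up which mount
contains /tmp and what filesystem type it has. The following is a
minimal sketch, assuming a Linux container where /proc/mounts is
readable; the function names are made up for this example, and mount
points containing spaces (escaped in /proc/mounts) are not handled.

```python
import os


def backing_mount(path, mounts_text):
    """Return (mount_point, fs_type) of the mount that contains `path`.

    mounts_text is the content of /proc/mounts: one mount per line with
    whitespace-separated fields "device mount_point fs_type options ...".
    Returns None if no mount matches.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # The longest matching mount point is the most specific one;
            # on ties, a later entry shadows an earlier one.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def describe_tmp(mounts_file="/proc/mounts"):
    """Look up the mount backing /tmp on the current machine."""
    with open(mounts_file) as f:
        return backing_mount("/tmp", f.read())
```

Calling describe_tmp() from within the cronjob's container would return
something like a mount point and a type: "tmpfs" would mean the files
live in RAM, "nfs" or "nfs4" would mean NFS, and types such as
"overlay" or "ext4" would point to a disk attached to the node.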

> Sorry that I couldn’t find the answers online. While searching, I came
> across the following pointers:
> – <https://wikitech.wikimedia.org/wiki/Ceph>: This page has a warning that
> it’s probably “no longer true”. If the warning is correct, perhaps
> the page could be deleted entirely? Or maybe it could link to the
> current docs?
> – <https://wikitech.wikimedia.org/wiki/Swift>: This sounds perfect, but
> the page doesn’t mention how the files are getting populated, how the
> ACLs are managed, and if Wikimedia’s Swift cluster is even accessible to
> external developers.
> – <https://wikitech.wikimedia.org/wiki/Media_storage>: This seems
> current (I guess?), but the page doesn’t mention if/how external
> Toolforge/Cloud-VPS users may upload objects, or if this is just for the
> current users.

Those pages document the media storage systems used to store uploads for
the production MediaWiki projects (Wikipedia and friends). Those are not
accessible from WMCS and should be treated as completely separate
systems, and any future WMCS (object) storage services will not use them.

Documentation about the Ceph cluster powering Cloud VPS is on a separate
Wikitech page:
<https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Ceph>.

Taavi (User:Majavah)
volunteer Toolforge/Cloud VPS admin