[Labs-l] Getting started with Wikimedia Labs, especially for bot authors

Ryan Lane rlane at wikimedia.org
Fri Jul 20 21:25:10 UTC 2012


> * Wikimedia Labs does not have Toolserver-style database replication
> right now.  That is on the WMF to-do list but there is no predicted date
> in the roadmap at https://www.mediawiki.org/wiki/Wikimedia_Labs .  Given
> what TParis and Brad say below, that's a prerequisite to the success of
> Tool Labs.

This isn't accurate. It's stated directly in the engineering goals for the year:

https://www.mediawiki.org/wiki/Wikimedia_Engineering/2012-13_Goals#Wikimedia_Labs

> * It seems like Ops would be interested in having some bots run on the
> WMF cluster, after sufficient code and design review. (I infer this from
> the 5th step mentioned below and from my conversations with Daniel Zahn
> and Ryan Lane.)  I don't think there's currently a process for this sort
> of promotion -- I suppose it would work a little like
> https://www.mediawiki.org/wiki/Writing_an_extension_for_deployment .
>

This is not the case. We don't have plans to run bots on production.
We plan on providing more than enough resources to run bots reliably
in Labs, where they can be properly community maintained.

> Hi Sumana,
>
> Here is my report on the Labs discussion. It turns out that in a way we
> weren't thinking big enough, or in another way we were thinking too big.
> What Ops (as represented by Daniel Zahn) would prefer as far as
> Wikipedia-editing bots would be one Labs project managed by someone (or
> some group) who isn't Ops, and that person/group would take care of
> giving people accounts within that project to run bots. I guess the
> general model is vaguely along the lines of how the Toolserver is run by
> WMDE and they give accounts to people. I may have found myself a major
> project here. ;)
>

We have a bots project already. Petr Bena and some other folks already
maintain it as a community.

> But I did also get detail on how to get yourself a project on Labs, if
> you have reason to, and some clarification on how the whole thing works.
> In light of the above it might not be that useful for someone wanting to
> move a bot, but in general it's still useful.
>

In general you shouldn't get a brand new project for running a bot. It
should run in the bots project, like the other currently running bots.

> Step one is to get a Labs/Gerrit account via
> https://www.mediawiki.org/wiki/Developer_access .
>

Yes, this is for sure a requirement :).

> Step two is to put together a proposal (i.e. a paragraph or two saying
> what it is you plan to do) and do one of the following:
> 1. Go on IRC (I guess #wikimedia-labs or #wikimedia-dev or #mediawiki),
>    find one of the Ops people, and pitch your proposal.

Not necessary. If it's a bot that's approved to run on one of the
sites, and the code is open source, it should run in the bots project.
You should talk with the bots project community members:

https://labsconsole.wikimedia.org/wiki/Nova_Resource:Bots

> 2. Go on the labs-l mailing list and make your proposal.

Asking about how to get your bot added via the mailing list is a good
way to go about things.

> 3. File it as an "enhancement request" bug in Bugzilla.

Also a viable option. Bugs are slightly more cumbersome, but they at
least provide some means of tracking :).

> 4. Talk to an Ops person in some other way, e.g. on their talk page or
>    via email.
> (You should probably try to get Ops as a whole to decide which of the
> above they'd really prefer, or if they'd prefer using [[mw:Developer
> access]] or something like it instead.) If they decide they like your
> project, which Daniel says the default is currently "yes" unless they
> have a reason to say "no", they'll set it up with you as an admin over
> that project.
>

Ops isn't really making a yes/no decision on bots, unless a bot is
going to use up incredible amounts of resources (especially storage).
The bots community members will likely be more help than Ops.

> Step three then is to go log into labsconsole, and you can create a new
> "instance" (virtual server) within your project. I'm told there are
> various generic configurations to choose from, that vary mainly in how
> much virtual RAM and disk space they have. Then you can log into your
> instance and configure it however you need. I guess it will be running
> Ubuntu, based on what I was told about how the Puppet configuration
> works.
>

Bots share instances, for the most part. Not every bot uses enough
resources to really need its own instance. The end goal is to have a
bots cluster, with a control interface for moving bots between
instances (including new ones), so that we can fix resource issues.

The bots community members decide when a new instance should be
created, and where bots should be placed.

> At this level, it is either possible or on the road map to get various
> helpful services:
> * Public IP addresses. [possible right now]

Bots generally don't need public IPs.

> * Ability to have a Git repository automagically cloned, turned into a
>   deb package, and installed in a private Wikimedia-specific Ubuntu
>   package repository (which can then be installed from). [possible now]

Yes. This is possible, and it would be *awesome* to have the bots
turned into Debian packages. We'll need to work out how repos will
work for bots; it's not an amazingly easy problem. Debian packages may
not be necessary if we have a sane deployment system, though.
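As a sketch of what's involved, a minimal debian/control for a
hypothetical bot package might look something like this (all names
invented for the example):

```
Source: examplebot
Section: misc
Priority: optional
Maintainer: Example Maintainer <example@example.org>
Build-Depends: debhelper (>= 8)
Standards-Version: 3.9.3

Package: examplebot
Architecture: all
Depends: ${misc:Depends}, python
Description: example Wikipedia maintenance bot
 Placeholder packaging for a bot running in the bots project.
```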

> * A UDP feed of recent changes, similar to the feed available over IRC
>   from [[meta:IRC/Channels#Recent changes]] but without having to mess
>   with IRC. In fact, it's the same data that is used to generate those
>   IRC feeds. [roadmap?]

This is likely not terribly easy. Can you add a proposal for this?
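For concreteness, consuming such a feed might look roughly like this;
the port is an assumption, and the line format would be whatever the
IRC channels carry (mIRC formatting codes included):

```python
import re
import socket

# mIRC color/formatting codes, as seen in the IRC recent-changes feeds.
FORMATTING = re.compile(r"\x03\d{0,2}(?:,\d{1,2})?|[\x02\x0f\x16\x1d\x1f]")

def strip_formatting(line):
    """Strip mIRC formatting codes from a recent-changes line."""
    return FORMATTING.sub("", line)

def listen(port=9999):
    """Print changes from a hypothetical UDP feed bound to the given port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    while True:
        data, _ = sock.recvfrom(4096)
        print(strip_formatting(data.decode("utf-8", "replace")))
```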

> * Access to a replicated database something like they have on the
>   Toolserver (i.e. no access to private data). [roadmap]

On the roadmap. This is currently scheduled for January, but the
schedule might be accelerated.

> * Other stuff? I get the impression Ops is open to ideas to make Labs
>   more useful to people.
>

We have lots of ideas. They are in the goals and on the roadmap. Right
now we're in a stability cycle, though.

> BTW, I mentioned Labs to TParis and the rest of the guys over here, and
> they say that a replicated copy of the database like the Toolserver has
> is absolutely essential before most people could move their stuff from
> the Toolserver to Labs.
>
> As far as resources go,
> * Each Labs project can request one or more public IP addresses to
>   assign to their instances. More than one public IP needs a good
>   reason; maybe that will change when IPv6 is fully supported. Instances
>   cannot share IP addresses. But all instances (from all projects) are
>   on a VLAN, so it should be possible to set up one instance as a
>   gateway to NAT the rest of the instances. No one has bothered to
>   actually do this yet. Note that accessing Wikipedia requires a public
>   IP; there is no "back door".

In general bots won't get public IPs at all. Access to the bots
infrastructure is available via the bastion host.
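For example, a ~/.ssh/config stanza along these lines routes instance
logins through the bastion (the hostname and domain here are
illustrative; check labsconsole for the real ones):

```
Host bastion
    HostName bastion.wmflabs.org
    User yourshellname

Host *.wmflabs
    User yourshellname
    ProxyCommand ssh -W %h:%p bastion
```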

> * Each Labs project has a limit on the number of open files (or
>   something like that) at any one time. No one has ever had to change
>   the current default limit on this, yet.

This is only true to the limit of the OS.
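To see what that OS limit actually is from inside an instance, e.g.
with Python's standard library:

```python
import resource

# Per-process cap on open file descriptors; a process can raise its
# soft limit itself, up to the hard limit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open files: soft=%d hard=%d" % (soft, hard))
```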

> * There is something of a limit on disk I/O, e.g., if someone has a lot
>   of instances all accessing terabyte-sized files all the time, that
>   would blow things up. This might be a global limitation on all
>   projects.

From a resources POV, one instance/project can affect all projects if
they hit the resources very hard. There's no explicit limit.

> * I guess most other resources are limited on a per-instance basis. For
>   example, disk space available and RAM available.
>

Yep. Per instance.

> You *could* stop here, although Ops would really like you to continue on
> to step 4.
>
> Step four, once you've gotten your instance configuration figured out
> and working, is to write a Puppet configuration that basically instructs
> Puppet how to duplicate your setup from scratch. (How to do this will
> probably turn out to be a whole howto on its own, once someone who has
> time to write it up figures out the details.) At a high level, it
> consists of stuff like which Ubuntu packages to install, which extra
> files (also stored in the Puppet config in Git) to copy in, and so on.
>

This is ideal, but not required for every user. I'd really prefer to
have a community of people who can/will do puppetization and packaging
of bots.
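As a rough sketch (class, package, and file names are all invented
for the example), puppetizing a bot mostly amounts to declaring its
packages, config files, and service:

```puppet
# Hypothetical manifest: install dependencies and keep the bot running.
class bots::examplebot {
    package { 'python':
        ensure => present,
    }

    file { '/etc/examplebot.conf':
        ensure => file,
        source => 'puppet:///modules/bots/examplebot.conf',
    }

    service { 'examplebot':
        ensure  => running,
        require => File['/etc/examplebot.conf'],
    }
}
```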

> This puppet configuration needs to get put into Git/Gerrit somewhere.
> And then Ops or someone else with the ability to "+2" will have to
> approve it, and then you can use it just like the generic configs back
> in Step 3 to create more instances. If you need to change the
> configuration, same deal with pushing the new version to Gerrit and
> having someone "+2" it.
>

Depending on how we do this, there's no need for Ops to be in the
chain, at least for Debian packages or bot code. Puppetization will
always require a +2 from Ops, though.

> The major advantage of getting things Puppetized is that you can clone
> your instance at the click of a button, and if the virtual-server
> equivalent of a hard drive crash were to occur you could be up and
> running on a new instance pretty much instantly. And handing
> administration of the project on to someone else would be easier,
> because everything would be documented instead of being a random
> mish-mash that no one really knows how it is set up. It's also a
> requirement for going on to production.
>

Yes. Puppetization is definitely the way to go. It makes everything
way easier. As mentioned earlier, though, we have no plans to move
bots to production.

> This is not necessarily an all-or-nothing thing. For example, the list
> of packages installed and configuration file settings could be
> puppetized while you still log in to update your bot code manually
> instead of having that managed through Puppet. But I guess it would need
> to be fully completed before you could move to production.
>

Yep. We don't really have any great process for this defined right
now. It would be great to have a deployment system for bots. I'm
working on a new deployment system right now. It may be possible to
use it for this.

> Step five, if you want, is to move from labs to production. The major
> advantages are the possibility to get even more resources (maybe even a
> dedicated server rather than a virtual one) and the possibility to get
> access to private data from the MediaWiki databases. I guess at this
> point all configuration changes would have to go through updating the
> Puppet configuration via Gerrit.
>
> I don't know if you were listening when we were discussing downtime, but
> it doesn't seem to be that much of a concern on Labs as long as you're
> not running something that will bring all of Wikipedia to a halt if it
> goes down for an hour or two. Compare this to how the Toolserver was
> effectively down for almost three weeks at the end of March, and
> everyone survived somehow (see [[Wikipedia:Village pump
> (technical)/Archive 98#Toolserver replication lag]]).
>

Well, bots are considered "semi-production". We very much hope not to
have an extended downtime, but the infrastructure isn't funded for the
level of redundancy and availability we have in production. There's
definitely a possibility of occasional full-day downtimes, depending
on what kind of failure occurs. In some cases (like the loss of a
compute node) it's possible that specific instances could be
inaccessible for a few days.

Of course one of the biggest reasons for puppetization is avoiding
situations like this.

- Ryan


