On Sep 25, 2012, at 11:12 PM, "DaB." <WP(a)daniel.baur4.info> wrote:
Hello Erik,
At Tuesday 25 September 2012 22:24:33 DaB. wrote:
The initial focus for Labs has been to provide functionality that
toolserver doesn't - get root on a VM or set of VMs to install/test
arbitrary software/services, and get it ready for production
deployment.
It is nice to have root on a (virtual) machine, but I doubt most tools need it
[..]
Indeed. However, root access is crucial for Labs because its scope is much
wider than Toolserver's.
For example, in the "deployment" project we simulate nearly the entire WMF
production cluster (including db hosts, apaches, squids, varnish, scalers,
etc.).
This makes one of the very different goals of Labs possible, namely to allow
volunteers to contribute to operations (as opposed to the software we run).
Once everything is puppetized, one can basically create a new Labs project,
use "wmf-production" as a template, and instantiate a complete WMF cluster. It
would not hold all the database contents, just the server setup with
sufficient sample data; the purpose is to simulate the servers in order to
develop new configurations, not to serve as a web site. Give it a subdomain
and you'd immediately have hosts like
commons.wikimedia.myproject.wmflabs.org.
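
To illustrate the naming scheme, here is a minimal sketch in Python. The
function name and the mapping rule (drop the top-level domain, then append the
project name and "wmflabs.org") are my assumptions, inferred from the single
example above; this is not an existing Labs tool.

```python
def labs_hostname(production_host: str, project: str,
                  labs_domain: str = "wmflabs.org") -> str:
    """Map a production hostname to a hostname inside a Labs project.

    Hypothetical helper: the rule is inferred from the example
    commons.wikimedia.org -> commons.wikimedia.myproject.wmflabs.org.
    """
    labels = production_host.lower().rstrip(".").split(".")
    if len(labels) < 2:
        raise ValueError(f"expected a fully qualified hostname, got {production_host!r}")
    # Drop the top-level domain, then append the project and the Labs domain.
    return ".".join(labels[:-1] + [project, labs_domain])


if __name__ == "__main__":
    print(labs_hostname("commons.wikimedia.org", "myproject"))
```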
Back to the subject: does that mean users will have to learn to manage a VM and
need a public IP and subdomain? No, not at all. We're confusing Dev Labs with
Tool Labs (perhaps we shouldn't name them as if they were isolated projects).
The implementation of Tool Labs isn't decided on afaik, but I believe it will
naturally solve itself by being distributed among various projects. Behind the
scenes it will likely be a regular Labs project, but abstracted for users (e.g.
not an instance-group or even an instance per tool, but everything in one
instance-group, with groups of servers for different purposes, just as
Toolserver has web servers, SQL servers, and login/application servers).
E.g. the tools project in wmflabs would have various web servers and application
servers[1]. Users wanting to run queries, bots and long-running/periodic processes
would use the application servers. Ideally we'd encourage use of SGE (or
something similar) from the beginning so that the application servers are
optimally used. It would also make it easy for a process on a web server to
start a background process on an application server.
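
As a rough sketch of what that could look like from a web tool, the Python
below builds an SGE "qsub" command line for a background job. The helper name,
the job name and the log directory are made up for illustration; nothing like
this is set up in Labs yet. The only qsub options used are -N (job name), -b y
(treat the command as a binary rather than a job script), and -o / -e (stdout
and stderr files).

```python
import shlex


def build_qsub_command(job_name, command, log_dir="/tmp"):
    """Return the argv list for submitting `command` as an SGE job.

    Hypothetical helper: it only builds the command line and does not
    submit anything.
    """
    if not command:
        raise ValueError("command must contain at least the program to run")
    return [
        "qsub",
        "-N", job_name,                     # name shown in the queue
        "-b", "y",                          # command is a binary, not a job script
        "-o", f"{log_dir}/{job_name}.out",  # where the job's stdout goes
        "-e", f"{log_dir}/{job_name}.err",  # where the job's stderr goes
        *command,
    ]


if __name__ == "__main__":
    argv = build_qsub_command("mybot", ["python", "bot.py", "--once"])
    print(shlex.join(argv))
```

On a host with the SGE client tools installed, the resulting list could be
passed to subprocess.run(); the grid engine then picks an application server
with free capacity.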
Access to the wmf wiki replicated dbs is public across the entire wmflabs
network so that's a given within the toollabs project as well.
-- Krinkle
[1] The "bots" project exists already. It doesn't have SGE yet but it's a first
step. There is also a generic "webtools" project being set up as we speak.
Perhaps these two could be merged so that users have shared project storage for
bots generating data to be used by web tools and vice versa.