[Labs-l] Accessing the databases from labs - A comparison with the toolserver

Marc A. Pelletier marc at uberbox.org
Fri Jul 12 15:43:57 UTC 2013


On 07/12/2013 11:13 AM, Platonides wrote:
> - A toolserver table, available on all database servers.

The problem is that, AFAICT, there is currently no agreement on what
that table should contain or what its schema should be.  See bz 48626 [1].

> - sql-sX-rr.labsdb and sql-sX-userdb.labsdb "dns" entries. I would need
> to detect in my tools if it should
> append .toolserver.org or .labsdb (is there a supported way of detecting
> that you are running on tool labs?)
> But that step seems reasonable. (BTW, What about adding labsdb to
> resolv.conf(5) search?)

I'm not sure where the difference lies between deciding what to add to
the host name and deciding what the host name should be.

Toolserver:  %s-p.rrdb.toolserver.org
Tool Labs:   %s.labsdb

Also, relying on a particular mapping between shards and databases is a
Bad Thing regardless; that way lie maintenance woes.  You shouldn't be
connecting to "shard N which happens to be where foowiki_p is" but "to
where foowiki_p is".  Not only can the mapping of database to cluster
change in production, but there is no reason why that mapping needs to
remain the same for the replicas.

In other words, on Labs, connecting to shards "by number" is an error
(and I don't expect to preserve the undocumented s?.labsdb names at all
once things are moved to DNS).
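
As a rough sketch of the above (not anything that exists on Labs today),
a tool can derive the host from the wiki it wants and never mention a
shard.  I am assuming here that the %s in both templates is the wiki
name without the _p suffix; replica_host and replica_database are
made-up helper names.

```python
# Host name templates quoted earlier in this message.
TOOLSERVER_TEMPLATE = "%s-p.rrdb.toolserver.org"
TOOL_LABS_TEMPLATE = "%s.labsdb"


def replica_host(wiki, on_labs):
    """Return the replica host for a wiki, keyed by name rather than shard.

    Assumption: `wiki` is the wiki name without the "_p" suffix,
    e.g. "foowiki" for the replicated database "foowiki_p".
    """
    template = TOOL_LABS_TEMPLATE if on_labs else TOOLSERVER_TEMPLATE
    return template % wiki


def replica_database(wiki):
    """Return the replicated database name, the same on both platforms."""
    return wiki + "_p"
```

With that, replica_host("foowiki", True) gives "foowiki.labsdb", and the
tool never needs to know which cluster foowiki_p happens to live on.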

> - Database names compatible with those of the toolserver. References to
> the dbs are sometimes spread on
> the codebase, and migrating shouldn't require a hunt for them if it's
> avoidable.

In this particular case, it's not avoidable (for user databases).
AFAICT, the replicated database names are the same.

> - dns names like project-p.labsdb for compatibility with TS tools?
> Perhaps *.(rr|user)db.toolserver.org
> should be aliased to .labsdb

> - Marking the global dbs in that toolserver table would also be nice.

Having database hostnames in /etc/hosts rather than in DNS is a
temporary hack that is, actually, scheduled to go away shortly (days).
Providing toolserver-like aliases is entirely possible, but I'm not
certain I see the point (because of the first section above).

> - How to detect if you are running in labs? (for dual tools)

Possibly the very simplest way to do this would be to provide (say)
/etc/labs on every Tool Labs host; testing for its presence should be a
reliable indication.  I'll add this shortly.
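
For a dual tool the check could then be as small as the sketch below.
It assumes the marker ends up at /etc/labs as proposed (it is not there
yet); running_on_tool_labs is a made-up name, and the path is a
parameter so it is easy to adjust if the final location differs.

```python
import os


def running_on_tool_labs(marker="/etc/labs"):
    """Return True if the (proposed) Tool Labs marker file is present.

    Assumption: the marker lives at /etc/labs; pass another path if the
    final location turns out to be different.
    """
    return os.path.exists(marker)
```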

--

All of that said, I agree that having the same code run unchanged on
both the Toolserver and Tool Labs would require some adaptation but:

(a) this is, at any rate, unavoidable.  /All/ Tool Labs projects need to
be multi-maintainer (with a different and simpler system) and run
through the grid engine, both of which imply bigger changes than
which database names to connect to; and

(b) having the same code run unchanged on both "variations" of replicas
does not seem to be a worthwhile investment of time and effort for
maintainers, since the toolserver replicas are going away for good on
2014-06-30 at the very latest (and possibly earlier).

-- Marc

[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=48626



