[Labs-l] TS and/vs. Labs

Tim Landscheidt tim at tim-landscheidt.de
Fri Sep 20 02:11:00 UTC 2013


"Marc A. Pelletier" <marc at uberbox.org> wrote:

>> labs db server do contain the non-public data.

> Just to clarify things, here, that's a necessary artifact of
> the way mediawiki works: many of the bits of data that
> should be made unavailable are done so conditionally on
> /live data/.  It is not possible to replicate the visible
> contents of the views and have it not break as column values
> may be nulled or *come back* depending on actions on the
> projects (suppression, deletion, etc.).

> While some of the more sensitive data never hits the replica
> (like IP addresses), it is not /possible/ to create a
> replica without transferring data that can be unsuppressed
> or undeleted -- but might never be.

You're right about the status quo, and I feel a bit senile
because I remember you telling me about it some time
ago :-).

(anonymous) wrote:

> [...]

> I guess there could be a labs replica with triggers that
> deleted data at the point it became private, which could
> then serve as replication master providing public
> binlogs. The problem that breaks the idea completely is
> the restores, i.e. non-public data coming back (the server
> would receive ‘show this again’, but it would need a full
> insert...).

And the nice thing about Labs is that anyone who wants to
test this can set up a project here consisting of an
instance with a wiki, an instance with a "filtering
replicator" and an instance with a "receiver" and puppetize
the solution so that it can be deployed in production :-).

One would probably want to share the view definitions of
dependencies with the Labs setup and create the triggers
automatically from that, so that on initial deployment and
on schema changes one wouldn't have to show that the
triggers really delete all non-public data and restore only
public data, but just that the process schema -> trigger
works.
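
A minimal sketch of the schema -> trigger step, assuming a
hypothetical rule format (column -> SQL condition under
which the column is hidden).  The column names mimic
MediaWiki's revision table, but the conditions, the helper
names (masked, view_sql, trigger_sql, demo) and the use of
SQLite in place of MySQL are all assumptions made for
illustration.  View and triggers are generated from the same
rules, so only the generator has to be checked:

```python
import sqlite3

# Hypothetical rule format: column -> SQL condition under which it is hidden.
# Names mimic MediaWiki's revision table; the conditions are assumptions.
RULES = {
    "rev_comment": "rev_deleted & 2",
    "rev_user_text": "rev_deleted & 4",
}
COLUMNS = ["rev_id", "rev_deleted", "rev_comment", "rev_user_text"]
DDL = ("CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, "
       "rev_deleted INTEGER, rev_comment TEXT, rev_user_text TEXT)")


def masked(col, rules):
    """Return the SQL expression that exposes col only while it is public."""
    cond = rules.get(col)
    return f"CASE WHEN {cond} THEN NULL ELSE {col} END" if cond else col


def view_sql(table, columns, rules):
    """View for the unfiltered replica: hides data at query time."""
    cols = ", ".join(f"{masked(c, rules)} AS {c}" for c in columns)
    return f"CREATE VIEW {table}_public AS SELECT {cols} FROM {table}"


def trigger_sql(table, key, rules):
    """Triggers for the filtering replica: null hidden data at write time."""
    sets = ", ".join(f"{c} = {masked(c, rules)}" for c in rules)
    return [
        f"CREATE TRIGGER {table}_scrub_{event.lower()} "
        f"AFTER {event} ON {table} "
        f"BEGIN UPDATE {table} SET {sets} WHERE {key} = NEW.{key}; END"
        for event in ("INSERT", "UPDATE")
    ]


def demo():
    """Apply the same changes to both setups and return what each exposes."""
    source = sqlite3.connect(":memory:")    # stands in for the current replica
    filtered = sqlite3.connect(":memory:")  # stands in for the filtering one
    for db in (source, filtered):
        db.execute(DDL)
    source.execute(view_sql("revision", COLUMNS, RULES))
    for stmt in trigger_sql("revision", "rev_id", RULES):
        filtered.execute(stmt)

    changes = [
        "INSERT INTO revision VALUES (1, 0, 'typo fix', 'Alice')",
        "INSERT INTO revision VALUES (2, 4, 'cleanup', 'Bob')",
        "UPDATE revision SET rev_deleted = 2 WHERE rev_id = 1",  # suppress
    ]
    for db in (source, filtered):
        for stmt in changes:
            db.execute(stmt)

    public = source.execute(
        "SELECT * FROM revision_public ORDER BY rev_id").fetchall()
    stored = filtered.execute(
        "SELECT * FROM revision ORDER BY rev_id").fetchall()
    return public, stored


if __name__ == "__main__":
    public, stored = demo()
    print(public == stored)
```

In this sketch the rows physically stored on the filtered
copy match what the view exposes on the unfiltered one.  It
does not address restores: setting rev_deleted back to 0 on
the filtered copy leaves rev_comment NULL, because the
original value is no longer there.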

As the development may take more time than a rainy Sunday
has to offer, I've posted the idea at
https://meta.wikimedia.org/wiki/Grants:IdeaLab/Provide_public_database_replication
for brainstorming.  The IEG condition:

| Any technical components must be standalone or completed
| on-wiki.

is a bit hazy, but if the idea isn't eligible under that
program, there would probably be other avenues for funding.

Tim