Ryan Lane wrote:
> If WMF becomes evil, fork the entire infrastructure into EC2,
> Rackspace cloud, HP cloud, etc. and bring the community operations
> people along for the ride. Hell, use the replicated databases in Labs
> to populate your database in the cloud.
Tim Landscheidt wrote:
> But the nice thing about Labs is that you can try out
> (replicable :-)) replication setups at no cost, and don't have
> to make upfront investments in hardware, etc., so when the time
> comes, you can just upload your setup to EC2 or whatever and
> have a working Wikipedia clone running in a manageable time-
> This is not an easy task. Replicating the databases is
> challenging (they're huge datasets in the case of the big wikis) and
> they're constantly changing. If you tried to rely on dumps alone, you'd
> always be out of date by at least two weeks (assuming dumps are working
> properly). Two weeks on the Internet is a lot of time.
I don't know if this is not an easy task, but you are probably
right. So what? If, in a scenario of WMF turning rogue, we
couldn't bear losing two weeks of edits while saving almost
a decade, we should work on ways to produce incremental dumps.
> But more to the point, even if you suddenly had a lot of resources
> (bandwidth for constantly retrieving the data, space to store it all, and
> extra memory and CPU to allow users to, y'know, do something with it) and
> even if you suddenly had staff capable of managing these databases, not
> every table is even available currently. As far as I'm aware,
> doesn't include tables such as "user", "ipblocks", "archive",
> "watchlist", any tables related to global images or global user
> accounts, and probably many others. I'm not sure a full audit
> has ever been done, but this is partially tracked by
The first part is easy: You go to some supplier and buy
bandwidth, space, memory and CPU. There is even staff for
The second part is simple as well: What do you need
"ipblocks" or "watchlist" in a Wikipedia clone for? It certainly
is neither free content nor the content users use Wi-
> So beyond the silly simplicity of the suggestion that one could simply
> "move to the cloud!", there are currently technical impossibilities to
> doing so.
And it would be far more helpful if you could stop spreading
FUD and instead show what actual impediments there are, for
example in a Labs project.