(anonymous) wrote:
[...] Ryan Lane wrote:
If WMF becomes evil, fork the entire infrastructure into EC2, Rackspace cloud, HP cloud, etc. and bring the community operations people along for the ride. Hell, use the replicated databases in Labs to populate your database in the cloud.
Tim Landscheidt wrote:
But the nice thing about Labs is that you can try out (replicable :-)) replication setups at no cost, and don't have to make upfront investments in hardware, etc., so when the time comes, you can just upload your setup to EC2 or whatever and have a working Wikipedia clone running in a manageable timeframe.
This is not an easy task. Replicating the databases is enormously challenging (they're huge datasets in the cases of the big wikis) and they're constantly changing. If you tried to rely on dumps alone, you'd always be out of date by at least two weeks (assuming dumps are working properly). Two weeks on the Internet is a lot of time.
I don't know if this is not an easy task, but you are probably right. So what? If, in a scenario of WMF turning rogue, we couldn't bear losing two weeks of edits while saving almost a decade, we should work on ways to produce incremental dumps.
But more to the point, even if you suddenly had a lot of infrastructure (bandwidth for constantly retrieving the data, space to store it all, and extra memory and CPU to allow users to, y'know, do something with it) and even if you suddenly had staff capable of managing these databases, not every table is even available currently. As far as I'm aware, http://dumps.wikimedia.org doesn't include tables such as "user", "ipblocks", "archive", "watchlist", any tables related to global images or global user accounts, and probably many others. I'm not sure a full audit has ever been done, but this is partially tracked by https://bugzilla.wikimedia.org/show_bug.cgi?id=25602.
The first part is easy: You go to some supplier and buy bandwidth, space, memory and CPU. There is even staff for hire.
The second part is simple as well: What do you need "ipblocks" or "watchlist" in a Wikipedia clone for? It certainly is neither free content nor the content users use Wikipedia for.
So beyond the silly simplicity of the suggestion that one could simply "move to the cloud!", there are currently technical impossibilities to doing so.
And it would be far more helpful if you could stop spreading FUD and instead show what actual impediments there are, for example in a Labs project.
Tim