Timwi wrote:
There is MySQL replication. I don't know much about it though (nor do I know much about rsync other than what it's used for), so I don't know.
As far as I know, MySQL replication isn't practical for the one-to-many scenario that a lot of people would like to see it used for (home users keeping synced copies of the Wikipedia database). I've been writing some basic code that would let us do this type of 1-to-x replication nicely, but we'll see whether anything comes of that.
I suppose I could argue that you could couple this with CVS, which can produce patches to the dumps, but that's getting hacky :-)
It would hardly be a valid argument. My Feb 15 dump of 'cur' and Jan 30 dump of 'old' total roughly 15.1 GB uncompressed. I'm not even sure how diff would behave with files that size; and if you compress them, you'd be looking at binary patches that would in all likelihood be useless (someone correct me if I'm wrong, but depending on your block size, a change early in the file would probably propagate downstream and require retransfer of most of the file).
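As a rough illustration of the block-size concern, here is a minimal Python sketch. It is not how diff, CVS, or any particular patch tool works: it assumes a naive comparison of fixed-offset blocks on uncompressed bytes, and the sample data, the block size, and the names block_hashes and changed_blocks are all made up for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_hashes(data, block_size):
    """Hash each fixed-offset block of the data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old, new, block_size):
    """Count blocks a naive fixed-offset comparison would have to resend."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    differing = sum(1 for a, b in zip(old_h, new_h) if a != b)
    # Blocks present on only one side also count as changed.
    return differing + abs(len(old_h) - len(new_h))


# Hypothetical stand-in for a dump: 5000 short, distinct rows.
old_dump = b"".join(b"row %06d: some article text\n" % i for i in range(5000))

# Case 1: overwrite one byte near the start (file length unchanged).
overwritten = old_dump[:10] + b"X" + old_dump[11:]

# Case 2: insert one byte near the start (everything after it shifts by one).
inserted = old_dump[:10] + b"X" + old_dump[10:]

total = len(block_hashes(old_dump, BLOCK_SIZE))
print("total blocks:", total)
print("changed after overwrite:", changed_blocks(old_dump, overwritten, BLOCK_SIZE))
print("changed after insert:", changed_blocks(old_dump, inserted, BLOCK_SIZE))
```

Overwriting a byte in place touches a single block, while inserting one byte near the start shifts every later block boundary, so every block compares as changed. Tools like rsync use a rolling checksum to find matching blocks at arbitrary offsets, which avoids this for uncompressed data; compressed dumps are a separate problem that this sketch does not model.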
As far as queries are concerned, you can see it yourself on the Special Pages - most of the queries are turned off for performance reasons, so that's not really a good argument.
Ah, but that's not an indication that MySQL, or RDBMS in general, is a bad idea. It just shows that the current database schema is ill-designed.
Possibly. In my mind, a database is certainly the way to go; the question becomes whether a less general implementation, tailored to the subset of SQL-92 we actually need (as opposed to something implementing the entire standard and more), is needed. I have no concrete data to back this up, but I'd suspect that such a drop in complexity could produce strong performance improvements.
I'm not sure I fully understand what you mean by this. Apparently we do update something like the "recentchanges" table on every edit, because some people seem to have thought that it would make for better performance, but I'm pretty much convinced that this assumption is fallacious.
I lay no claims to expertise in this area, but I'm (and I'm sure others are) more than willing to hear better proposals. How would you do it?
Cheers, Ivan