Magnus Manske wrote:
Jimmy Wales wrote:
No one has said anything negative about what I'm proposing yet (that I've seen) but I should make clear that I'm not at all talking about making our free content costly for proprietary applications. It is only the hammering of our (expensive) servers that I'm worried about.
If Microsoft or Apple wants to mirror Wikipedia and hit their own servers, that's fine.
On a technical note, Brion announced that the XML produced by Special:Export will be our new "dump" format, together with an import script. Coincidentally, I recently wrote an extension that can list the titles of articles changed since date/time X. Working together, these two parts could enable mirrors to keep their databases up to date with only a few hours' or minutes' delay behind the "live" server, and without having to transfer the whole database once every two weeks (which was the release cycle, right?).
Magnus, we've had this for a few months actually. See extensions/OAI for the client/server database updater system, which we're currently using somewhat experimentally for a couple clients. Wikimedia plans to offer this as a value-add service to commercial mirrors generally (and I hope for non-profit mirrors, though you'd have to talk to Jimbo to know what's up).
This uses the OAI-PMH[1] wrapper protocol to pull page updates, formatted using the Special:Export XML schema.
(The client portion currently works only on a local 1.4 installation though it's independent of the server version. I'll be tuning it up for 1.5 when I have a chance in the next few weeks.)
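For anyone unfamiliar with OAI-PMH, here is a rough sketch in Python of what one incremental pull looks like on the wire. It is not the extensions/OAI client, just an illustration of the protocol's ListRecords verb with a "from" timestamp and resumption tokens. The function names are made up for the example, and the base URL and metadataPrefix are whatever the repository actually advertises; "oai_dc" below is only a placeholder default.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_list_records_url(base_url, since=None, metadata_prefix="oai_dc",
                           resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper).

    base_url and metadata_prefix are assumptions: use the values the
    repository you are mirroring actually publishes.
    """
    if resumption_token:
        # Per the protocol, resumptionToken is an exclusive argument:
        # it may only be combined with the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if since:
            params["from"] = since  # UTC datestamp, e.g. 2005-06-01T00:00:00Z
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (updates, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    updates = []
    for header in root.iter(OAI_NS + "header"):
        updates.append({
            "identifier": header.findtext(OAI_NS + "identifier"),
            "datestamp": header.findtext(OAI_NS + "datestamp"),
            # Deleted records carry status="deleted" on the header.
            "deleted": header.get("status") == "deleted",
        })
    token_el = next(root.iter(OAI_NS + "resumptionToken"), None)
    token = None
    if token_el is not None and token_el.text and token_el.text.strip():
        token = token_el.text.strip()
    return updates, token
```

A mirror would fetch the first URL, apply the returned records, and keep requesting with the returned resumption token until none comes back, then remember the latest datestamp as the next "from" value.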
[1] http://www.openarchives.org/OAI/openarchivesprotocol.html
-- brion vibber (brion @ pobox.com)