Jochen Magnus wrote:
Thank you Brian,
I found the directory (but no link to it?), but I have some questions:
- As far as I can see, all articles in the XML dump are from namespace 0. Am
I right?
No, all namespaces are included.
At some point we will also produce limited dumps which include only the main, image, and template namespaces (and perhaps the project namespace, since essential things like the license info tend to be there).
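Until such limited dumps exist, the full dump can be filtered on the reader's side. The following is only a minimal sketch, assuming that a page's namespace can be recognised from its title prefix (for example "Template:") and that the prefix names listed here match the wiki in question; both the prefix lists and the helper names are hypothetical and would need adjusting per wiki and language.

```python
# Assumed namespace prefixes for an English-language wiki; illustrative, not exhaustive.
WANTED_PREFIXES = {"Image", "Template", "Wikipedia"}
OTHER_PREFIXES = {"Talk", "User", "User talk", "Image talk", "Template talk",
                  "Wikipedia talk", "Category", "Help", "MediaWiki"}
KNOWN_PREFIXES = WANTED_PREFIXES | OTHER_PREFIXES


def namespace_of(title):
    """Return the namespace prefix of a title, or "" for the main namespace."""
    prefix, sep, _rest = title.partition(":")
    # A colon only marks a namespace if the text before it is a known prefix,
    # so a main-namespace title that merely contains a colon stays in "".
    if sep and prefix in KNOWN_PREFIXES:
        return prefix
    return ""


def keep(title):
    """True for pages in the main namespace or one of the wanted namespaces."""
    ns = namespace_of(title)
    return ns == "" or ns in WANTED_PREFIXES


titles = ["Berlin", "Template:Infobox", "User talk:Example", "Star Trek: Voyager"]
kept = [t for t in titles if keep(t)]
print(kept)
```

Matching against a list of known prefixes, rather than splitting on the first colon, avoids misclassifying ordinary article titles that happen to contain a colon.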
- Is there documentation about the format? I don't know the exact
meaning of <revision>, and I don't know which <id> is the database ID (primary key): the "page->id" or the "page->revision->id"?
http://meta.wikimedia.org/wiki/Help:Export has some notes.
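To make the two <id> elements concrete: in the MediaWiki export format, the <id> directly under <page> is the page's ID, and the <id> under each <revision> is that revision's ID, so a page with several revisions carries one page ID and several revision IDs. The sketch below reads both from a small hand-written sample. The sample data and the helper names are made up for illustration; real dumps contain more fields and declare an XML namespace, which the code strips so that it does not depend on a particular schema version.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample, not taken from a real dump.
SAMPLE = """<mediawiki>
  <page>
    <title>Example</title>
    <id>12</id>
    <revision>
      <id>345</id>
      <timestamp>2005-01-01T00:00:00Z</timestamp>
      <text>First version</text>
    </revision>
    <revision>
      <id>678</id>
      <timestamp>2005-02-01T00:00:00Z</timestamp>
      <text>Second version</text>
    </revision>
  </page>
</mediawiki>"""


def local(tag):
    """Drop any '{namespace-uri}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, page_id, revision_ids) for each <page> in the stream."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title = page_id = None
        rev_ids = []
        for child in elem:
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "id":
                # <id> directly under <page>: the page ID.
                page_id = int(child.text)
            elif name == "revision":
                # <id> under <revision>: the revision ID.
                for sub in child:
                    if local(sub.tag) == "id":
                        rev_ids.append(int(sub.text))
        yield title, page_id, rev_ids
        elem.clear()  # free the finished page's subtree on large dumps


pages = list(iter_pages(io.BytesIO(SAMPLE.encode("utf-8"))))
print(pages)
```

Using iterparse keeps only one page's subtree at a time, which matters for dump files that are too large to load whole.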
- Will XML dumps be produced regularly in the future?
Yes.
-- brion vibber (brion @ pobox.com)