Thank you Brion,
I found the directory (but no link to it?), but I have some questions:
- As far as I saw, all articles in the XML dump are from namespace 0. Am I right?
- Is there any documentation about the format? I don't know the exact meaning of <revision>, and I don't know which <id> is the database id (primary key): the "page->id" or the "page->revision->id"?
- Will XML dumps be produced on a regular schedule in the future?
- And, please take this question ironically: are there no other users of the dumps besides me? I seem to be the only one who pesters you about this topic ;-)
Yours
jo
Brion Vibber wrote:
There's a 20050713 dump in the new format (check the directories). Another dump will run this weekend.
Jochen Magnus wrote:
Thank you Brion,
I found the directory (but no link to it?), but I have some questions:
- As far as I saw, all articles in the XML dump are from namespace 0. Am I right?
No, all namespaces are included.
At some point we will also produce limited dumps which include only the main, image, and template namespaces (and perhaps the project namespace, since essential things like the license info tend to be there).
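
For readers who want to approximate such a limited dump on their own, here is a rough sketch in Python. It is not how the dumps are produced; it assumes the namespace can be read off the title prefix (for example "Template:Foo"), and the prefix list KNOWN_NAMESPACES and both function names are hypothetical, chosen for illustration. Project namespace names differ per wiki, so the list would need adjusting.

```python
# Hypothetical list of namespace prefixes for an English-language wiki.
# Assumption: the project namespace is called "Wikipedia"; other wikis differ.
KNOWN_NAMESPACES = {
    "Talk", "User", "User talk", "Wikipedia", "Wikipedia talk",
    "Image", "Image talk", "MediaWiki", "MediaWiki talk",
    "Template", "Template talk", "Help", "Help talk",
    "Category", "Category talk",
}

# Namespaces kept in the limited dump; "" stands for the main namespace.
LIMITED = {"", "Image", "Template"}


def namespace_of(title):
    """Return the namespace prefix of a title, or "" for the main namespace."""
    prefix, sep, _rest = title.partition(":")
    if sep and prefix in KNOWN_NAMESPACES:
        return prefix
    # No colon, or the text before the colon is not a namespace
    # (e.g. "Star Trek: Voyager"), so treat it as a main-namespace page.
    return ""


def keep_in_limited_dump(title):
    """True if the page would belong in a main/image/template-only dump."""
    return namespace_of(title) in LIMITED
```

A title such as "Star Trek: Voyager" stays in the main namespace here because "Star Trek" is not a known prefix, which is why a plain split on ":" would not be enough.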
- Is there any documentation about the format? I don't know the exact meaning of <revision>, and I don't know which <id> is the database id (primary key): the "page->id" or the "page->revision->id"?
http://meta.wikimedia.org/wiki/Help:Export has some notes.
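
To make the two <id> elements concrete, here is a small Python sketch that reads an export file and reports both. It assumes the usual layout of the export format, where the <id> directly under <page> identifies the page and the <id> directly under <revision> identifies that revision of the text; the sample document, the function names, and the values in it are made up for illustration, so check them against a real dump and the notes on the Help:Export page.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample in the general shape of an export file (values are invented).
SAMPLE = """<mediawiki>
  <page>
    <title>Example</title>
    <id>12</id>
    <revision>
      <id>345</id>
      <timestamp>2005-07-13T00:00:00Z</timestamp>
      <contributor><username>Someone</username><id>7</id></contributor>
      <text>Hello</text>
    </revision>
  </page>
</mediawiki>"""


def local(tag):
    """Strip an XML namespace such as '{http://...}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]


def page_and_revision_ids(stream):
    """Yield (title, page id, [revision ids]) for each <page> in the stream."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title = page_id = None
        rev_ids = []
        for child in elem:
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "id":
                page_id = int(child.text)  # <id> directly under <page>
            elif name == "revision":
                for sub in child:
                    if local(sub.tag) == "id":
                        rev_ids.append(int(sub.text))  # <id> under <revision>
                        break
        yield title, page_id, rev_ids
        elem.clear()  # free memory; real dumps are large


for row in page_and_revision_ids(io.BytesIO(SAMPLE.encode("utf-8"))):
    print(row)
```

Only direct children are inspected, so the <id> nested inside <contributor> (a user id) is not mistaken for the revision id.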
- Will XML dumps be produced on a regular schedule in the future?
Yes.
-- brion vibber (brion @ pobox.com)
wikitech-l@lists.wikimedia.org