David Campeau wrote:
How does one find out which namespace an article is in under the new dump scheme? The XML looks something like this:
[snip]
Are we supposed to load the *ns0 file to know which articles are in ns0? What about the other namespaces?
The list of namespaces is right there in the dump:
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.3/ http://www.mediawiki.org/xml/export-0.3.xsd"
           version="0.3" xml:lang="aa">
  <siteinfo>
    <sitename>Wikipedia</sitename>
    <base>http://aa.wikipedia.org/wiki/Main_Page</base>
    <generator>MediaWiki 1.5beta3</generator>
    <case>first-letter</case>
    <namespaces>
      <namespace key="-2">Media</namespace>
      <namespace key="-1">Special</namespace>
      <namespace key="0" />
      <namespace key="1">Talk</namespace>
      <namespace key="2">User</namespace>
      <namespace key="3">User talk</namespace>
      <namespace key="4">Wikipedia</namespace>
      <namespace key="5">Wikipedia talk</namespace>
      <namespace key="6">Image</namespace>
      <namespace key="7">Image talk</namespace>
      <namespace key="8">MediaWiki</namespace>
      <namespace key="9">MediaWiki talk</namespace>
      <namespace key="10">Template</namespace>
      <namespace key="11">Template talk</namespace>
      <namespace key="12">Help</namespace>
      <namespace key="13">Help talk</namespace>
      <namespace key="14">Category</namespace>
      <namespace key="15">Category talk</namespace>
    </namespaces>
  </siteinfo>
  <page>
    <title>MediaWiki:1movedto2</title>
    <id>1</id>
    ...
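
As a rough illustration of how that list could be used: the sketch below reads the <namespaces> block and matches the prefix of each page title against it. It assumes the namespace is whatever precedes the first colon in the title, and that a title with no recognized prefix belongs to namespace 0. The helper names read_namespaces and namespace_of are made up for this example, and SAMPLE is a shortened, closed-off copy of the header above, not real dump data.

```python
import xml.etree.ElementTree as ET

# XML namespace URI copied from the dump header (export-0.3).
EXPORT_NS = "{http://www.mediawiki.org/xml/export-0.3/}"

# Shortened stand-in for a dump; a real file has more namespaces and pages.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/" version="0.3" xml:lang="aa">
  <siteinfo>
    <namespaces>
      <namespace key="-2">Media</namespace>
      <namespace key="0" />
      <namespace key="1">Talk</namespace>
      <namespace key="8">MediaWiki</namespace>
    </namespaces>
  </siteinfo>
  <page>
    <title>MediaWiki:1movedto2</title>
    <id>1</id>
  </page>
</mediawiki>"""


def read_namespaces(siteinfo):
    """Hypothetical helper: map namespace name -> numeric key."""
    names = {}
    for ns in siteinfo.iter(EXPORT_NS + "namespace"):
        # The main namespace has no text, so it is stored under "".
        names[ns.text or ""] = int(ns.get("key"))
    return names


def namespace_of(title, names):
    """Hypothetical helper: guess a page's namespace from its title prefix."""
    prefix, sep, _rest = title.partition(":")
    if sep and prefix in names:
        return names[prefix]
    # Assumption: no known prefix means the main namespace.
    return 0


root = ET.fromstring(SAMPLE)
names = read_namespaces(root.find(EXPORT_NS + "siteinfo"))
for page in root.iter(EXPORT_NS + "page"):
    title = page.findtext(EXPORT_NS + "title")
    print(title, namespace_of(title, names))
```

This does an exact, case-sensitive match on the prefix, which may be stricter than what the wiki itself accepts. For a full-size dump, a streaming parser such as ET.iterparse would probably be a better fit than loading the whole file.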
-- brion vibber (brion @ pobox.com)