2010/11/5 Platonides <platonides@gmail.com>
xiang wang wrote:
> I have downloaded the "Database backup dumps" of the Chinese edition.
> There are files in XML and SQL format. I want to have all the data in
> a database like MySQL. Can I get this data (especially the XML format) into
> a MySQL database without using MediaWiki? How can I do this, if possible?

I would use mwdumper to do that.
Other options are listed in

> Where can I get the format details of each dump? I have read the
> contents of "zhwiki-20101014-pages-articles.xml", but Chinese has two
> editions: "Simplified Chinese" and "Traditional Chinese". Both forms
> are mixed together in the file "zhwiki-20101014-pages-articles.xml". I don't
> know how to get rid of that.
> Thanks!

That file is in the same format as the wiki pages. The two variants come
from the same text (which is what you get in the dump), automatically
converted into one or the other (with some specifics for text inside -{}- ).
Loading that content into a MediaWiki install should be able to replicate zhwiki.
Thanks for your answers! They're very helpful! I used MWDumper, but I get an error:

Exception in thread "main" java.lang.NullPointerException
        at org.mediawiki.importer.XmlDumpReader.readTitle(XmlDumpReader.java:31
        at org.mediawiki.importer.XmlDumpReader.endElement(XmlDumpReader.java:2
        at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Sourc
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Sou
        at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)
        at org.mediawiki.importer.XmlDumpReader.readDump(XmlDumpReader.java:88)
        at org.mediawiki.dumper.Dumper.main(Dumper.java:143)

Do you know what the problem is?