2010/11/5 Platonides <span dir="ltr">&lt;<a href="mailto:platonides@gmail.com">platonides@gmail.com</a>&gt;</span><br>
<div class="gmail_quote"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im">xiang wang wrote:<br>
> I have downloaded the "Database backup dumps" of the Chinese edition.<br>
> There are files in XML and SQL format. I want to load all of the data into<br>
> a database such as MySQL. Can I get this data (especially the XML files) into<br>
> a MySQL database without using MediaWiki? If so, how?<br>
<br>
</div>I would use mwdumper to do that<br>
<a href="http://www.mediawiki.org/wiki/Manual:MWDumper" target="_blank">http://www.mediawiki.org/wiki/Manual:MWDumper</a><br>
Other options are listed in<br>
<a href="http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps" target="_blank">http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps</a> <br>
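<br>
If neither tool fits, the dump can also be read directly. Here is a minimal Python sketch, assuming the standard export layout (page, title, revision and text elements). The function names are made up for illustration, and writing the rows into MySQL is left to whatever client library you prefer:<br>
<pre>
```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, text) for each page element in a MediaWiki XML dump.

    `source` is a file name or a binary file object. For dumps that carry
    several revisions per page, the text of the last revision is returned.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # drop the finished page's children to limit memory use
```
</pre>
The text yielded for each page is raw wikitext, so any -{ }- variant markup is still in it.<br>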
</blockquote><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<div class="im"><br>
<br>
> Where can I get the format details of each dump? I have read the<br>
> contents of "zhwiki-20101014-pages-articles.xml", but Chinese has two<br>
> editions: "Simplified Chinese" and "Traditional Chinese". Both forms<br>
> appear mixed together in the file "zhwiki-20101014-pages-articles.xml". I don't<br>
> know how to get rid of this.<br>
><br>
> Thanks!<br>
<br>
</div>That file is in the same format as the wiki pages. The two variants come<br>
from the same text (which is what you get in the dump), automatically<br>
converted into one or the other (with some specifics for text inside -{}- ).<br>
That content, loaded into a MediaWiki install, should be able to reproduce the zhwiki<br>
pages.<br></blockquote><div><br>
Thanks for your answers! They were very helpful. I used MWDumper, but I got an error:<br>
<br>
Exception in thread "main" java.lang.NullPointerException<br>
at org.mediawiki.importer.XmlDumpReader.readTitle(XmlDumpReader.java:31<br>
)<br>
at org.mediawiki.importer.XmlDumpReader.endElement(XmlDumpReader.java:2<br>
3)<br>
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)<br>
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)<br>
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)<br>
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)<br>
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)<br>
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)<br>
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)<br>
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)<br>
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)<br>
at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)<br>
at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)<br>
at org.mediawiki.importer.XmlDumpReader.readDump(XmlDumpReader.java:88)<br>
at org.mediawiki.dumper.Dumper.main(Dumper.java:143)<br>
<br>
Do you know what the problem is? <br>
Thanks!<br>
</div></div></div><br>