Hi everyone,
I am trying to restore the revision table from Wikipedia dumps. I understand that the file I need is probably enwiki-XX-pages-logging.xml.gz.
I've downloaded the file and I am using the 1.16 version of mwdumper from https://integration.wikimedia.org/ci/job/MWDumper-package/org.wikimedia$mwdu...
When I execute the following I get this error:

    java -server -jar mwdumper.jar --format=sql:1.5 enwiki-20130503-pages-logging.xml.gz | gzip -vc > enwiki-latest-pages-articles.sql.gz

    Exception in thread "main" java.lang.IllegalArgumentException: Unexpected <id> outside a <page>, <revision>, or <contributor>
        at org.mediawiki.importer.XmlDumpReader.readId(XmlDumpReader.java:329)
        at org.mediawiki.importer.XmlDumpReader.endElement(XmlDumpReader.java:204)
        at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:392)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)
        at org.mediawiki.importer.XmlDumpReader.readDump(XmlDumpReader.java:88)
        at org.mediawiki.dumper.Dumper.main(Dumper.java:142)
    0.0%
Mwdumper works well with other 7z XML dump files, but not with this one. I tried a couple of different pages-logging XML files, including ones from other language Wikipedias, and got the same error each time.
Does anyone know what this error means and why it occurs with this specific file?
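In case it helps anyone reproduce this: the message complains about an <id> that is not inside a <page>, <revision>, or <contributor>, so one way to narrow it down is to count which element directly encloses each <id> in the dump. Below is a minimal Python sketch of that check. It is not part of mwdumper; the function name count_id_parents, the limit parameter, and the tiny sample document (with its made-up <entry> element) are my own assumptions for illustration only.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    # ElementTree reports namespaced tags as "{namespace}name"; keep only "name".
    return tag.rsplit("}", 1)[-1]


def count_id_parents(stream, limit=100000):
    """Count which element directly encloses each <id> in an XML stream.

    Stops after `limit` <id> elements so a large dump does not have to be
    read to the end.
    """
    parents = Counter()
    stack = []  # names of the currently open elements, outermost first
    seen = 0
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            stack.append(local_name(elem.tag))
            continue
        name = stack.pop()
        if name == "id" and stack:
            parents[stack[-1]] += 1
            seen += 1
            if seen >= limit:
                break
        elem.clear()  # drop text and children we no longer need
    return parents


# Made-up sample, only to show the shape of the result.
sample = (
    b'<mediawiki xmlns="http://example.org/ns">'
    b"<entry><id>1</id><contributor><id>2</id></contributor></entry>"
    b"<page><id>3</id></page>"
    b"</mediawiki>"
)
print(count_id_parents(io.BytesIO(sample)))
```

For the real file, the stream could be opened with gzip.open("enwiki-20130503-pages-logging.xml.gz", "rb") and passed to count_id_parents. Any parent name other than page, revision, or contributor would be the kind of element the reader is rejecting.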
PS: I've also tried to build mwdumper from source:

    git clone https://gerrit.wikimedia.org/r/p/mediawiki/tools/mwdumper.git mwdumper

However, I couldn't use make or ant, since there was no build.xml or Makefile in the repository.
I appreciate any help you can give me with this.