[Mediawiki-l] Is there a dump XML validator?

Jim Hu jimhu at tamu.edu
Mon Jul 30 20:21:52 UTC 2007


Ran into a weird dump problem yesterday, which has me wondering if
there's a problem with my artificially created XML for upload.
Here's what happens. I have a script that builds wiki pages from an
external source and embeds them in XML for upload via importDump.php.
The script can be toggled to generate either a single page or a bunch
of them. In the batch file, some pages fail to load even though the
same pages load just fine when generated individually, and
importDump.php does NOT crash during the import.

I suspect that there is something wrong with the upstream items, but  
I can't find it.   The Brown Univ XML validator complains about the  
following:

line 3, ecoliwiki20070730123135.xml:
error (1102): tag uses GI for an undeclared element: mediawiki
line 166616, ecoliwiki20070730123135.xml:
error (1012): reference to undeclared entity:  
line 166616, ecoliwiki20070730123135.xml:
error (1003): entity (or its expansion) is invalid:  
line 166616, ecoliwiki20070730123135.xml:
error (1012): reference to undeclared entity:  
line 166616, ecoliwiki20070730123135.xml:
error (1003): entity (or its expansion) is invalid:  
line 184234, ecoliwiki20070730123135.xml:
error (402): EOF encountered; no doctype declaration found: mediawiki

but I'm pretty sure these are all red herrings. So... is there a
validator out there I should be using? Is there another reason why a
record might be skipped? I know there's a problem if a revision's
timestamp is earlier than that of the last existing version, but these
were all loading into empty Category pages.
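
Errors 1102 and 402 above appear to say only that the file has no
DOCTYPE, which a non-validating parser does not require. The undeclared
entity reports are different: in a file without a DTD, a reference to
an undeclared entity is a well-formedness error. One way to check for
that is to stream the file through a non-validating parser. Below is a
minimal sketch, assuming Python's standard-library
xml.etree.ElementTree; the names check_dump and local_name are
hypothetical helpers, not part of MediaWiki, and the sketch says
nothing about how importDump.php itself treats such input.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def check_dump(source):
    """Stream-parse a dump file (path or binary file object).

    Returns (titles, error): the <title> text of every <page> that
    parsed completely, and either None or a message locating the
    first well-formedness error.
    """
    titles = []
    try:
        for _event, elem in ET.iterparse(source, events=("end",)):
            if local_name(elem.tag) == "page":
                for child in elem:
                    if local_name(child.tag) == "title":
                        titles.append(child.text)
                elem.clear()  # drop page content; dumps can be large
    except ET.ParseError as err:
        line, column = err.position
        return titles, "line %d, column %d: %s" % (line, column, err)
    return titles, None


# Hypothetical usage:
#   titles, error = check_dump("ecoliwiki20070730123135.xml")
```

The returned titles can be compared against the pages that actually
made it into the wiki, to see where the file and the import diverge.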

Jim
=====================================
Jim Hu
Associate Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054



