I ran into a weird dump problem yesterday, which has me wondering whether something is wrong with the XML I generate for upload. Here's what happens. I have a script that builds wiki pages from an external source and embeds them in XML for upload via importDump.php. The script can be toggled to generate either a single page or a batch of them. In batch mode, some pages fail to load, even though the same pages load just fine when they are specified individually. importDump.php does NOT crash during the import.
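To make the embedding step concrete, here is a minimal sketch in Python. It is not the actual script: the function name `page_xml` is hypothetical, and the element names are assumed to follow the usual MediaWiki export layout (`page`, `title`, `revision`, `timestamp`, `text`). It shows title and text being XML-escaped before they are wrapped in a page record, so that a stray `&` or `<` in the external source cannot break the dump.

```python
from xml.sax.saxutils import escape


def page_xml(title, text, timestamp):
    """Return one <page> record with title and text XML-escaped.

    Hypothetical helper; element names are assumed from the MediaWiki
    export layout, not taken from the original script.
    """
    return (
        "  <page>\n"
        f"    <title>{escape(title)}</title>\n"
        "    <revision>\n"
        f"      <timestamp>{escape(timestamp)}</timestamp>\n"
        f'      <text xml:space="preserve">{escape(text)}</text>\n'
        "    </revision>\n"
        "  </page>\n"
    )


if __name__ == "__main__":
    # A title and body containing characters that must be escaped.
    print(page_xml("A & B", "x < y", "2007-07-30T12:31:35Z"))
```

`escape` turns `&`, `<`, and `>` into `&amp;`, `&lt;`, and `&gt;`. An HTML entity such as `&nbsp;` in the source text comes out as `&amp;nbsp;`, which an XML parser reads back as the literal characters.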
I suspect that there is something wrong with the upstream items, but I can't find it. The Brown Univ XML validator complains about the following:
line 3, ecoliwiki20070730123135.xml: error (1102): tag uses GI for an undeclared element: mediawiki
line 166616, ecoliwiki20070730123135.xml: error (1012): reference to undeclared entity:
line 166616, ecoliwiki20070730123135.xml: error (1003): entity (or its expansion) is invalid:
line 166616, ecoliwiki20070730123135.xml: error (1012): reference to undeclared entity:
line 166616, ecoliwiki20070730123135.xml: error (1003): entity (or its expansion) is invalid:
line 184234, ecoliwiki20070730123135.xml: error (402): EOF encountered; no doctype declaration found: mediawiki
but I'm pretty sure these are all red herrings. So...is there a validator out there I should be using? Is there another reason why a record might be skipped? I know that if the timestamps are earlier than the last version there's a problem, but these were all loading into empty Category pages.
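On the validator question, one cheap check that does not need a DTD is a plain well-formedness pass with a non-validating parser. Below is a minimal sketch, assuming Python and its bundled expat parser; the function name `first_wellformedness_error` is hypothetical. It reports only the first problem it hits, with its line and column. The "undeclared element" and "no doctype declaration" complaints are probably what a DTD-validating tool reports when the file has no DOCTYPE, and they would not appear here. An entity reference with no declaration, such as `&nbsp;`, or a bare `&` would still be flagged.

```python
import xml.parsers.expat


def first_wellformedness_error(path):
    """Parse the file at `path` without a DTD.

    Returns None if the file is well-formed, otherwise a tuple
    (line, column, message) for the first error expat reports.
    Hypothetical helper, not part of MediaWiki.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        with open(path, "rb") as fh:
            parser.ParseFile(fh)
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset, message
    return None


if __name__ == "__main__":
    import sys

    # Usage: python check_dump.py ecoliwiki20070730123135.xml
    result = first_wellformedness_error(sys.argv[1])
    if result is None:
        print("well-formed")
    else:
        line, column, message = result
        print(f"line {line}, column {column}: {message}")
```

If this flags line 166616, the text on that line is worth inspecting by hand. A clean pass would suggest the skipped pages have a cause other than malformed XML.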
Jim

=====================================
Jim Hu
Associate Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054