Hi all,
Unfortunately, xml2sql.exe writes extended INSERTs (n rows per statement)
when producing MySQL's INSERT format. The problem is that when there are
HUGE pages in the XML dump (like some pages in Wikipedia), one has to
reduce the number of rows per "batch" when uploading with BigDump.php,
otherwise the upload will most likely crash.
But reducing the number of rows per batch makes the upload very slow...
If the output could (optionally) be written WITHOUT extended inserts, I
could more easily[*] fix the INSERTs on my own (repacking the SQL by
total *size* instead of "extending" the inserts by a fixed number of
rows).
Suggested solution:
#1: Option: --noextended
or, for example, make extended inserts by size (KB) instead:
#2: Option: --extendedbykb 512
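To make clear what I mean by "extended by size", here is a rough Python
sketch of the packing step. It is not xml2sql code: the function name
pack_by_size and its parameters are made up for illustration, and it
assumes the rows are already available as escaped value tuples like
"(1,'a')".

```python
def pack_by_size(table, rows, max_bytes):
    """Yield extended INSERT statements of at most max_bytes each.

    `rows` is an iterable of already-escaped value tuples, e.g. "(1,'a')".
    A single row that is larger than max_bytes still gets a statement of
    its own, so the limit can only be exceeded by one oversized row.
    """
    prefix = "INSERT INTO `%s` VALUES " % table
    # Fixed cost of every statement: the prefix plus the closing ';'.
    base = len(prefix.encode("utf-8")) + 1
    batch, size = [], base
    for row in rows:
        # +1 for the comma separator (slightly conservative for the first row).
        row_len = len(row.encode("utf-8")) + 1
        if batch and size + row_len > max_bytes:
            yield prefix + ",".join(batch) + ";"
            batch, size = [], base
        batch.append(row)
        size += row_len
    if batch:
        yield prefix + ",".join(batch) + ";"
```

With something like --extendedbykb 512 the limit would be
max_bytes = 512 * 1024, so many small rows share one statement while a
huge page ends up alone in its own INSERT.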
But the developer, Tietew
(http://meta.wikimedia.org/wiki/User_talk:Tietew#Hyper_Estraider_extension),
doesn't seem to be very active. Can the source code be modified, and if
so, who is an expert in ANSI C...? Or perhaps someone knows how to get in
contact with Tietew?
Regards,
// Rolf Lampa
[*] - more easily: To "tear apart" the extended INSERTs into separate
rows again (so that they can be packed again later, by total size
instead), I need three different regexes to find where to split the
INSERTs for page.sql, revision.sql and text.sql respectively. But
sometimes the text table gets messed up anyway (this tends to happen
when splitting articles that contain things like SQL code examples
and the like...).
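
A possible alternative to the regexes, as an untested-on-real-dumps
sketch in Python: walk through the VALUES list character by character
and keep track of whether we are inside a quoted string, so that a "),("
inside article text is not mistaken for a row boundary. The name
split_rows is made up, and the sketch assumes MySQL-style string
literals (single quotes, backslash escapes, or doubled '' quotes).

```python
def split_rows(values):
    """Split the text after VALUES, e.g. "(1,'a'),(2,'b');", into row tuples.

    Parentheses and commas inside quoted strings are ignored, so the same
    function can be used for page.sql, revision.sql and text.sql.
    """
    rows = []
    start = None        # index of the '(' that opened the current row
    depth = 0           # parenthesis nesting outside of strings
    in_string = False
    i = 0
    while i < len(values):
        ch = values[i]
        if in_string:
            if ch == "\\":
                i += 1              # skip the escaped character
            elif ch == "'":
                in_string = False   # a doubled '' simply reopens the string
        elif ch == "'":
            in_string = True
        elif ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                rows.append(values[start:i + 1])
        i += 1
    return rows
```

Since this reads the whole VALUES list as one string, a very large
statement would still have to fit in memory; a streaming version would
need the same state machine applied chunk by chunk.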