During a massive dump import process (with mwdumper) covering the top 10 Wikipedias, I've run into some odd errors. For example, this one came up while restoring the latest itwiki dump (2006-10-21, reported as OK):

    .....
    478,834 pages (154.138/sec), 4,603,000 revs (1,481.718/sec)
    ERROR 1062 at line 2955: Duplicate entry '103-Matematica' for key 2
    479,091 pages (153.95/sec), 4,604,000 revs (1,479.44/sec)
    .....

From that point on, the import fails and no more data gets into the MySQL database.
I've written a small check to test whether the total number of pages and revisions matches the figures given on the web page, but sometimes I don't know how to handle these errors other than by downloading the previous dump, and then the one before that, until one of them imports cleanly (because all the dumps are reported as OK).
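For illustration, a check of this kind could look roughly like the Python sketch below. It is not the script I actually use: the function names, the 0.99 threshold and the idea of reading the saved console output line by line are all assumptions. It only relies on the two line formats visible in the output quoted above.

```python
import re

# Progress lines as they appear in the output quoted above, e.g.
# "478,834 pages (154.138/sec), 4,603,000 revs (1,481.718/sec)"
PROGRESS_RE = re.compile(r"([\d,]+) pages \([^)]*\), ([\d,]+) revs")
# Error lines as they appear above, e.g.
# "ERROR 1062 at line 2955: Duplicate entry '103-Matematica' for key 2"
ERROR_RE = re.compile(r"ERROR (\d+) at line (\d+): (.*)")


def scan_import_log(lines):
    """Return (last progress seen before the first error, first error).

    Progress is a (pages, revs) tuple; the error is a
    (code, line_number, message) tuple. Either may be None.
    """
    last_progress = None
    for line in lines:
        line = line.strip()
        match = ERROR_RE.search(line)
        if match:
            error = (int(match.group(1)), int(match.group(2)), match.group(3))
            return last_progress, error
        match = PROGRESS_RE.search(line)
        if match:
            pages = int(match.group(1).replace(",", ""))
            revs = int(match.group(2).replace(",", ""))
            last_progress = (pages, revs)
    return last_progress, None


def import_looks_complete(imported, expected, error, min_ratio=0.99):
    """Hypothetical acceptance rule: no error was logged and the imported
    count reaches at least min_ratio of the expected count."""
    return error is None and imported >= expected * min_ratio
```

The threshold is there because the figures on the web page may not match the dump exactly. The point of scanning for the first error is that a page count alone can look plausible even when the import stopped partway.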
If you're not very careful, you'd think the import process had done its job, because approx. 85% of the pages were in the DB... Is there any way to solve these errors, which aren't configuration-related?
BTW, I'm not blaming Brion at all... He does his best with the dump process and mwdumper.
Thanks, all the best.
Felipe.