During a massive dump import process (with mwdumper) involving the top 10 Wikipedias, I've found some odd errors. For example, this one appeared while importing the latest itwiki dump (2006-10-21, reported as OK):

    .....
    478,834 pages (154.138/sec), 4,603,000 revs (1,481.718/sec)
    ERROR 1062 at line 2955: Duplicate entry '103-Matematica' for key 2
    479,091 pages (153.95/sec), 4,604,000 revs (1,479.44/sec)
    .....

From that point on, the import fails and no more data gets into the MySQL database.
I've written a small check to test whether the total number of pages and revisions matches the figures published on the web page, but sometimes I don't know how to handle these errors other than by downloading progressively older dumps until one of them imports cleanly (because all the dumps are reported as OK).
If you're not very careful you'd think the import process did its job, because approx. 85% of the pages were in the DB... Is there any way to solve these errors, which are not configuration-related?
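To illustrate the kind of check I mean, here is a minimal sketch in Python. It is not part of mwdumper. It assumes the import output has been saved to a log with one message per line, in the same format as the excerpt above. The function names and the expected page count are hypothetical; the expected figure would have to come from the site's published statistics.

```python
import re

# Progress line format, as seen in the excerpt above:
#   478,834 pages (154.138/sec), 4,603,000 revs (1,481.718/sec)
PROGRESS_RE = re.compile(
    r"([\d,]+) pages \([\d,.]+/sec\), ([\d,]+) revs \([\d,.]+/sec\)"
)
# Error line format, as seen in the excerpt above:
#   ERROR 1062 at line 2955: Duplicate entry '103-Matematica' for key 2
ERROR_RE = re.compile(r"ERROR (\d+) at line (\d+): (.*)")


def scan_import_log(lines):
    """Return (first_error, pages, revs).

    first_error is (code, sql_line, message), or None if no error was seen.
    pages and revs are the last counts reported BEFORE the first error,
    since progress lines printed after it no longer reflect stored data.
    """
    pages = revs = 0
    for line in lines:
        err = ERROR_RE.search(line)
        if err:
            first_error = (int(err.group(1)), int(err.group(2)),
                           err.group(3).strip())
            return first_error, pages, revs
        progress = PROGRESS_RE.search(line)
        if progress:
            pages = int(progress.group(1).replace(",", ""))
            revs = int(progress.group(2).replace(",", ""))
    return None, pages, revs


def import_looks_complete(first_error, pages, expected_pages):
    """Hypothetical acceptance rule: no error and at least the expected pages."""
    return first_error is None and pages >= expected_pages
```

Stopping at the first ERROR line is deliberate: the progress counters keep increasing afterwards even though nothing more is stored, which is exactly what makes a failed import look successful.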
BTW, I'm not blaming Brion at all... He does his best with the dump process and mwdumper.
Thanks, all the best.
Felipe.
I apologise for replying to myself...
I've seen some previous messages on the list pointing out the same problem. Some of them used import.php, which is not my case.
Has anyone else noticed these duplicate key entry problems with mwdumper?
Of course, I've correctly initialized the database tables with the latest tables.sql.
We use MySQL 4.0 and no InnoDB tables.
All the best.
Felipe.
wikitech-l@lists.wikimedia.org