From: A B m311man@gmail.com To: xmldatadumps-l@lists.wikimedia.org Sent: Monday, July 22, 2013 21:44 Subject: [Xmldatadumps-l] Import logging
Hi guys,
I'm trying to import enwiki-pages-logging.xml into a MySQL database and I'm having a lot of trouble converting the XML into SQL statements. I'm using importDump.php to do this conversion, but I get an error when the script tries to import a record with the following data: <logtitle>�0”1r¨©m¨¡l¨¡dev¨©-si�5õ1ha-n¨¡da-s¨±tra</logtitle>
It seems an encoding problem, but I think I have everything correct. Does anybody have a suggestion?
Hello, Mel.
I'm not sure what the problem with importDump.php is, but it does look like an encoding issue. Can you provide the full error message? A first sanity check I can think of is to verify the character set of your MySQL server. I always use (in the MySQL configuration file):
character_set_server = 'utf8'
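To see what the running server is actually using, something along these lines might help. It is only a sketch: the function name is made up for illustration, and it assumes you pass in a DB-API cursor on your MySQL connection (for example one from MySQLdb.connect(...).cursor(), not shown here).

```python
def non_utf8_charset_variables(cursor):
    """Return the character_set_* server variables that are not UTF-8.

    `cursor` is assumed to be a DB-API cursor on a MySQL connection.
    The function name is hypothetical, chosen for this example only.
    """
    cursor.execute("SHOW VARIABLES LIKE 'character_set_%'")
    suspicious = {}
    for name, value in cursor.fetchall():
        # character_set_filesystem is normally 'binary' and
        # character_sets_dir is a directory path, so skip both.
        if name in ("character_set_filesystem", "character_sets_dir"):
            continue
        if value not in ("utf8", "utf8mb3", "utf8mb4"):
            suspicious[name] = value
    return suspicious
```

An empty result would suggest the server side is fine and the problem is elsewhere (for example in the connection or the table definitions).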
As an alternative, I have been consistently importing pages-logging.xml dumps for different Wikipedia languages with the script "pages_logging.py" in my WikiDAT tool:
https://github.com/glimmerphoenix/WikiDAT/tree/master/wikidat/sources
The only additional dependencies are Python and the Python MySQLdb module (you have not said which operating system you are using). First, you need to have a MySQL database already created, as well as the logging table according to this schema:
https://github.com/glimmerphoenix/WikiDAT/blob/master/wikidat/sources/db/tab... (logging table is at the end of the file).
Usage is (from command line):
$ python pages_logging.py db_name db_user db_passw dump_file log_file
Example:
$ python pages_logging.py enwiki foouser foopassw enwiki-pages-logging.xml.gz enwiki-pages-logging.log
It may create a bit more info than you strictly need, but you should be able to import the dump file without issues. If you find any problems, just let me know and I can try to help solve them.
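If you first want to check whether the titles in the dump itself are readable, a quick test like the one below may be enough. This is not how pages_logging.py works internally, just a minimal sketch with the standard library. It assumes each record is a <logitem> element with <id> and <logtitle> children, and iter_log_titles is a name I made up for the example.

```python
import gzip
import xml.etree.ElementTree as ET


def iter_log_titles(path):
    """Yield (log id, log title) for every <logitem> in a pages-logging dump."""
    # Accept both compressed (.gz) and plain XML files.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Tags carry the export namespace, e.g. '{http://...}logitem'.
            if elem.tag.rsplit("}", 1)[-1] != "logitem":
                continue
            # Map each direct child's local tag name to its text.
            fields = {child.tag.rsplit("}", 1)[-1]: child.text for child in elem}
            yield fields.get("id"), fields.get("logtitle")
            elem.clear()  # drop the record's contents once it has been read
```

If the title that breaks importDump.php prints correctly here, the dump is probably fine and the damage happens somewhere on the way into MySQL.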
Best, Felipe.
Thanks in advance.
Mel
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l