Hello wikitechies,
I'm currently visiting Indiana University, working on a Wikipedia visualization project. I have two questions, and I'd appreciate any advice you could give me.
The Info Visualization lab here has a policy against using MySQL, so I need to convert the data to PostgreSQL. Has this been done before with Wikipedia dumps? I have tried two utilities I found "out there", mysql2pgsql.perl and my2pg.pl, and neither seems able to convert Wp dumps correctly. Since MediaWiki 1.5 will be Postgres-enabled, I hope someone has already figured out a way to do this. Am I right?
It would be great to somehow overlay a map of Wikipedia articles onto a map of the world, but to do this we would need to associate articles with geographic locations (one-to-one or one-to-many, it doesn't matter). One way to do this would be via edit histories: using the IP addresses of anonymous users, we could generate a map of anonymous usage of Wp. However, there seems to be no trivial way to do this for the entire dataset; the per-IP lookup itself is easy enough, as the sketch below shows, but scaling it to every revision is the open question. If anyone has an idea, we would be more than happy to listen and share Wikipedia visualization fame with them :).
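For the lookup step, here is a minimal Python sketch. The GeoLite2 City database and the geoip2 package are my assumptions, not anything the Wikipedia dumps provide; any IP-geolocation database would do.

    # Minimal sketch of the IP -> (lat, lon) step. The GeoLite2-City.mmdb
    # path and the geoip2 package are illustrative assumptions.
    import geoip2.database
    import geoip2.errors

    reader = geoip2.database.Reader('GeoLite2-City.mmdb')  # hypothetical path

    def locate(ip):
        """Return (latitude, longitude) for an anonymous editor's IP, or None."""
        try:
            resp = reader.city(ip)
        except geoip2.errors.AddressNotFoundError:
            return None
        return (resp.location.latitude, resp.location.longitude)

    # Anonymous edits are attributed to an IP instead of a username, so
    # feeding those contributor fields through locate() yields one
    # (article, lat, lon) point per anonymous edit.
    print(locate('207.142.131.235'))  # example input

Again, the hard part is not this lookup but running it over every revision in the full history dump.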
Thank you! Miran
Miran Bozicevic wrote:
> The Info Visualization lab here has a policy against using MySQL, so I need to convert the data to PostgreSQL. Has this been done before with Wikipedia dumps?
Try the current XML-format dumps; you can then import them into whatever format you please. :)
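For instance, a minimal Python sketch of such an import, assuming psycopg2 and a pages-articles XML dump; the wp_page table and filename here are just illustrative, not the MediaWiki schema:

    # Stream a MediaWiki XML dump into PostgreSQL. psycopg2, the dump
    # filename, and the wp_page table are illustrative assumptions.
    import xml.etree.ElementTree as ET
    import psycopg2

    def local(tag):
        """Strip the XML namespace prefix the export format carries."""
        return tag.rsplit('}', 1)[-1]

    conn = psycopg2.connect(dbname='wikipedia')  # hypothetical database
    cur = conn.cursor()
    cur.execute("CREATE TABLE wp_page (title TEXT, text TEXT)")

    title, text = None, None
    # iterparse streams the file, so multi-gigabyte dumps never have to
    # fit in memory at once.
    for event, elem in ET.iterparse('pages-articles.xml', events=('end',)):
        tag = local(elem.tag)
        if tag == 'title':
            title = elem.text
        elif tag == 'text':
            text = elem.text
        elif tag == 'page':
            cur.execute("INSERT INTO wp_page (title, text) VALUES (%s, %s)",
                        (title, text or ''))
            elem.clear()  # free the subtree we have already consumed

    conn.commit()
    conn.close()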
> I have tried two utilities I found "out there", mysql2pgsql.perl and my2pg.pl, and neither seems able to convert Wp dumps correctly. Since MediaWiki 1.5 will be Postgres-enabled, I hope someone has already figured out a way to do this. Am I right?
MediaWiki 1.5 will not work with PostgreSQL, as no one has maintained the PostgreSQL support code. A future version might work again some day, should anyone be interested in supporting it.
-- brion vibber (brion @ pobox.com)