Cormac Lawler wrote:
Jakob wrote:
The only solution is to share your code and data and to publish results frequently. That's how research works, isn't it? I'm very interested in having a dedicated server for Wikimetrics, but someone has to administer it (getting the hardware is not such a problem). For instance, I could parse the version history dump to extract only article, user and timestamp, so other people could analyse which articles are edited on which days, or vice versa, but I just don't have a server to handle gigabytes of data. Up to now I have only managed to set up a data warehouse for Personendaten (http://wdw.sieheauch.de/), but - like most of what's already done - it is mostly undocumented :-(
It'd be very interesting to see details of your data and methodology - I'm sure that would be of great value as we move Wikipedia research forward. But not just as in a paper, where you normally say "I retrieved this data from an SQL dump of the database" and then do things with the data; what I am looking for, to repeat, is *how you actually do this* from another researcher's point of view.
First, I had to rewrite http://meta.wikimedia.org/wiki/Help:Export
Currently I parse the XML export with Joost, but this won't help you much yet: http://meta.wikimedia.org/wiki/User:Nichtich/Process_MediaWiki_XML_export
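
The basic idea is simple though, whatever the tool: one streaming pass over the dump, keeping only title, contributor and timestamp per revision. Here is a rough sketch in Python - just an illustration, not my actual Joost/STX setup, and the namespace URI is an assumption that has to match the schema version of your dump:

    import sys
    import xml.etree.ElementTree as ET

    # Namespace of the dump schema -- an assumption; check the <mediawiki>
    # root element of your export and adjust the version accordingly.
    NS = '{http://www.mediawiki.org/xml/export-0.3/}'

    def extract(dump, out):
        """Stream through a MediaWiki XML export and write one
        tab-separated line per revision: title, user, timestamp."""
        title = None
        for event, elem in ET.iterparse(dump, events=('end',)):
            if elem.tag == NS + 'title':
                title = elem.text
            elif elem.tag == NS + 'revision':
                timestamp = elem.findtext(NS + 'timestamp', '')
                # registered editors have a <username>, anonymous ones an <ip>
                user = (elem.findtext(NS + 'contributor/' + NS + 'username')
                        or elem.findtext(NS + 'contributor/' + NS + 'ip')
                        or '')
                out.write('%s\t%s\t%s\n' % (title, user, timestamp))
                elem.clear()  # discard revision text to keep memory flat
            elif elem.tag == NS + 'page':
                elem.clear()

    if __name__ == '__main__':
        extract(sys.stdin.buffer, sys.stdout)

You could pipe a decompressed dump through it, e.g. "bzcat pages-meta-history.xml.bz2 | python extract.py > revisions.tsv"; because the parser never holds more than one revision in memory, this should cope with multi-gigabyte dumps.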
A physical workshop would be much more fruitful, I think, because writing HOWTOs is a lot of work :-(
Greetings, Jakob