Thanks Gregor, and yes, you are right. I hadn't considered your suggestion before, sorry. I wrote my script to run on pages-meta-current.xml because it is much smaller and more manageable, but you are right: I can use the revision of the page I'm interested in from pages-meta-history.xml.
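For the record, here is a minimal sketch of how that could work, not the script I actually run. It streams through a history dump and keeps, for each page, the last revision made at or before a cutoff date. The names `snapshot_at` and `local` are made up for illustration, and I am assuming the usual export layout (a `page` containing a `title` and several `revision` elements, each with a `timestamp` and a `text`).

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def snapshot_at(source, cutoff):
    """Yield (title, timestamp, text) for each page's last revision <= cutoff.

    `source` is a file name or a binary file object. `cutoff` is a UTC
    string such as '2008-01-01T00:00:00Z'. Dump timestamps are assumed to
    use that same fixed format, so plain string comparison orders them.
    """
    title, best = None, None
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            ts, text = None, ""
            for child in elem:
                child_name = local(child.tag)
                if child_name == "timestamp":
                    ts = child.text
                elif child_name == "text":
                    text = child.text or ""
            # Keep the newest revision that is not later than the cutoff.
            if ts is not None and ts <= cutoff and (best is None or ts >= best[0]):
                best = (ts, text)
            elem.clear()  # drop the revision text to keep memory bounded
        elif name == "page":
            if best is not None:
                yield title, best[0], best[1]
            title, best = None, None
            elem.clear()
```

Calling it once per cutoff (for example one per year) gives the yearly time windows. Since the history dumps are distributed compressed, a file object from something like `bz2.open(path, "rb")` could be passed as `source` instead of an unpacked file. Pages that did not exist yet at the cutoff are simply skipped.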
Thanks for the suggestion!
P.
On Mon, Jul 19, 2010 at 8:59 PM, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
2010/7/19 paolo massa paolo@gnuband.org:
I wanted to conduct a longitudinal analysis, so having data going back in time up to the first day of Wikipedia would be totally awesome! Even time windows of one year would be enough. And it would be great to have them for different Wikipedias (en, de, it, ...)
Why can't you just use the latest full dump? Each dump contains the full history of all articles. Do you need to gather data on deleted articles or something?
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l