Hi,
I was wondering about the order/sorting of revisions inside the pages-meta-history dumps, especially with respect to namespaces. Does the order of revisions in the dumps account for namespaces (e.g. are revisions from the Template namespace located towards the end of the dump?), or is the order tied to some other parameter that could influence where revisions from certain namespaces end up? I'm currently processing the March 2013 dewiki dump.
Regards, Johannes
---
Johannes Daxenberger
Doctoral Researcher | IT Administration
Ubiquitous Knowledge Processing (UKP Lab)
FB 20 Computer Science Department
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany
email: daxenberger(at)ukp.informatik.tu-darmstadt.de
phone: [+49] (0)6151 16-6227, fax: -5455, room: S2/02/B111
http://www.ukp.tu-darmstadt.de/
Web Research at TU Darmstadt (WeRC)
http://www.werc.tu-darmstadt.de/
On Mon, 29-07-2013, at 15:12 +0000, Johannes Daxenberger wrote:
> Hi,
> I was wondering about the order/sorting of revisions inside the pages-meta-history dumps, especially with respect to namespaces. Does the order of revisions in the dumps account for namespaces (e.g. are revisions from the Template namespace located towards the end of the dump?), or is the order tied to some other parameter that could influence where revisions from certain namespaces end up?
> I'm currently processing the March 2013 dewiki dump.
Pages are dumped in ascending page id order, and within each page the revisions are in ascending order by revision id or by timestamp, depending on how lucky you are. Page titles and namespaces don't play any role in this.
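If you want to check this on your own copy, something like the following rough sketch might help. It streams the XML and reports whether page ids ascend, whether revision ids ascend within each page, and how often the namespace changes between consecutive pages (a large count suggests namespaces are interleaved rather than grouped). The function name check_order and the report keys are made up for illustration, and it assumes the export has <ns> and <id> as direct children of <page> and <id> as a direct child of <revision>; it is not part of any dump tooling.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Drop the '{namespace-uri}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def check_order(source):
    """Stream a MediaWiki XML export (file name or binary file object).

    Hypothetical helper: the assumed layout is
    <page><ns/><id/><revision><id/>...</revision>...</page>.
    """
    report = {
        "pages": 0,
        "pages_ascending": True,      # page ids strictly increase
        "revisions_ascending": True,  # revision ids increase within each page
        "ns_changes": 0,              # namespace switches between neighbours
    }
    last_page_id = None
    last_ns = None
    rev_ids = []

    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element

    for event, elem in context:
        if event != "end":
            continue
        name = local(elem.tag)
        if name == "revision":
            # Only direct children are inspected, so the contributor's
            # nested <id> is never mistaken for the revision id.
            for child in elem:
                if local(child.tag) == "id":
                    rev_ids.append(int(child.text))
                    break
            elem.clear()  # drop the revision text to keep memory flat
        elif name == "page":
            page_id = ns = None
            for child in elem:
                if local(child.tag) == "id":
                    page_id = int(child.text)
                elif local(child.tag) == "ns":
                    ns = int(child.text)
            if last_page_id is not None and page_id is not None:
                if page_id <= last_page_id:
                    report["pages_ascending"] = False
            if last_ns is not None and ns != last_ns:
                report["ns_changes"] += 1
            if any(a >= b for a, b in zip(rev_ids, rev_ids[1:])):
                report["revisions_ascending"] = False
            last_page_id, last_ns, rev_ids = page_id, ns, []
            report["pages"] += 1
            root.clear()  # forget pages that have already been processed
    return report
```

For a bz2-compressed file you could pass bz2.open(path, "rb") from the standard library instead of a file name. Note that this only checks revision ids, not timestamps, so a page whose revisions are sorted by timestamp may show up as not ascending.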
Ariel
xmldatadumps-l@lists.wikimedia.org