Hi all,
Congrats Ariel! :) The sum of the pages-meta-history files for the last two
enwiki dumps is 342.7GB for the 20110115 dump and 353.5GB for the
20110317 dump, so the overall dump size grew by 10.8GB (about 3%) in 2
months. Seven of the individually numbered pages-meta-history files
shrank while eight grew from 20110115 to
20110317. By far the biggest decrease was the
pages-meta-history10.xml.bz2 file, which dropped from 18.7GB down to
1.9GB. I think there are probably missing revisions in that page ID
range.
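
For anyone who wants to redo the arithmetic, here is a minimal sketch using only the sizes quoted above. The variable names are just illustrative, and the percentages are derived from the rounded GB figures, so treat them as approximate.

```python
# Sizes in GB, as quoted in this message.
total_20110115 = 342.7
total_20110317 = 353.5

# Overall growth between the two dumps.
growth_gb = total_20110317 - total_20110115
growth_pct = 100 * growth_gb / total_20110115
print(f"overall growth: {growth_gb:.1f} GB ({growth_pct:.1f}%)")

# pages-meta-history10.xml.bz2 went from 18.7GB down to 1.9GB.
hist10_before, hist10_after = 18.7, 1.9
drop_gb = hist10_before - hist10_after
drop_pct = 100 * drop_gb / hist10_before
print(f"history10 drop: {drop_gb:.1f} GB ({drop_pct:.1f}%)")
```

A drop of that size in one file, while the total still grew, is what makes the missing-revisions guess plausible; it is not proof.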
Here are some historical dump sizes for comparison, to show the growth of these files:
enwiki-20060816-pages-meta-history.xml.7z 5.08GB
enwiki-20070402-pages-meta-history.xml.7z 11.3GB (229 days since previous dump)
enwiki-20080103-pages-meta-history.xml.7z 17.2GB (276 days since previous dump)
enwiki-20100130-pages-meta-history.xml.7z 31.8GB (758 days since previous dump)
enwiki-20110115-pages-meta-history[1-15].xml.7z 38.0GB (350 days since previous dump)
enwiki-20110317-pages-meta-history[1-15].xml.7z (7z compression in progress)
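
As a rough cross-check on the sizes listed above, here is a small sketch that recomputes the day gaps between dumps and fits a straight line through size versus date. The ordinary least-squares fit and all variable names are my own choices for illustration, and it treats the single-file and split-file 7z totals as directly comparable, which is an assumption. The dump still being compressed is left out because it has no size yet.

```python
from datetime import date

# (dump date, 7z size in GB) taken from the list above.
dumps = [
    (date(2006, 8, 16), 5.08),
    (date(2007, 4, 2), 11.3),
    (date(2008, 1, 3), 17.2),
    (date(2010, 1, 30), 31.8),
    (date(2011, 1, 15), 38.0),
]

# Days between consecutive dumps, to compare with the figures in parentheses.
gaps = [(b[0] - a[0]).days for a, b in zip(dumps, dumps[1:])]
print("days between dumps:", gaps)

# x = days since the first dump, y = size in GB.
first = dumps[0][0]
xs = [(d - first).days for d, _ in dumps]
ys = [size for _, size in dumps]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Ordinary least-squares line: size = intercept + slope * days.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
# Coefficient of determination for a simple linear fit.
r2 = (sxy * sxy) / (sxx * syy)

print(f"slope: {slope:.4f} GB/day (about {slope * 365:.1f} GB/year)")
print(f"intercept: {intercept:.2f} GB, R^2: {r2:.4f}")
```

An R^2 close to 1 would support a straight-line reading of the growth; with only five points it is a sanity check rather than a forecast.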
Here's a graph of this data; the dump file size growth looks pretty linear:
(chart x-axis starts from 20060816 dump and ends at 20110115 dump)
"http://nekrom.com/wikipedia/enwiki%20history%20dump%20file%20size%20over%20time.png"
cheers,
Jamie
----- Original Message -----
From: "Ariel T. Glenn" <ariel@wikimedia.org>
Date: Tuesday, March 29, 2011 3:24 pm
Subject: [Xmldatadumps-l] March 17 en wikipedia history bz2 files ready
To: xmldatadumps-l@lists.wikimedia.org
Cc: wikitech-l@lists.wikimedia.org
> Well, that used up all my good luck for the year, but the bz2s are ready
> for download. The md5sums are still calculating, give them a couple
> hours to show up. If all continues to go well we'll have the 7z files
> in 4-5 days.
>
> As before I do not plan to provide a single 350gb file of the bz2, nor a
> single 7z file for download.
>
> Happy trails,
>
> Ariel