Hey,
Today we did database dumps, but since we're trying to offload our file server albert as much as possible, I started an experimental dumps service.
http://dumps.wikimedia.org/ has the newest dumps now. As soon as we make the view more friendly and functional, it might be the one replacing http://download.wiki...
Cheers, Domas
On Apr 6, 2005 11:45 PM, Domas Mituzas Domas.Mituzas@microlink.lt wrote:
Today we did database dumps, but since we're trying to offload our file server albert as much as possible, I started an experimental dumps service.
http://dumps.wikimedia.org/ has the newest dumps now. As soon as we make the view more friendly and functional, it might be the one replacing http://download.wiki...
I took a brief look at the cur en.wikipedia dump, and it would appear that it's uncompressed, unlike the old versions which were bzip2 compressed. As the latter format seems to be around 70% of the size of the uncompressed, shouldn't the new download page offer compressed versions? Or do you plan to use rsync as the preferred download strategy?
John Fader wrote:
I took a brief look at the cur en.wikipedia dump, and it would appear that it's uncompressed, unlike the old versions which were bzip2 compressed. As the latter format seems to be around 70% of the size of the uncompressed, shouldn't the new download page offer compressed versions?
The new dumps are compressed with gzip as it's faster than bzip2 (though bzip2 compresses better).
Your browser might be uncompressing it for you (naughty browser!)
-- brion vibber (brion @ pobox.com)
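The speed/ratio trade-off Brion mentions is easy to measure with the Python standard library; a quick sketch (the sample data below is invented, not taken from an actual dump):

```python
import bz2
import gzip

# invented, text-like sample; real dump files will behave differently
sample = b"INSERT INTO cur VALUES (1, 'Article text goes here...');\n" * 5000

gz = gzip.compress(sample)
bz = bz2.compress(sample)

print(f"original: {len(sample)}  gzip: {len(gz)}  bzip2: {len(bz)}")

# both formats round-trip losslessly
assert gzip.decompress(gz) == sample
assert bz2.decompress(bz) == sample
```

On your own machine, timing the two `compress` calls will also show gzip's speed advantage.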
On Apr 7, 2005 12:53 AM, Brion Vibber brion@pobox.com wrote:
The new dumps are compressed with gzip as it's faster than bzip2 (though bzip2 compresses better).
Ah-ha. Thanks.
Your browser might be uncompressing it for you (naughty browser!)
Hmm, the content-type setting is somewhat suboptimal -
dumps.wikimedia.org sends:
  Content-Type: text/plain; charset=iso-8859-1
  Content-Encoding: gzip
Whereas most servers sending .gz files send:
  Content-Type: application/x-gzip
  Content-Encoding: x-gzip
A brief check shows that Firefox 1.02, Opera 7.52u2, and XP's latest IE6 all show the ungzipped content inline when sent the text/plain content-type. I guess a tweak to httpd.conf is in order.
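If the server were Apache, that httpd.conf tweak might look like the following sketch (an assumption on my part; the box apparently runs thttpd, which is configured differently), serving .gz dumps as opaque archives rather than text/plain with Content-Encoding: gzip:

```
# sketch for Apache's httpd.conf: serve .gz dumps as archives,
# not as text/plain with a gzip Content-Encoding
AddType application/x-gzip .gz
RemoveEncoding .gz
```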
John Fader wrote:
On Apr 7, 2005 12:53 AM, Brion Vibber brion@pobox.com wrote:
Your browser might be uncompressing it for you (naughty browser!)
Hmm, the content-type setting is somewhat suboptimal -
dumps.wikimedia.org sends:
  Content-Type: text/plain; charset=iso-8859-1
  Content-Encoding: gzip
Bleaaaaaaah. Ok, that's wrong. In that case it's correct for your browser to decompress the stuff, but, um, it shouldn't be doing that. :D
Domas seems to be experimenting with thttpd for that server, so I'll let him reconfigure it when he's awake again.
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
The new dumps are compressed with gzip as it's faster than bzip2 (though bzip2 compresses better).
You might already be aware of this, but to keep misinformation from spreading, please allow me to point out that bzip2 does *not* generally compress better. It compresses natural-language text and source code better, but not, for example, the SQL dump of the links table, which contains mostly numbers.
Timwi
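Timwi's point is easy to check on synthetic data; here's a sketch using invented numeric rows as a stand-in for a links-table dump (not real dump data, and which format wins will vary with the actual content):

```python
import bz2
import gzip
import random

random.seed(42)

# invented stand-in for a links-table SQL dump: mostly numeric tuples
rows = ",".join(
    f"({random.randrange(10**7)},{random.randrange(10**7)})"
    for _ in range(50000)
).encode()

gz = gzip.compress(rows)
bz = bz2.compress(rows)

# print rather than assume a winner; the outcome depends on the data
print(f"original: {len(rows)}  gzip: {len(gz)}  bzip2: {len(bz)}")
```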
You might already be aware of this, but for the purposes of preventing misinformation from spreading, please allow me to point out that bzip2 does *not* generally compress better -- it only compresses natural-language text and source code better
...which is the stuff people generally want to compress. However, I use gzip since I don't think the CPU trade-off is worth the small size reduction.
On Apr 6, 2005 11:45 PM, Domas Mituzas Domas.Mituzas@microlink.lt wrote:
Hey,
Today we did database dumps, but since we're trying to offload our file server albert as much as possible, I started an experimental dumps service.
Also, the auxtables class is entirely unreadable (at least on my Firefox install). Rather than
.auxtables { font-size: 6pt; }
can I suggest
.auxtables { font-size: small; }
(sorry to be such a nit-picker)

John Fader
Hello!
Will you put dumps of Wikisource and Wikispecies there?
Domas Mituzas wrote:
Hey,
Today we did database dumps, but since we're trying to offload our file server albert as much as possible, I started an experimental dumps service.
http://dumps.wikimedia.org/ has the newest dumps now. As soon as we make the view more friendly and functional, it might be the one replacing http://download.wiki...
Cheers, Domas
Please add a direct link to the download area for each one, e.g. http://dumps.wikimedia.org/special/commons/ for Commons.Special
Domas Mituzas wrote:
Today we did database dumps, but since we're trying to offload our file server albert as much as possible, I started an experimental dumps service.
That looks really good, so thanks for your efforts.
Are there any ideas for some kind of incremental dump? Or are bandwidth and disk storage not a problem compared to the more complex implementation of incremental dumps? And are there any statistics on how much bandwidth is used by downloads and database dumps (i.e. not normal visitor traffic)?
Ciao, Michael.