[Toolserver-l] Static dump of German Wikipedia

Marco Schuster marco at harddisk.is-a-geek.org
Mon Sep 27 22:53:16 UTC 2010


Hi all,

just a quick status update: the dump is currently running at 2 req/s
and skips all pages that have is_redirect set. I also changed the
storage method: the new files are appended to
/mnt/user-store/dewiki_static/articles.tar, because I noticed I was
running out of inodes on the file system. Keeping everything in one
tarball prevents that, and I don't have to waste time downloading tons
of small files to my PC - just one huge tarball when it's done.
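
For anyone curious, the approach boils down to something like the
following simplified Python sketch (not the actual script; the render
URL and the 0.5s sleep are just placeholders for illustration):

    # Simplified sketch: fetch rendered articles and append them to one
    # tarball so the file system does not run out of inodes.
    import io
    import tarfile
    import time
    import urllib.request

    TAR_PATH = "/mnt/user-store/dewiki_static/articles.tar"
    RENDER_URL = "https://de.wikipedia.org/w/index.php?action=render&curid=%d"  # placeholder

    def dump(page_ids):
        # append mode: one growing, uncompressed tarball
        with tarfile.open(TAR_PATH, "a") as tar:
            for page_id in page_ids:          # redirects already filtered out
                html = urllib.request.urlopen(RENDER_URL % page_id).read()
                info = tarfile.TarInfo(name="%d.html" % page_id)
                info.size = len(html)
                tar.addfile(info, io.BytesIO(html))
                time.sleep(0.5)               # ~2 requests per second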
I also managed to get a totally stripped-down version of the Vector
skin to load an article via JSON. I won't release it yet, though - it's
a damn hack and nothing except the loading works, since I removed
every JS file. It should be presentable by Sunday.
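
The JSON loading itself is nothing special - it's just the parse API.
Roughly, the call the skin relies on looks like this (shown in Python
for illustration; the skin does the same thing client-side):

    # action=parse returns the rendered article HTML wrapped in JSON.
    import json
    import urllib.parse
    import urllib.request

    API = "https://de.wikipedia.org/w/api.php"

    def fetch_article_html(title):
        params = urllib.parse.urlencode({
            "action": "parse",
            "page": title,
            "prop": "text",
            "format": "json",
        })
        with urllib.request.urlopen(API + "?" + params) as resp:
            data = json.load(resp)
        return data["parse"]["text"]["*"]   # the article body as HTML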
The dump is currently at position 92927; with the redirects stripped
out, 53171 articles have actually been downloaded, resulting in 770MB
of uncompressed tar (I expect gzip or bz2 compression to save a lot of
space, though).
As for the redirects: how do I get the redirect target page (ideally
including the #section)?
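
The simplest thing I can think of is grabbing the raw wikitext and
parsing the #REDIRECT line myself - something like the sketch below
(a guess, untested; the regex also needs the localized #WEITERLEITUNG
keyword for dewiki):

    # Guess at resolving a redirect by parsing its raw wikitext,
    # handling "#REDIRECT [[Target#Section]]" and "#WEITERLEITUNG".
    import re
    import urllib.parse
    import urllib.request

    RAW_URL = "https://de.wikipedia.org/w/index.php?action=raw&title="
    REDIRECT_RE = re.compile(
        r"#(?:REDIRECT|WEITERLEITUNG)\s*\[\[([^\]|]+)", re.IGNORECASE)

    def redirect_target(title):
        wikitext = urllib.request.urlopen(
            RAW_URL + urllib.parse.quote(title)).read().decode("utf-8")
        m = REDIRECT_RE.search(wikitext)
        if not m:
            return None
        target = m.group(1).strip()          # e.g. "Zielseite#Abschnitt"
        page, _, section = target.partition("#")
        return page, section or None

Is there a better way, e.g. a table on the toolserver that already has
the target (and fragment) resolved?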

Marco

PS: Are there any *fundamental* differences between the Vector skin
files of the different languages apart from the localisation? Could the
localised bits perhaps be done in JavaScript instead, e.g.
$("#footer-info-lastmod").html("page was last changed at foobar")?


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
