Hi!
I'd like to do some statistics based on the database dumps, so I need a lot more disk quota (50GB would be fine, or at least 2GB for the cur-dumps :-). It's better to mirror the dump files in only one place, so I created the directory
/u01/u/voj/dumps
that is also accessible via
http://tools.wikimedia.de/~voj/dumps/
Unfortunately I have not been able to add a header and footer to the directory listing. I made .htaccess and readme readable and writable for the whole users group, so maybe someone can help? We can also move the dumps directory to another location.
Since analysing the dumps can generate a huge amount of data, I wrote a lot of scripts that pipe the data from one task to another. Here is a first example:
/u01/u/voj/dumps/tools/catlinks.pl wp eo 20051009 | wc -l
By the way, it's much faster to bulk import tab-separated data into a MySQL database than to use INSERT statements.
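To illustrate the bulk import, here is a minimal Python sketch of how tab-separated rows could be prepared for MySQL's LOAD DATA statement. It assumes MySQL's default text format (fields separated by tabs, lines ended by newlines, backslash as escape character, \N for NULL). The function names, the sample rows, the file name catlinks.tsv and the table name catlinks are made up for the example and are not taken from my scripts.

```python
import sys


def escape_field(value):
    """Escape one value for MySQL's default LOAD DATA text format."""
    if value is None:
        return r"\N"  # LOAD DATA reads \N as NULL
    text = str(value)
    # Escape the backslash first so the escapes added below are not doubled.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n"))


def to_tsv_line(row):
    """Join the escaped fields of one row with tabs and end the line."""
    return "\t".join(escape_field(value) for value in row) + "\n"


def load_statement(path, table):
    """Build the LOAD DATA statement for a file written with to_tsv_line.

    path and table are hypothetical; pass only trusted values, since
    they are placed into the SQL text without any quoting.
    """
    return f"LOAD DATA LOCAL INFILE '{path}' INTO TABLE {table}"


if __name__ == "__main__":
    # Made-up sample rows; real rows would come from a dump script.
    rows = [("Esperanto", "Lingvoj"), ("A\tB", None)]
    for row in rows:
        sys.stdout.write(to_tsv_line(row))
    # The statement goes to stderr so stdout stays a clean data stream.
    print(load_statement("catlinks.tsv", "catlinks"), file=sys.stderr)
```

The output can be redirected to a file and loaded with one statement, which avoids parsing and executing a separate INSERT for every row. Note that LOCAL has to be allowed by both the MySQL client and the server.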
Greetings, Jakob