On Fri, Oct 10, 2008 at 10:09 PM, Nicolas Dumazet <nicdumz@gmail.com> wrote:
Hey!
May I mention that the scripts generating the dumps and handling the scheduling are written in Python and are available on Wikimedia SVN? [1]
Well, you can, but I already knew that.
If you have some improvements to suggest on the task scheduling, I
guess that patches are welcome :)
Well, I don't know Python, and I'd advocate rewriting the dump system from scratch anyway, but 1) I'd really need access to the SQL server in order to do that; and 2) if I put that much work into something, I'd need some sort of financial reward. Hiring me and/or paying for my family's health care is welcome as well.
I'm actually working on redoing the full history bz2 dump as a bunch of smaller bz2 files (of 900K or less uncompressed text each) so they can be accessed randomly without losing the compression. But it's going to take a while for me to complete it, since I don't have a very fast machine or hard drives, and I don't have a lot of time to spend on it, since working on it has little potential to feed, clothe, or shelter my family. And when I finish it, I'm probably not going to give it away for free, on the off chance that maybe I can sell it to buy my daughter diapers or buy my son milk or something.
I'm a terrible person, aren't I?
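
For anyone curious what the chunked-bz2 idea above looks like in practice, here is a minimal sketch in Python (the language of the existing dump scripts). It is not the actual tool described in this message. It assumes one variation: instead of separate small files, it writes independent bz2 streams back to back into one file and keeps a small index alongside. The names write_chunked, read_range, and CHUNK_SIZE are hypothetical, and offsets are assumed to be non-negative byte positions in the uncompressed text.

```python
import bisect
import bz2
import io

# At most 900K of uncompressed text per chunk (the figure from the message above).
CHUNK_SIZE = 900_000


def write_chunked(data, out, chunk_size=CHUNK_SIZE):
    """Compress `data` as a series of independent bz2 streams written to `out`.

    Returns an index: a list of
    (uncompressed_offset, compressed_offset, compressed_length) tuples.
    """
    index = []
    compressed_offset = 0
    for start in range(0, len(data), chunk_size):
        # Each chunk is a complete bz2 stream, decodable on its own.
        block = bz2.compress(data[start:start + chunk_size])
        out.write(block)
        index.append((start, compressed_offset, len(block)))
        compressed_offset += len(block)
    return index


def read_range(src, index, offset, length):
    """Return up to `length` uncompressed bytes starting at `offset`.

    Only the chunks that overlap the requested range are decompressed.
    """
    if not index:
        return b""
    starts = [entry[0] for entry in index]
    # Last chunk whose uncompressed start is <= offset.
    i = bisect.bisect_right(starts, offset) - 1
    result = bytearray()
    while i < len(index) and len(result) < length:
        start, comp_off, comp_len = index[i]
        src.seek(comp_off)
        block = bz2.decompress(src.read(comp_len))
        skip = max(0, offset - start)  # non-zero only in the first chunk
        result += block[skip:skip + length - len(result)]
        i += 1
    return bytes(result)


if __name__ == "__main__":
    text = b"".join(b"revision %d\n" % n for n in range(200_000))
    buf = io.BytesIO()
    idx = write_chunked(text, buf)
    print(len(idx), "chunks,", len(buf.getvalue()), "compressed bytes")
    print(read_range(buf, idx, 1_000_000, 40))
```

The cost of this layout is a somewhat worse compression ratio than a single stream, since each chunk starts with no shared context. In exchange, a read touches only the chunks it needs, and Python's bz2.decompress still decodes the concatenated streams in one call, so the file as a whole stays readable as one compressed unit.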