2009/9/18 Erik Zachte erikzachte@infodisiac.com:
I think it is extremely important to keep these files for later analysis by historians and others.
Mathias Schindler also keeps an archive, or at least did until April (Berlin conference). He even bought a dedicated external drive for it.
Right now, I have a single copy of all the files from December 2007 to April 2009 on a single hard drive. I haven't done any integrity checks beyond some initial tests. The dataset has gaps from periods when the service that produces the files was not working. In some cases there is just an empty .gz file; in others no file was produced at all.
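A basic integrity pass over such an archive could look roughly like the sketch below. It assumes the files sit in one directory and end in .gz; the names check_gz and scan are made up for illustration and are not part of any existing tool. It only flags empty and unreadable archives. Finding hours for which no file was produced at all would additionally require knowing the file naming scheme, which is not assumed here.

```python
import gzip
import zlib
from pathlib import Path


def check_gz(path):
    """Classify one .gz file as 'empty', 'corrupt', or 'ok' (hypothetical helper)."""
    path = Path(path)
    # A zero-byte file is not even a valid gzip container; report it as empty.
    if path.stat().st_size == 0:
        return "empty"
    total = 0
    try:
        # Decompress the whole stream in chunks so the CRC check at the end runs.
        with gzip.open(path, "rb") as fh:
            while True:
                chunk = fh.read(1 << 20)
                if not chunk:
                    break
                total += len(chunk)
    except (OSError, EOFError, zlib.error):
        # Bad header, truncated stream, or damaged compressed data.
        return "corrupt"
    # A valid gzip container with no payload also counts as empty.
    return "ok" if total > 0 else "empty"


def scan(directory):
    """Return {file name: status} for every *.gz file directly in `directory`."""
    return {p.name: check_gz(p) for p in sorted(Path(directory).glob("*.gz"))}
```

Calling scan on the archive directory and printing every entry whose status is not "ok" gives a first list of files to re-download or mark as known gaps.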
In my spare time, I will try to copy the files from May onward onto this hard drive until it is full.
The situation is rather uncomfortable for me since I am in no way able to guarantee the integrity and safety of these files for a longer time frame. While I might continue downloading and "storing" the files, I would be extremely happy to hear that the full and unabridged set of files is available a) to anyone b) for an indefinite time span c) free of charge d) with some backup and data integrity check in place.
Speaking of wish lists, a web-accessible service to work with the data would be nice. We know for sure that journalists, and hopefully some other groups as well, like the data, the numbers and the resulting shiny graphs.
Mathias