On 8/14/07, Anthony wikimail@inbox.org wrote:
On 8/14/07, David Gerard dgerard@gmail.com wrote:
bzip2recover. Genius. I've been wanting to do something like this for a long time, and the one thing standing in my way was that I couldn't figure out how to do the random access bit.
It appears to have a few flaws, though:
http://slashdot.org/comments.pl?sid=268617&cid=20222493
Basically, splitting the database into 900 kB chunks means that the XML header for an article can end up split between two chunks, so the indexer misses a few articles.
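
To make the failure mode concrete, here is a rough sketch in Python. It is not the code from the tool being discussed. It assumes the chunks have already been decompressed to bytes and that the indexer looks for `<title>...</title>` on a single line; the function names `index_naive` and `index_with_carry` are made up for illustration. The first function scans each chunk on its own and loses any title that straddles a boundary; the second carries the unfinished tail of one chunk into the next.

```python
import re

OPEN = b"<title>"
TITLE_RE = re.compile(rb"<title>(.*?)</title>")


def index_naive(chunks):
    """Scan every chunk independently; titles split across chunks are missed."""
    titles = []
    for chunk in chunks:
        for m in TITLE_RE.finditer(chunk):
            titles.append(m.group(1).decode("utf-8"))
    return titles


def index_with_carry(chunks):
    """Prepend the unfinished tail of the previous chunk before scanning."""
    titles = []
    carry = b""
    for chunk in chunks:
        data = carry + chunk
        last_end = 0
        for m in TITLE_RE.finditer(data):
            titles.append(m.group(1).decode("utf-8"))
            last_end = m.end()
        # An opening tag with no closing tag yet: keep everything from it.
        open_pos = data.find(OPEN, last_end)
        if open_pos != -1:
            carry = data[open_pos:]
        else:
            # Otherwise keep just enough bytes to catch an opening tag
            # that is itself cut in half at the end of the chunk.
            carry = data[max(last_end, len(data) - (len(OPEN) - 1)):]
    return titles
```

The carry version only helps if the chunks are read in order, so an indexer that treats each chunk as fully independent would still need to look at the start of the following chunk.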