Heya Diederik,
On Tue, Mar 29, 2011 at 7:58 PM, Diederik van Liere <dvanliere@gmail.com> wrote:
I do not think that getting the data in SQLite format is going to be very valuable. People can already get the data in MySQL databases (although that is not that easy either), so getting it in SQLite will not give additional benefits in terms of querying capabilities. I am also not sure if SQLite can handle such large databases.
True, but it's not one large SQLite database. It'll be split across multiple smaller ones, and explicit pointers will be maintained so that random access is as efficient as possible.
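To make the idea concrete, here is a minimal sketch of one way the split could look, not the actual design. It assumes a small index database that maps each page id to the shard file holding it; the file names (`index.sqlite`, `shard_0000.sqlite`), the table names (`pointers`, `pages`), the column layout, and the tiny shard size are all hypothetical and chosen only for illustration.

```python
import os
import sqlite3
import tempfile

SHARD_SIZE = 2  # pages per shard; tiny on purpose, purely illustrative


def build_shards(pages, directory, shard_size=SHARD_SIZE):
    """Write (page_id, title, text) rows into several small SQLite files
    and record which file holds each page in a separate index database."""
    index = sqlite3.connect(os.path.join(directory, "index.sqlite"))
    index.execute(
        "CREATE TABLE pointers (page_id INTEGER PRIMARY KEY, shard TEXT NOT NULL)"
    )
    for start in range(0, len(pages), shard_size):
        shard_name = "shard_%04d.sqlite" % (start // shard_size)
        chunk = pages[start:start + shard_size]
        shard = sqlite3.connect(os.path.join(directory, shard_name))
        shard.execute(
            "CREATE TABLE pages (page_id INTEGER PRIMARY KEY, title TEXT, text TEXT)"
        )
        shard.executemany("INSERT INTO pages VALUES (?, ?, ?)", chunk)
        shard.commit()
        shard.close()
        # The "explicit pointer": page id -> name of the shard that holds it.
        index.executemany(
            "INSERT INTO pointers VALUES (?, ?)",
            [(page_id, shard_name) for page_id, _, _ in chunk],
        )
    index.commit()
    index.close()


def fetch_page(directory, page_id):
    """Random access: one lookup in the index, then one in the right shard."""
    index = sqlite3.connect(os.path.join(directory, "index.sqlite"))
    row = index.execute(
        "SELECT shard FROM pointers WHERE page_id = ?", (page_id,)
    ).fetchone()
    index.close()
    if row is None:
        return None
    shard = sqlite3.connect(os.path.join(directory, row[0]))
    page = shard.execute(
        "SELECT page_id, title, text FROM pages WHERE page_id = ?", (page_id,)
    ).fetchone()
    shard.close()
    return page


if __name__ == "__main__":
    demo_pages = [(1, "Alpha", "text a"), (2, "Beta", "text b"), (3, "Gamma", "text c")]
    with tempfile.TemporaryDirectory() as workdir:
        build_shards(demo_pages, workdir)
        print(fetch_page(workdir, 3))
```

Each lookup touches only the index and a single shard, both through primary-key lookups, so no individual file has to grow very large.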
What I do think might be valuable is to work on a text format (JSON, CSV) to store the dumps. The reason is that we are looking at a NoSQL datastore solution (for example Hadoop), and storing the data in a non-XML but still text format is going to be really useful.
Interesting. How exactly would having JSON/CSV be better than XML from an import-into-NoSQL-datastore perspective?