As a regular user of dump files I would not want a "fancy" file
format with indexes stored as trees etc.
I parse all the dump files (both for SQL tables and the XML files) with
a one-pass parser which inserts the data I want (which sometimes is only
a small fraction of the total amount of data in the file) into my local
database. I normally never store uncompressed dump files, but pipe
the uncompressed data directly from bunzip2 or gunzip to my parser to
save disk space. It is therefore important to me that the format is
simple enough for a one-pass parser.
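The one-pass approach described above can be sketched roughly as follows. This is only an illustration, not my actual parser: it uses Python's iterparse to process elements as they stream past, and the `<page>`/`<title>` element names are assumed from the MediaWiki XML export format. In real use the input would be `sys.stdin.buffer`, fed by something like `bzcat dump.xml.bz2 | python parser.py`, rather than the inline sample used here.

```python
# Minimal sketch of a one-pass streaming parser for an XML dump.
# A small in-memory sample stands in for the piped dump stream.
import io
import xml.etree.ElementTree as ET

sample = io.BytesIO(b"""<mediawiki>
  <page><title>Foo</title><id>1</id></page>
  <page><title>Bar</title><id>2</id></page>
</mediawiki>""")

titles = []
# iterparse yields each element as its end tag arrives, so memory use
# stays roughly constant no matter how large the dump is.
for event, elem in ET.iterparse(sample, events=("end",)):
    if elem.tag == "page":
        titles.append(elem.findtext("title"))
        elem.clear()  # drop the subtree once we have extracted what we want

print(titles)  # → ['Foo', 'Bar']
```

The key point is that nothing in the format requires seeking backwards: each record can be handled and discarded as it streams by.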
I cannot really imagine who would use a library with an object-oriented
API to read dump files. No matter what, it would be inefficient and have
fewer features and possibilities than using a real database.
I could live with a binary format, but I have doubts whether it is a good
idea. It will be harder to make sure that your parser is working
correctly, and you have to consider things like endianness, size of
integers, format of floats etc. which cause no problems in text formats.
The binary files may be smaller uncompressed (which I don't store
anyway) but not necessarily when compressed, as the compression will do
better on text files.
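The endianness point can be made concrete with a quick sketch (Python's struct module, purely illustrative): the same 32-bit integer serializes to different bytes depending on byte order, and a reader that assumes the wrong order silently gets a wrong value rather than a parse error.

```python
# Why a binary format needs an explicit byte-order specification:
# the same 32-bit integer has two different byte representations.
import struct

n = 0x12345678
big = struct.pack(">I", n)     # big-endian encoding
little = struct.pack("<I", n)  # little-endian encoding

print(big.hex(), little.hex())  # → 12345678 78563412
assert big != little

# A reader that assumes the wrong byte order gets a silently wrong value:
wrong = struct.unpack("<I", big)[0]
print(hex(wrong))  # → 0x78563412
```

A text format sidesteps this entirely: "305419896" reads the same on every machine, and a mistake in the parser tends to fail loudly instead of producing plausible garbage.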