What is the intended format of the dump files? The page makes it sound like
they will use a binary format, which I'm not opposed to, but it is definitely
something you should decide on.
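To make the binary option concrete, here is a minimal Python sketch of a
length-prefixed record. The field layout (page id, revision id, text length)
and the function names are my own assumptions for illustration; none of it is
taken from the proposal.

```python
import struct

# Hypothetical record layout (an assumption, not the proposed format):
# page id, revision id, and text length as little-endian uint32 values,
# followed by the revision text encoded as UTF-8.
HEADER = struct.Struct("<III")


def pack_revision(page_id, rev_id, text):
    """Serialize one revision into a length-prefixed binary record."""
    data = text.encode("utf-8")
    return HEADER.pack(page_id, rev_id, len(data)) + data


def unpack_revision(buf, offset=0):
    """Read one record starting at offset; return (fields, next_offset)."""
    page_id, rev_id, length = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    text = buf[start:start + length].decode("utf-8")
    return (page_id, rev_id, text), start + length
```

Because each record carries its own length, a reader can skip over records
without parsing the text, which is harder to do with XML.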
Also, I really like the idea of writing it in a low-level language and then
providing bindings for a higher-level one. However, unless you plan on having
multiple language bindings (e.g., *both* C# and Python), you may want to
pick a different route. For example, if you decide to bind only to Python,
you can use something like Cython, which would allow you to write
pseudo-Python that is still compiled to C. Of course, if you want multiple
language bindings, this is likely no longer an option.
Stevens Institute of Technology, Class of 2016
Major in Computer Science
On Mon, Jul 1, 2013 at 10:00 AM, Petr Onderka <gsvick(a)gmail.com> wrote:
For my GSoC project Incremental data dumps, I'm creating a new file
format to replace Wikimedia's XML data dumps.
A sketch of how I imagine the file format to look like is at
What do you think? Does it make sense? Would it work for your use case?
Any comments or suggestions are welcome.
Wikitech-l mailing list