What is the intended format of the dump files? The page makes it sound like they will use a binary format, which I'm not opposed to, but it is definitely something you should decide on.

Also, I really like the idea of writing it in a low-level language and then providing bindings for a higher-level one. However, unless you plan on having multiple language bindings (e.g., *both* C# and Python), you may want to pick a different route. For example, if you decide to bind only to Python, you can use something like Cython, which would allow you to write pseudo-Python that is still compiled to C. Of course, if you want multiple language bindings, this is likely no longer an option.
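
To make the Cython suggestion a bit more concrete, here is a minimal sketch. The function below is purely illustrative and is not part of the proposed file format; I picked an Adler-32 checksum only because it is a small byte-crunching loop of the kind a dump reader might contain. Cython can compile an ordinary Python module like this one into a C extension, and adding Cython's static type declarations to the hot loop is usually where the speedup comes from. The same file also runs unchanged under the plain interpreter:

```python
# Illustrative only: adler32 is a hypothetical helper, not something
# taken from the Incremental dumps proposal.

MOD_ADLER = 65521  # largest prime below 2**16, as used by Adler-32


def adler32(data: bytes) -> int:
    """Compute the Adler-32 checksum of a byte string."""
    a = 1  # running sum of the bytes, starting at 1
    b = 0  # running sum of the successive values of a
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    # b goes in the high 16 bits, a in the low 16 bits
    return (b << 16) | a


if __name__ == "__main__":
    print(hex(adler32(b"Wikipedia")))
```

If the module turns out to be fast enough as plain Python, nothing is lost; if not, the same source can be handed to Cython without rewriting it in C first.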

-- 
Tyler Romeo
Stevens Institute of Technology, Class of 2016
Major in Computer Science
www.whizkidztech.com | tylerromeo@gmail.com


On Mon, Jul 1, 2013 at 10:00 AM, Petr Onderka <gsvick@gmail.com> wrote:
For my GSoC project Incremental data dumps [1], I'm creating a new file
format to replace Wikimedia's XML data dumps.
A sketch of how I imagine the file format will look is at
http://www.mediawiki.org/wiki/User:Svick/Incremental_dumps/File_format.

What do you think? Does it make sense? Would it work for your use case?
Any comments or suggestions are welcome.

Petr Onderka
[[User:Svick]]

[1]: http://www.mediawiki.org/wiki/User:Svick/Incremental_dumps
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l