What is the intended format of the dump files? The page makes it sound like
they will use a binary format, which I'm not opposed to, but it is definitely
something you should decide on explicitly.
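
To make the question concrete, a binary record could be as simple as a
fixed-size header followed by the text. Here is a minimal sketch in Python.
The layout and the names (HEADER, pack_revision, unpack_revision, and the
page ID / revision ID / length fields) are made up for illustration and are
not taken from your page:

```python
import struct

# Hypothetical layout, NOT the one from the wiki page:
# little-endian, three unsigned 32-bit integers:
# page ID, revision ID, byte length of the UTF-8 text that follows.
HEADER = struct.Struct("<III")


def pack_revision(page_id, rev_id, text):
    """Serialize one revision as a fixed header plus UTF-8 text."""
    data = text.encode("utf-8")
    return HEADER.pack(page_id, rev_id, len(data)) + data


def unpack_revision(buf, offset=0):
    """Read one revision at offset; also return the offset of the next one."""
    page_id, rev_id, length = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    text = buf[start:start + length].decode("utf-8")
    return page_id, rev_id, text, start + length
```

Because every record carries its own length, a reader can skip from one
record to the next without parsing the text, which is the sort of thing that
is hard to do with the current XML dumps.
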
Also, I really like the idea of writing it in a low-level language and then
having bindings for a higher-level one. However, unless you plan on having
multiple language bindings (e.g., *both* C# and Python), you may want to
pick a different route. For example, if you decide to only bind to Python,
you can use something like Cython, which would allow you to write
pseudo-Python that is still compiled to C. Of course, if you want multiple
language bindings, this is likely no longer an option.
-- 
Tyler Romeo
Stevens Institute of Technology, Class of 2016
Major in Computer Science
www.whizkidztech.com | tylerromeo(a)gmail.com
On Mon, Jul 1, 2013 at 10:00 AM, Petr Onderka <gsvick(a)gmail.com> wrote:
For my GSoC project Incremental data dumps [1], I'm creating a new file
format to replace Wikimedia's XML data dumps.
A sketch of how I imagine the file format will look is at
http://www.mediawiki.org/wiki/User:Svick/Incremental_dumps/File_format.
What do you think? Does it make sense? Would it work for your use case?
Any comments or suggestions are welcome.
Petr Onderka
[[User:Svick]]
[1]: http://www.mediawiki.org/wiki/User:Svick/Incremental_dumps
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l