> And the ids of categories overlap with the ids of pages.
Categories don't actually have ids. Categories are more like tags in that they exist as soon as a page is "linked" to one. Many categories have corresponding pages in the "Category" namespace that describe them, but a category "exists" before its description page is created.
> The best thing, from a computational perspective, is that if n is the number of pages plus the number of category pages, every page or category page is assigned a node number in the interval [0..n).
Surely you can build a hash map on whatever unique identifier you like and get constant (amortized) lookup speed.
I think it is best if we provide you with a raw format that will work and you do your own post processing to obtain the "id space" that you like.
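To make that concrete, here is a rough sketch of the kind of post-processing I have in mind. It assumes the raw format gives you one unique key per node; the (namespace, title) pairs and the build_node_ids name below are made up for illustration and are not the actual dump format. A plain dict hands out dense node numbers in [0..n) in first-seen order:

```python
def build_node_ids(keys):
    """Map each distinct key to a dense integer in [0, n), in first-seen order."""
    node_id = {}
    for key in keys:
        if key not in node_id:
            # The next unused number is simply the current size of the map.
            node_id[key] = len(node_id)
    return node_id


# Hypothetical raw records: (namespace, title) pairs. Categories have no ids
# of their own, so the title (plus namespace) serves as the unique key.
raw = [
    ("page", "Alan Turing"),
    ("category", "Computer scientists"),
    ("page", "Ada Lovelace"),
    ("category", "Computer scientists"),  # repeat, keeps its first number
]

ids = build_node_ids(raw)
n = len(ids)  # every node number now falls in range(n)
```

Each lookup and insert is constant time on average, so building the map is linear in the number of records.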
-Aaron