Why do you store varchar data at all? It would be much more efficient to use id-to-id maps, no?
Yes, sure. But to get this id-to-id map, one first has to cache name-to-id (pages), because links are stored in id-to-name format. That cache table sometimes takes more memory than the links themselves. The good thing is that it lives only for a very short time.
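A minimal sketch of that conversion, assuming simplified inputs: pages as (page_id, title) pairs and links as (from_id, target_title) pairs. The function name and the input shapes are hypothetical, namespaces are ignored, and this is not Golem's actual code; it only illustrates why the name-to-id cache is needed and why it can be dropped right after the conversion.

```python
def links_to_id_pairs(pages, links):
    """Convert id-to-name links into id-to-id pairs.

    pages: iterable of (page_id, title)            -- assumed shape
    links: iterable of (from_id, target_title)     -- assumed shape
    """
    # Temporary name-to-id cache; this is the memory-hungry part.
    name_to_id = {title: page_id for page_id, title in pages}

    pairs = []
    for from_id, target_title in links:
        target_id = name_to_id.get(target_title)
        if target_id is not None:  # skip links to pages that do not exist
            pairs.append((from_id, target_id))

    # The cache is only needed during conversion, so release it here.
    del name_to_id
    return pairs


pages = [(1, "A"), (2, "B"), (3, "C")]
links = [(1, "B"), (2, "C"), (3, "Missing")]
print(links_to_id_pairs(pages, links))  # [(1, 2), (2, 3)]
```

The cache holds every page title at once, which is why it can outweigh the resulting id-to-id pairs even though it exists only while the conversion runs.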
This is the essential conflict: basically, we would have to reserve 1/8 of all resources for your use (well, for use by memory tables, but I doubt anyone besides you used big memory tables).
This figure of 1/8 of resources is inaccurate. First, the limit is set for just one table, not for all the tables a user creates. Second, when Golem estimates that it will need 4 GB, it assumes the worst case. Third, Golem works with just one server at a time (when iwiki spy is switched off), so even if it takes 1/8 on one server, that is 1/24 of all the memory available on s1, s2 and s3.
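The arithmetic behind the 1/24 figure can be checked with a short sketch. It assumes that s1, s2 and s3 have equal amounts of memory, which is not stated above; the function name is hypothetical.

```python
from fractions import Fraction


def share_of_cluster(share_on_one_server, server_count):
    """Share of total cluster memory when only one server is in use at a time.

    Assumes every server has the same amount of memory (an assumption).
    """
    return share_on_one_server / server_count


# 1/8 of one server, with three equally sized servers (s1, s2, s3).
print(share_of_cluster(Fraction(1, 8), 3))  # 1/24
```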
I can see that it would be much more effort to implement these things by hand, but I don't see why it would be less efficient.
No manual work is involved at all. My point was that the data has to be transferred from the SQL server to the client, and this transfer takes time. If I need to transmit id-to-id data then, as I said above, I will use a lot of memory to convert to that format anyway. On the other hand, transmitting id-to-name data (assuming it is converted in an application written in C) will take a lot of time.
Speed vs. memory is the usual tradeoff. We have found that Golem uses too much memory, and of course the easy way to solve the problem is to use a slower (offline) approach. I don't see an easy solution for this.
Neither do I; that's why I think I was doing too much work for nothing.
Anyway, my point is not about the category graph as such. I'm just saying that fast and memory-efficient network analysis is possible with this kind of architecture.
I would agree, up to the point of data transfer from the SQL servers to the application performing the analysis. However, all this covers only one function, the connectivity analysis itself. Golem does many other things: statistics on creators of isolated articles, suggestion generation, etc. All of these could require most of a language's metadata to be downloaded from the SQL server first.
mashiah