Why do you store varchar data at all? It would be much more efficient
to use id-to-id maps, no?
Yes, sure. But in order to get this id-to-id map one should first
cache name-to-id (pages), because links are stored in id-to-name
format. The table caching id-to-name sometimes takes more memory than
the links themselves. The good thing about it is that it lives for a
very short period of time.
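To illustrate the conversion step, here is a minimal sketch in Python. It is not Golem's actual code: the function name `links_to_id_pairs` and the shapes of `pages` and `links` are assumptions made up for this example. It shows why the name-to-id cache is needed and why it can be dropped as soon as the id-to-id pairs exist.

```python
def links_to_id_pairs(pages, links):
    """Convert id-to-name links into id-to-id pairs.

    pages: iterable of (page_id, title) rows   (hypothetical shape)
    links: iterable of (from_id, target_title) (hypothetical shape)
    """
    # Temporary name-to-id cache; this is the memory-hungry part.
    name_to_id = {title: page_id for page_id, title in pages}

    # Keep only links whose target title resolves to an existing page.
    pairs = [
        (from_id, name_to_id[title])
        for from_id, title in links
        if title in name_to_id
    ]

    # The cache is only needed during the conversion, so release it early.
    del name_to_id
    return pairs
```

For example, with `pages = [(1, "A"), (2, "B")]` and `links = [(1, "B"), (2, "Missing")]` the result is `[(1, 2)]`; the link to a non-existent page is dropped.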
This is the essential conflict: basically, we would have to reserve
1/8 of all resources for your use (well, for use by memory tables -
but I doubt anyone besides you used big memory tables).
This figure of 1/8 of resources is inaccurate. First, the limit is set
just for one table, not for all tables a user creates. Second, when
Golem supposes it will need 4 GB, it assumes the worst case. Third,
Golem works with just one server at a time (when iwiki spy is switched
off), so even if it takes 1/8 of one server, this means 1/24 of all
memory available on s1, s2 and s3.
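The arithmetic behind the 1/24 figure can be checked with a few lines of Python. This is only a sketch: it assumes the three servers have equal memory, and the function name `overall_share` is made up for the example.

```python
from fractions import Fraction

def overall_share(per_server_share, servers):
    # Share of total memory when only one of several equally sized
    # servers is used at a time (equal sizes are an assumption here).
    return per_server_share / servers

print(overall_share(Fraction(1, 8), 3))  # prints 1/24
```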
I can see that it would be much more effort to implement these things
by hand, but I don't see why it would be less efficient.
No hand operations at all. I was talking about the fact that data has
to be transferred from the sql server to the client, and this
transmission takes time. If I need to transmit id-to-id, then as I said
above I will anyway use a lot of memory to convert to this format. On
the other hand, transmission of id-to-name data (assuming it is
converted in an app written in C) will take a lot of time.
Speed vs. memory is the usual tradeoff. We have found that Golem uses
too much memory, and of course, the easy way to solve the problem is by
using a slower (offline) approach. I don't see an easy solution for
this.
Me neither, that's why I think I was doing too much work for nothing.
Anyway, my point is not about the category graph as such. I'm just
saying that fast and memory-efficient network analysis is possible with
this kind of architecture.
I would agree up to the point of data transmission from the sql servers
to the application performing the analysis. However, all this is for
just one function, the connectivity analysis itself, but there are a
lot of other things Golem does: isolated articles creators statistics,
suggestions generation, etc. All of this could require most of the
language metadata to be downloaded from the sql server first.
mashiah