On 2010-03-30 09:21, Mashiah Davidson wrote:
> DaB, why do you think that you can use 1/8 of the memory of the server
> alone? We have a lot more than 8 users, so it makes me wonder how one
> user thinks he can use 1/8 while the other few dozen use the rest.
> The solution you suggested was tested by me during the initial
> implementation phase and found to be too slow.
Hello,
I don't know the details of your algorithm, so apologies in advance for
assuming ignorance on your part, but did you keep in mind the mantra of
databases?
It goes something like this (imagine Steve Ballmer saying this):
Indexes! Indexes! Indexes!
(Or is it "indices"?)
I too had a serious query complexity problem a while back - that was
with disambiguation pages with links (and related queries, such as pages
with the most DPLs, templates linking to DPLs, etc.), which entail 5-
or even 6-table joins.
When I tried running it as one query, it would get killed all the time,
except on very small samples.
Then I started storing intermediate results in temporary (albeit
"physical") tables, *heavily* indexed (even on fields that are not
normally indexed in a mediawiki database) - now several reports get
generated in a matter of minutes (for pl.wiki, which is quite a biggie).
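As a rough sketch of that approach, here is a minimal SQLite illustration
(the real setup is MySQL on the toolserver, and the `page`/`pagelinks`
tables here are simplified, hypothetical stand-ins for the MediaWiki
schema): materialize the expensive sub-result once, index it heavily,
then run each report against the small table instead of repeating the
full multi-table join.

```python
import sqlite3

# Hypothetical, heavily simplified tables; only loosely modeled on the
# MediaWiki schema.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT,
                       is_disambig INTEGER);
    CREATE TABLE pagelinks (pl_from INTEGER, pl_title TEXT);
""")
cur.executemany("INSERT INTO page VALUES (?, ?, ?)",
                [(1, "Mercury", 1), (2, "Mercury_(planet)", 0),
                 (3, "Freddie_Mercury", 0)])
cur.executemany("INSERT INTO pagelinks VALUES (?, ?)",
                [(2, "Mercury"), (3, "Mercury")])

# Step 1: store the expensive intermediate result in its own table...
cur.executescript("""
    CREATE TEMPORARY TABLE disambigs AS
        SELECT page_id, page_title FROM page WHERE is_disambig = 1;
    -- Step 2: ...and index it, even on columns the live schema would
    -- not normally index.
    CREATE INDEX idx_disambigs_title ON disambigs(page_title);
""")

# Step 3: each report now joins against the small indexed table rather
# than redoing the full join. Here: which pages link to a disambig page?
cur.execute("""
    SELECT p.page_title, d.page_title
    FROM pagelinks pl
    JOIN disambigs d ON d.page_title = pl.pl_title
    JOIN page p ON p.page_id = pl.pl_from
    ORDER BY p.page_title
""")
rows = cur.fetchall()
print(rows)
```

Trading disk for speed this way pays off because each report reads a
pre-filtered, indexed table instead of re-scanning the big ones.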
Of course, indexes take disk space, but that is quite plentiful on the
toolserver, and with that space you should be able to buy speed for your
algorithm.
Regards,
Misza