On 2010-03-30 09:21, Mashiah Davidson wrote:
DaB,
why do you think that you can use 1/8 of the memory of the server alone? We have a lot more than 8 users, so it makes me wonder how one user thinks he can use 1/8 and the other few dozen the rest.
The solution you suggested was tested by me during the initial implementation phase and found to be too slow.
Hello,

I don't know the details of your algorithm, and apologies for assuming ignorance on your part, but did you keep in mind the mantra of databases? It goes something like this (imagine Steve Ballmer saying it): Indexes! Indexes! Indexes! (Or is it "indices"?)

I too had a serious query-complexity problem a while back - that was with disambiguation pages with links (and related queries such as pages with the most DPLs, templates linking to DPLs, etc.), which entail even 5- or 6-table joins. When I tried running it as one query, it would get killed all the time, except on very small samples. Then I started storing intermediate results in temporary (albeit "physical") tables, *heavily* indexed (even on fields that are not normally indexed in a MediaWiki database) - now several reports get generated in a matter of minutes (for pl.wiki, which is quite a biggie).
Of course, indexes take disk space, but that is quite plentiful on the toolserver, and with that space you should be able to buy speed for your algorithm.
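To make the idea concrete, here is a minimal sketch of the "materialize intermediate results, then index them" approach. It uses Python's sqlite3 so it runs anywhere; the table and column names are simplified stand-ins, not the real MediaWiki schema, and the disambiguation check is a toy title filter rather than how it is actually detected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Toy stand-ins for two tables we would otherwise join in one big query.
cur.execute("CREATE TABLE page (page_id INTEGER, page_title TEXT)")
cur.execute("CREATE TABLE pagelink (from_id INTEGER, to_title TEXT)")
cur.executemany("INSERT INTO page VALUES (?, ?)",
                [(1, "Foo"), (2, "Bar"), (3, "Foo_(disambiguation)")])
cur.executemany("INSERT INTO pagelink VALUES (?, ?)",
                [(1, "Foo_(disambiguation)"),
                 (2, "Foo_(disambiguation)"),
                 (2, "Bar")])

# Step 1: materialize an intermediate result in a physical table --
# here, just the disambiguation pages.
cur.execute("""CREATE TABLE dab_pages AS
               SELECT page_id, page_title FROM page
               WHERE page_title LIKE '%(disambiguation)'""")

# Step 2: index it heavily, including columns that would not normally
# carry an index in the source schema.
cur.execute("CREATE INDEX idx_dab_title ON dab_pages (page_title)")

# Step 3: later reports join against the small, indexed table instead
# of repeating the expensive filter inside a many-table join.
cur.execute("""SELECT COUNT(*) FROM pagelink pl
               JOIN dab_pages d ON pl.to_title = d.page_title""")
print(cur.fetchone()[0])  # number of links pointing at disambiguation pages
```

The trade-off is exactly the one described above: each intermediate table and index costs disk, but every report that reuses it skips the multi-table join, so you are spending space to buy speed.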
Regards, Misza