On 23/08/07, Brion Vibber <brion@wikimedia.org> wrote:
> The extreme slowness which leads to the restriction and caching comes from operating on the entire database at once.
> For fetching the current state of a single page, that's a small, quick query. Doing fifty or a hundred of them in a list isn't super-efficient, but it doesn't bring the server to its knees like scanning through an entire table of millions of pages.
> One perhaps might be more clever by removing entries from the cached list when they no longer apply (and maybe even re-adding those which become relevant when their state changes).
This might be something we could explore, by introducing a "sub update" which scans certain reports and removes irrelevant entries. By your argument above, this wouldn't be particularly expensive.
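The "sub update" idea above might be sketched roughly as follows. This is purely illustrative pseudologic, not MediaWiki's actual query-page code; the function names and the shape of the cached report are my own assumptions.

```python
# Hypothetical sketch of the proposed "sub update": instead of
# regenerating a cached report from scratch (the expensive full-table
# scan), walk the existing cached entries and drop those whose current
# state no longer qualifies, using one cheap per-page check each.
# Names here are illustrative, not MediaWiki's real API.

def prune_report(cached_entries, still_qualifies):
    """Return the cached report with no-longer-relevant entries removed.

    cached_entries:  list of page ids currently in the cached report
    still_qualifies: a cheap per-page predicate (one small query per
                     page), as opposed to the scan that built the report
    """
    return [page for page in cached_entries if still_qualifies(page)]

# Example: a cached report where pages 2 and 5 have since been fixed.
fixed = {2, 5}
report = [1, 2, 3, 5, 8]
print(prune_report(report, lambda p: p not in fixed))  # [1, 3, 8]
```

As argued above, the cost of such a pass scales with the size of the cached list (typically capped at a few hundred rows), not with the size of the whole page table, which is what would make it cheap enough to run more often than the full recache.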
Then again, from the user's point of view, more frequent general updates might be handy. Could we split the reports into those which can be updated more frequently, and do so?
> That's only one part of the problem; marking no-longer-relevant entries is also needed, either by strikeout or removing them or whatever.
In the light of this, would you object to striking out, rather than completely removing, the entries which appear to be fixed? That would restore the paging behaviour, and we can work on fancier solutions later.
Rob Church