On 23/08/07, Brion Vibber <brion@wikimedia.org> wrote:
> The extreme slowness which leads to the restriction and caching comes
> from operating on the entire database at once. For fetching the
> current state of a single page, that's a small, quick query. Doing
> fifty or a hundred of them in a list isn't super-efficient, but it
> doesn't bring the server to its knees like scanning through an entire
> table of millions of pages.
>
> One perhaps might be more clever by removing entries from the cached
> list when they no longer apply (and maybe even re-adding those which
> become relevant when their state changes).

This might be something we could explore by introducing a "sub-update"
which scans particular reports and removes the entries that no longer
apply. By your argument above, this wouldn't be particularly expensive.
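
Roughly what I have in mind, as a sketch only (Python against a
querycache-like table; the still_applies() callback stands in for
whatever cheap per-page check each report would define, it's not
anything that exists today):

    import sqlite3

    def sub_update(conn, report, still_applies):
        """Drop cached entries for `report` which no longer apply.

        still_applies(title) is the small, quick per-page query Brion
        describes; we only re-check rows already in the cache, so no
        full-table scan is involved.
        """
        cur = conn.execute(
            "SELECT qc_title FROM querycache WHERE qc_type = ?",
            (report,))
        stale = [row[0] for row in cur if not still_applies(row[0])]
        conn.executemany(
            "DELETE FROM querycache WHERE qc_type = ? AND qc_title = ?",
            [(report, title) for title in stale])
        conn.commit()
        return len(stale)

A few dozen cheap lookups per run, in other words, rather than one
enormous scan.
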
Then again, from the user's point of view, it might be handy to have
more frequent general updates. Could we divide the reports into a list
of those which can be updated more frequently, and do so?
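
Something like the following is all I'm picturing, with the caveat
that the groupings and intervals below are plucked out of the air
rather than measured:

    from datetime import datetime, timedelta

    # Invented split: expensive full scans stay infrequent, cheaper
    # reports get refreshed much more often.
    UPDATE_INTERVALS = {
        "Deadendpages": timedelta(days=3),
        "Wantedpages":  timedelta(days=3),
        "Lonelypages":  timedelta(days=1),
        "Shortpages":   timedelta(hours=12),
    }

    def reports_due(last_run, now):
        """Return the reports whose refresh interval has elapsed."""
        return [name for name, interval in UPDATE_INTERVALS.items()
                if now - last_run.get(name, datetime.min) >= interval]

The maintenance job would then refresh only whatever reports_due()
returns, instead of everything at once.
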
> That's only one part of the problem; marking no-longer-relevant
> entries is also needed, either by strikeout or removing them or
> whatever.

In the light of this, would you object to striking out, rather than
completely removing, those entries which seem to be fixed? This would
restore correct paging behaviour, and we can work on fancier
solutions later.
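
On the rendering side, all I mean is something along these lines
(a hypothetical helper, not our actual list code):

    from html import escape
    from urllib.parse import quote

    def render_row(title, fixed):
        """One report entry; struck through if it appears fixed."""
        link = '<a href="/wiki/%s">%s</a>' % (quote(title), escape(title))
        if fixed:
            return "<li><del>%s</del></li>" % link
        return "<li>%s</li>" % link

Since nothing is actually deleted from the cached list, the offsets
stay stable and the paging links keep working.
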
Rob Church