Simetrical schreef:
No filesorts, since nothing has to be sorted (no ORDER BY or GROUP BY). It will just scan every single categorylinks row corresponding to the given category. It won't be able to use an index for the entire query, either: it will have to hit the page table data blocks. Probably it will take 30 seconds or more on Wikimedia servers for Living people or other large categories, given that just scanning the index for all those rows (no joins, no hitting data blocks) is five seconds. Could be a few minutes, I guess. Either way it's not remotely acceptable.
That such things are possible to do with the current database is obvious. After all, we store all the relevant information: there's got to be some way to retrieve it, somehow, even if it requires querying all tables in their entirety and manually processing every row in PHP. So the only real question in these things is efficiency.
That's why I said we probably do want a cl_is_category boolean field. Of course filling it would be a huge operation on enwp.
Roan Kattouw (Catrope)