On Sun, 2002-05-12 at 08:17, Magnus Manske wrote:
I suggest adding two new fields to the database (cur table):
- cur_namespace (varchar) so articles can be distinguished by namespace
without using the "LIKE" MySQL query (this was mentioned before)
Should there be a separate title-without-namespace field as well, or would cur_title now be that?
I think a namespace field would do fine for now, as most of the slow queries look for the namespace.
Well, the statistics checks do, I suppose, but the watchlist and list-all-namespaces functions check for everything with the same non-namespace portion, no?
- cur_is_redirect (boolean) so #REDIRECT pages can be excluded likewise
Not a bad idea! Might it be even easier to make it a string field containing the name of the page being redirected to? That would save parsing the #REDIRECT [[x]] line or relying on the link tables when doing mass checks in, e.g., "links to this page".
I agree. A boolean field might be faster, but not by much.
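
For illustration, here is a minimal sketch of the two proposed columns, using SQLite from Python as a stand-in for MySQL. The column name cur_redirect_to, the index, the sample rows and the helper functions are all hypothetical, and the sketch assumes cur_title would hold only the non-namespace part of the title, which is still an open question above.

```python
import sqlite3

# Cut-down, hypothetical cur table. cur_namespace is the proposed field;
# cur_redirect_to is the "string instead of boolean" variant of
# cur_is_redirect (NULL means the page is not a redirect).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cur (
        cur_namespace   TEXT NOT NULL DEFAULT '',
        cur_title       TEXT NOT NULL,   -- assumed: title WITHOUT namespace
        cur_redirect_to TEXT
    )
    """
)
conn.execute("CREATE INDEX cur_ns ON cur (cur_namespace)")

# Made-up sample rows, only there to exercise the queries.
conn.executemany(
    "INSERT INTO cur (cur_namespace, cur_title, cur_redirect_to) VALUES (?, ?, ?)",
    [
        ("", "Paris", None),
        ("", "Paname", "Paris"),   # a redirect to Paris
        ("Talk", "Paris", None),
        ("User", "Brion", None),
    ],
)


def pages_in_namespace(conn, namespace, include_redirects=False):
    """List titles in one namespace with an equality test, no LIKE needed."""
    sql = "SELECT cur_title FROM cur WHERE cur_namespace = ?"
    if not include_redirects:
        sql += " AND cur_redirect_to IS NULL"
    sql += " ORDER BY cur_title"
    return [row[0] for row in conn.execute(sql, (namespace,))]


def redirects_to(conn, target):
    """Find redirects pointing at a page without parsing #REDIRECT [[x]]."""
    sql = (
        "SELECT cur_namespace, cur_title FROM cur "
        "WHERE cur_redirect_to = ? ORDER BY cur_namespace, cur_title"
    )
    return list(conn.execute(sql, (target,)))


def namespaces_with_title(conn, title):
    """Watchlist-style lookup: every namespace that has this bare title."""
    sql = "SELECT cur_namespace FROM cur WHERE cur_title = ? ORDER BY cur_namespace"
    return [row[0] for row in conn.execute(sql, (title,))]
```

With a separate bare-title column, the watchlist and list-all-namespaces lookups become a plain equality match as well, which is the case the namespace field alone would not cover.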
Concerning the {{NUMBEROFARTICLES}} function, we should implement a cur_has_commas (boolean) field while we're at it. It seems to be the best way of telling articles apart from stubs. Or, we should name it cur_is_article, in case we come up with a better method some day.
Or, we could just update the value once a day. :)
Actually, that reminds me: should we add a timestamp field to indicate when cur_cache was last filled? A simple check of that against cur_timestamp would have prevented our present cache-of-ancient-version-is-displayed-instead-of-recent-one troubles.
It would then be possible to expire caches on certain pages, such as those containing {{variables}}.
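
A rough sketch of that check, in Python for brevity. The function name, the has_variables flag and the one-day expiry are assumptions made for the example, as is the timestamp format (14-digit YYYYMMDDHHMMSS strings, which compare correctly as plain strings).

```python
from datetime import datetime
from typing import Optional

TS_FORMAT = "%Y%m%d%H%M%S"  # assumed format: YYYYMMDDHHMMSS


def cache_is_usable(
    cur_timestamp: str,
    cache_timestamp: Optional[str],
    now: str,
    has_variables: bool = False,
    max_age_seconds: int = 86400,
) -> bool:
    """Decide whether the cached rendering in cur_cache may be served.

    cur_timestamp   -- time of the last edit to the page
    cache_timestamp -- hypothetical new field: when cur_cache was filled
    now             -- current time, passed in so the check is testable
    has_variables   -- page contains {{variables}}, so its cache expires
    """
    if cache_timestamp is None:
        return False  # cache was never filled
    if cache_timestamp < cur_timestamp:
        return False  # cache predates the latest edit: the stale-cache case
    if has_variables:
        age = datetime.strptime(now, TS_FORMAT) - datetime.strptime(
            cache_timestamp, TS_FORMAT
        )
        return age.total_seconds() <= max_age_seconds
    return True
```

A cache filled before the last edit is rejected outright, while a page containing {{variables}} is additionally re-rendered once its cache is older than the chosen expiry.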
-- brion vibber (brion @ pobox.com)
wikitech-l@lists.wikimedia.org