On Wed, 2 Mar 2005 20:14:25 +0000, Rowan Collins rowan.collins@gmail.com wrote:
On Wed, 2 Mar 2005 13:35:06 -0600, Richard Holton richholton@gmail.com wrote:
In looking through the MediaWiki schema (both current and new), I've noticed that page titles are given a max length of 255 chars. However, it seems that in some cases this title includes the namespace, and in others it does not: the namespace is either stored as an Int or implied by context (e.g., categorylinks, imagelinks).
Hm, I see what you mean - there aren't that many places where it's a problem, but certainly 'brokenlinks' has the namespace as part of the [destination] title. So it seems an article could have 255 characters plus a namespace (because the namespace isn't considered part of the title) and not fit in brokenlinks (because that just stores the text of the link, rather than a namespace and title).
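To make the overflow concrete, here is a rough Python sketch (not MediaWiki code). It assumes the brokenlinks target field is also 255 characters wide; link_text and LINK_FIELD_WIDTH are made-up names for illustration only.

```python
MAX_TITLE_LENGTH = 255   # max title length, as discussed in this thread
LINK_FIELD_WIDTH = 255   # ASSUMED width of the brokenlinks target field


def link_text(namespace_name, title):
    """Build the one-string form of a link target, e.g. 'Talk:Foo'.

    Hypothetical helper: an empty namespace name means the main namespace.
    """
    return f"{namespace_name}:{title}" if namespace_name else title


# A title that uses every allowed character, living in the Talk namespace.
title = "x" * MAX_TITLE_LENGTH
target = link_text("Talk", title)

# The stored link text is longer than the title by the prefix plus the colon.
overflow = len(target) - LINK_FIELD_WIDTH
print(len(title), len(target), overflow)
```

So a title that is legal in the page table can end up a few characters too long once the namespace prefix is written into the same string.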
There's been talk of merging the various links tables to all be id->name (rather than some being id->id), because the text of the link doesn't change, but the article it refers to might. This problem could be addressed by anyone implementing that.
The disadvantage of using {namespace_as_int, title_as_text} for link targets is that this doesn't reflect how they're entered: [[Foo:Bar]] could change in meaning from {0, "Foo:Bar"} to {20, "Bar"} if a custom "Foo" namespace was created; the two forms could not, however, co-exist. This suggests to me that it would be better to just make the link_to field wider than page_title (i.e. a width of 255 + a constant MAX_NAMESPACE_LENGTH), and retain the current practice of storing the destination as one string.
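A small Python sketch of the ambiguity described above, assuming a simple prefix lookup (this is not how MediaWiki actually parses titles; resolve_link and the namespace table are hypothetical):

```python
MAIN_NAMESPACE = 0  # the main namespace, as in the {0, "Foo:Bar"} example


def resolve_link(text, namespaces):
    """Split link text into (namespace_id, title).

    `namespaces` maps namespace names to ids. If the part before the
    first colon is not a known namespace, the whole text is treated as
    a title in the main namespace.
    """
    prefix, sep, rest = text.partition(":")
    if sep and prefix in namespaces:
        return namespaces[prefix], rest
    return MAIN_NAMESPACE, text


# Before a custom "Foo" namespace exists, the colon is just part of the title.
before = resolve_link("Foo:Bar", {})

# After "Foo" is created with id 20, the same link text resolves differently.
after = resolve_link("Foo:Bar", {"Foo": 20})

print(before, after)
```

Storing the raw link text sidesteps this, since the string "Foo:Bar" stays the same whichever way it is currently interpreted; the cost is a field wide enough for the prefix as well as the title.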
I notice that in the new schema, the 'page' table uses the {namespace_as_int, title_as_text} form, and it doesn't save the namespace within the title. (Was that true of the old schema as well?)
I don't want to second-guess the new schema. It does seem that the link tables should use the same method of identifying pages as the 'page' table does.
For 'categorylinks', having the namespace in the index would allow fast separation by namespace.
-Rich Holton en.wikipedia:User:Rholton