Re: [Wikitech-l] How long can a title be? Database issue.

3 Mar 2005


      On Thu, 3 Mar 2005 10:05:15 -0600, Richard Holton richholton@gmail.com wrote:
...
Another way to think about things: a category link is like a two-way
link. It links both from the page to the category, and from the
category to the page.
...
I definitely see the similarity between "what
links here" and categories. In that sense, each page becomes its own
category, with any link to the page a 'category entry'.
Yes, that's the concept behind "WikiCategories" on simple wikis like
UseMod - the backlinks are the member pages. e.g.
http://www.usemod.com/cgi-bin/mb.pl?CategoryMeatball, the contents of
which just tells you to look at its backlinks.
...
Perhaps we
could use this relationship to improve the 'what links here' display.
That would indeed be incredibly useful - that page is hideously
presented, has an arbitrary hard-coded cut-off and is unsorted. If we
could use the same display style (or even some shared super-class) as
the category listings, that would be a huge improvement.
It almost makes me wonder if all links should store a from_namespace
and sortkey, so that we can do the same paging and display separation
tricks - but that would of course require massive updates on every
page move, so probably isn't going to happen.
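
To make the paging idea concrete, here is a minimal sketch in Python with SQLite. It assumes a hypothetical links table carrying the from-namespace and sortkey fields floated above; the table name, the column names l_to, l_from_namespace and l_sortkey, and the helpers make_db and backlinks_page are all illustrative, not an existing schema. Each page of results resumes after the last sortkey already shown.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory links table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links (l_to TEXT, l_from_namespace INTEGER, l_sortkey TEXT)"
    )
    conn.executemany(
        "INSERT INTO links VALUES (?, ?, ?)",
        [
            ("Target", 0, "Delta"),
            ("Target", 0, "Alpha"),
            ("Target", 1, "Alpha"),    # a link from another namespace
            ("Target", 0, "Charlie"),
            ("Target", 0, "Bravo"),
            ("Other", 0, "Echo"),      # links to a different page
        ],
    )
    return conn


def backlinks_page(conn, target, namespace, after="", page_size=2):
    """Return one sorted page of backlinks from a single namespace.

    `after` is the last sortkey shown on the previous page, so the
    database can skip straight to the next entries.
    """
    rows = conn.execute(
        """
        SELECT l_sortkey
        FROM links
        WHERE l_to = ? AND l_from_namespace = ? AND l_sortkey > ?
        ORDER BY l_sortkey
        LIMIT ?
        """,
        (target, namespace, after, page_size),
    ).fetchall()
    return [key for (key,) in rows]
```

The cost noted above still applies: every row pointing at a page stores that page's sortkey, so a move would have to rewrite all of them.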
...
...
... Of course, it suffers from a variant of the update problem that the
current id->id based links table does: a page can be moved from one
namespace to another, taking its category links with it.
...
Yes, that is the trade off. However, when a category page moves, it
only has to update its own categorylink entries -- one for each
[[category:xxx]] on that page.
It's not moving the *category* that's the main problem, but moving the
*content*, since that could cross namespaces. I think you did get my
point, it's just this sentence is either a typo or a non sequitur (or
both) ;-)
...
Note that currently, the cl_sortkey
field is updated on any page move for those links that don't have a
specified sort key.
Ah, I didn't realise that; but I guess it makes sense - especially
once you get into the listing needing to be split into pages.
...
Currently, to find subcategories for a category page, the links have
to be retrieved and separated by namespace in PHP.
I'm not that hot on SQL (serves me right for not studying straight CS
at uni), but isn't it possible to do something like "SELECT <stuff>
FROM categorylinks, page WHERE cl_to=<whatever> AND
page_namespace=NS_CATEGORY LIMIT 200", thus avoiding the problem
you're describing?
I've seen things that look similar to this in the code, but it could be that
1) I have misunderstood, and this particular one's impossible
2) this would in fact be possible, but prohibitively
expensive/inefficient, and therefore it's been discarded as an option
3) it would amount to exactly the same activity as doing it as a
sequence of queries with PHP in between
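
A minimal sketch of the single-query approach, run against SQLite from Python. It assumes cl_from holds the page_id of the linking page, which supplies the join condition the query as written leaves out (without one, the two tables form a cross product). The cut-down schema, the sample rows, and the helper names make_db and subcategories are hypothetical; 14 is MediaWiki's number for the Category namespace.

```python
import sqlite3

NS_CATEGORY = 14  # MediaWiki's namespace number for Category pages


def make_db():
    """Build cut-down page and categorylinks tables (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT
        );
        CREATE TABLE categorylinks (
            cl_from INTEGER,   -- page_id of the member page (assumption)
            cl_to TEXT,        -- title of the category linked to
            cl_sortkey TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [(1, 0, "Apple"), (2, 14, "Citrus"), (3, 0, "Pear"), (4, 14, "Berries")],
    )
    conn.executemany(
        "INSERT INTO categorylinks VALUES (?, ?, ?)",
        [
            (1, "Fruit", "Apple"),
            (2, "Fruit", "Citrus"),
            (3, "Fruit", "Pear"),
            (4, "Fruit", "Berries"),
            (4, "Food", "Berries"),
        ],
    )
    return conn


def subcategories(conn, category, limit=200):
    """Return titles of category-namespace members of `category`."""
    rows = conn.execute(
        """
        SELECT page_title
        FROM categorylinks
        JOIN page ON page_id = cl_from
        WHERE cl_to = ? AND page_namespace = ?
        ORDER BY cl_sortkey
        LIMIT ?
        """,
        (category, NS_CATEGORY, limit),
    ).fetchall()
    return [title for (title,) in rows]
```

Whether this beats filtering in PHP depends on the indexes available, which the sketch does not model.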
...
...
There's an additional problem with merging the categorylinks table in
with the others anyway, in that it has extra fields cl_sortkey and
cl_timestamp that the others don't have or need. That's a shame,
because it would be neat, if redesigning the tables, not to have to
leave that one out.
By the way, cl_timestamp is currently unused (I believe). It can be
handy for debugging, but otherwise I don't know its intended purpose.
Hm. I thought maybe it was intended that additions and removals from a
category could be presented in a history display of some sort, or on
watchlists etc. But a timestamp would only allow listing of additions,
so I'm not sure that can be right.
Still, unless there's a sensible way of reusing the sortkey field in
other link-types, it's probably not a good idea to merge category
links in with everything else. [To store the namespace_from, you could
just reuse the otherwise redundant namespace_to field, but nothing
springs to mind that could double up with the sortkey in this way]
...
...

l_is_broken (boolean)

--- to allow everything that brokenlinks currently does; this should
be kept separate from the link type, because if you try to {{include}}
a non-existent page, it's both "broken" and a "template" link, and
both facts are potentially useful
I don't fully understand the use of a "broken" field. Does this
eliminate the need for verifying page existence during rendering?
Maintaining "broken" requires a potentially large update for each page
creation, deletion, or move.
Well, there must be some reason we currently store "brokenlinks"
separate from "links", right? I'm not sure the distinction is used
when rendering, that seems to be done by looking at the page table,
but there's an awful lot of deferral and caching, so maybe it is. [Or,
alternatively, maybe it should be! I don't know.] It's used by
utilities like Special:Wantedpages, I know that much. It certainly
seems to make sense to store the distinction.
As for updates - yes, creation or deletion will have a big impact, but
all the linking pages need their cache invalidating anyway, because
they'll have the wrong colour links, so it's always going to be a big
deal. And note that *moving* a page doesn't break any links - there's
still a page at the old location, albeit a brand new redirect.
Somebody might delete that redirect later, but that's a separate
action.
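
As a rough illustration of the update cost discussed above, here is a sketch in Python with SQLite of what creating a page might do under the proposed l_is_broken flag. Only l_is_broken comes from the proposal; the table name, the other columns, and the helpers make_db, on_page_created and wanted_pages are hypothetical.

```python
import sqlite3


def make_db():
    """Build a tiny merged links table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links ("
        "l_from INTEGER, l_to TEXT, l_type TEXT, l_is_broken INTEGER)"
    )
    conn.executemany(
        "INSERT INTO links VALUES (?, ?, ?, ?)",
        [
            (1, "New_page", "link", 1),
            (2, "New_page", "template", 1),  # both broken and a template link
            (3, "Existing_page", "link", 0),
        ],
    )
    return conn


def on_page_created(conn, title):
    """Clear the broken flag on links to `title`.

    Returns the ids of the linking pages, which are the same pages whose
    cached rendering needs invalidating because their link colour changes.
    """
    linking = [
        row[0]
        for row in conn.execute(
            "SELECT l_from FROM links "
            "WHERE l_to = ? AND l_is_broken = 1 ORDER BY l_from",
            (title,),
        )
    ]
    conn.execute("UPDATE links SET l_is_broken = 0 WHERE l_to = ?", (title,))
    return linking


def wanted_pages(conn):
    """List link targets that do not exist yet, as a wanted-pages report would."""
    return [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT l_to FROM links WHERE l_is_broken = 1 ORDER BY l_to"
        )
    ]
```

The rows touched by the update belong to exactly the pages that need their caches invalidated, so the flag adds no new set of pages to visit on creation.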
-- 
Rowan Collins BSc
[IMSoP]
