On Mon, Sep 09, 2002 at 08:34:24PM +0200, Axel Boldt wrote:
I think that a subject classification of articles would vastly improve
"soft security" and would save regulars a lot of time, since not
everyone would have to check every edit as currently seems to be the
case.
I'd still like to see if we couldn't build those subjects
automatically in some way based on links in the database.
How about this: the possible topics coincide with the major pages
listed on [[Main Page]] (from "Astronomy" to "Visual Arts"). The
shortest link path from such a topic page to an article defines that
article's topic. If there is no such path, then the article is
classified as a topic orphan.
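
The definition above amounts to a breadth-first search started from all
topic pages at once. Here is a minimal sketch, assuming the link table is
available as a plain dict from page title to the titles it links to. The
function name `classify`, the sample page titles, and the tie-breaking
rule (the topic listed first wins when two topics are equally close) are
assumptions for illustration; the proposal does not say how ties are
resolved.

```python
from collections import deque

def classify(links, topics):
    """Multi-source BFS: map each reachable page to (topic, distance)."""
    # Topic pages classify themselves at distance 0.
    result = {t: (t, 0) for t in topics}
    queue = deque(topics)
    while queue:
        page = queue.popleft()
        topic, dist = result[page]
        for target in links.get(page, ()):
            # The first visit is along a shortest path, so later visits are ignored.
            if target not in result:
                result[target] = (topic, dist + 1)
                queue.append(target)
    return result

# Hypothetical sample data, not taken from the real database.
links = {
    "Biology": ["Cell"],
    "Chemistry": ["Acid"],
    "Cell": ["Protein"],
    "Acid": ["Protein", "Salt"],
    "Lonely": ["Salt"],
}
labels = classify(links, ["Biology", "Chemistry"])
print(labels["Salt"])      # ('Chemistry', 2)
print("Lonely" in labels)  # False: nothing links to it, so it is a topic orphan
```

Any page missing from the result has no path from a topic page and would
be a topic orphan.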
To compute these topics quickly, the cur table gets two new columns:
topic and distance, where distance stands for the link distance from the
Main Page topic page. If a new article is created, looking at the
distance entries of all articles that link to the new one, and taking
the minimum, immediately classifies the new one. If an existing
article is saved, the topic and distance entries of all articles it
links to (and their children) may need to be updated; these changes
can be propagated in a recursive manner.
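
The new-article rule could look roughly like the sketch below. The dict
`cur` stands in for the two proposed columns (title -> (topic, distance)),
and `classify_new_article` is a hypothetical helper name; neither is
existing code. It only looks at inbound links. The new article's own
outbound links fall under the saved-article case.

```python
def classify_new_article(cur, linking_pages):
    """Return (topic, distance) for a new article, or None for a topic orphan."""
    best = None
    for page in linking_pages:
        entry = cur.get(page)
        if entry is None:
            # The linking page is itself a topic orphan and contributes nothing.
            continue
        topic, dist = entry
        if best is None or dist + 1 < best[1]:
            best = (topic, dist + 1)
    return best

# Hypothetical contents of the topic and distance columns.
cur = {
    "Chemistry": ("Chemistry", 0),
    "Acid": ("Chemistry", 1),
    "Cell": ("Biology", 1),
    "Protein": ("Biology", 2),
}
new_entry = classify_new_article(cur, ["Protein", "Acid"])
print(new_entry)  # ('Chemistry', 2), because Acid is closer to its topic than Protein
```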
Would that work?
1. No, it wouldn't.
Both deletion and creation of links are hard problems:
Main page -> Biology   -> A1
          -> Chemistry -> A2 -> A3 -> A4 -> Target
                                      A5 -> Target
Now what happens when we add a link from A1 to A5?
There are a lot of links from A5 to other articles, so recursion here
would mean recalculating a major part of the topology.
Deleting any link on the current shortest path (if we store it
somewhere) would require recalculating the whole topology too.
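
To make the link-creation case concrete, here is a rough sketch of the
recursive propagation run on the example above. The `add_link` helper and
the dict layout are assumptions, and X1 and X2 are hypothetical pages
standing in for the many other articles A5 links to. Distances are
counted from the topic page, so Target starts out as Chemistry at
distance 4.

```python
from collections import deque

def add_link(links, cur, source, target):
    """Add source -> target, propagate shorter paths, return the changed pages."""
    links.setdefault(source, []).append(target)
    changed = set()
    if source not in cur:
        return changed  # a topic orphan cannot shorten anyone's path
    queue = deque([source])
    while queue:
        page = queue.popleft()
        topic, dist = cur[page]
        for nxt in links.get(page, ()):
            # Update pages that were orphans or now have a shorter path.
            if nxt not in cur or cur[nxt][1] > dist + 1:
                cur[nxt] = (topic, dist + 1)
                changed.add(nxt)
                queue.append(nxt)
    return changed

links = {
    "Biology": ["A1"],
    "Chemistry": ["A2"],
    "A2": ["A3"], "A3": ["A4"], "A4": ["Target"],
    "A5": ["Target", "X1", "X2"],  # X1, X2: hypothetical extra links from A5
}
cur = {
    "Biology": ("Biology", 0), "A1": ("Biology", 1),
    "Chemistry": ("Chemistry", 0), "A2": ("Chemistry", 1),
    "A3": ("Chemistry", 2), "A4": ("Chemistry", 3),
    "Target": ("Chemistry", 4),
}
changed = add_link(links, cur, "A1", "A5")
print(sorted(changed))  # ['A5', 'Target', 'X1', 'X2']
print(cur["Target"])    # ('Biology', 3)
```

One new link touches everything downstream of A5, and Target flips from
Chemistry to Biology. The sketch does not attempt deletion: removing a
link can only lengthen paths, so the stored distances give no hint where
the next-best path is and a fresh search is needed.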
2. But it would be possible to create an initial classification that way.