On Mon, Sep 09, 2002 at 08:34:24PM +0200, Axel Boldt wrote:
I think that a subject classification of articles would vastly improve "soft security" and would save regulars a lot of time, since not everyone would have to check every edit as currently seems to be the case.
I'd still like to see if we couldn't build those subjects automatically in some way based on links in the database.
How about this: the possible topics coincide with the major pages listed on [[Main Page]] (from "Astronomy" to "Visual Arts"). The shortest link path from such a topic page to an article defines that article's topic. If there is no such path, then the article is classified as a topic orphan.
To compute these topics quickly, the cur table gets two new columns: topic and distance, where distance stands for the link distance from the Main Page topic page. If a new article is created, looking at the distance entries of all articles that link to the new one, and taking the minimum, immediately classifies the new one. If an existing article is saved, the topic and distance entries of all articles it links to (and their children) may need to be updated; these changes can be propagated in a recursive manner.
Would that work?
1. No, it wouldn't.
Both deletion and creation of links are hard problems:
Main page -> Biology -> A1 -> Chemistry -> A2 -> A3 -> A4 -> Target
A5 -> Target
Now what happens when we add a link from A1 to A5? There are a lot of links from A5 to other articles, so recursion here would mean recalculating a major part of the topology.
Deleting any of the links on the current shortest path (if we store it somewhere) would require recalculating the whole topology too.
2. But it would be possible to create an initial classification that way.
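
A rough sketch of what such an initial classification could look like: a breadth-first search started from all topic pages at once, so the first time an article is reached gives its nearest topic and link distance. The function name, the in-memory dict of links, and the choice of Biology and Chemistry as the topic pages are assumptions made for illustration; nothing here reflects the actual cur table or any existing code.

```python
from collections import deque

def classify(links, topics):
    """Multi-source BFS: map each reachable page to (topic, distance).

    links  -- dict: page title -> list of titles it links to (hypothetical)
    topics -- list of topic pages, assumed to be linked from the Main Page
    Pages missing from the result are "topic orphans".
    """
    result = {t: (t, 0) for t in topics}
    queue = deque(topics)
    while queue:
        page = queue.popleft()
        topic, dist = result[page]
        for target in links.get(page, ()):
            if target not in result:  # first visit = shortest link path
                result[target] = (topic, dist + 1)
                queue.append(target)
    return result

# Toy graph loosely modelled on the example above.
links = {
    "Biology": ["A1"],
    "A1": ["Chemistry"],
    "Chemistry": ["A2"],
    "A2": ["A3"],
    "A3": ["A4"],
    "A4": ["Target"],
    "A5": ["Target"],
}
topics = ["Biology", "Chemistry"]

before = classify(links, topics)

# Add the link A1 -> A5 and recompute everything from scratch.
links["A1"] = ["Chemistry", "A5"]
after = classify(links, topics)

print(before.get("Target"), before.get("A5"))
print(after.get("Target"), after.get("A5"))
```

In this toy graph, Target starts out under Chemistry at distance 4 and A5 is an orphan. After the single new link, A5 lands under Biology at distance 2 and Target moves to Biology at distance 3. The sketch simply recomputes the whole thing, which is fine for a one-off initial run but is exactly the cost that makes per-edit updates a problem.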