Axel Boldt wrote:
How about this: the possible topics coincide with the major pages listed on [[Main Page]] (from "Astronomy" to "Visual Arts"). The shortest link path from such a topic page to an article defines that article's topic. If there is no such path, then the article is classified as a topic orphan.
I looked a little more into this, manually tracing the paths of 10 randomly chosen articles. I don't know what it does to the automatic path tracing idea, but it did lead to a number of observations.
First the data:
1. Abu Zubaydah <- Ibn al-Shaykh al-Libi <- Abu Zubaydah (Circular orphan; nothing else leads to these two) (Score 0)
2. Analysis of variance <- Statistics <- Main Page (Score 2)
3. Indianapolis Colts <- National Football League <- American football <- Sport <- Main Page (Score 4); also Indianapolis Colts <- Indiana <- United States (=United States of America) <- List of Countries (=Countries of the World) <- Geography <- Main Page (Score 5); and Indianapolis Colts <- 1969 <- 20th century <- Historical timeline|Centuries <- Main Page (Score 4)
4. Jerry Springer <- List of television programs <- List of reference tables (=Reference tables) <- Main Page (Score 3)
5. Heinrich Schliemann <- Archaeology <- Main Page (Score 2)
6. Hitchhiking <- User: Branko <- Special Pages: Registered Users (=User list) <- Main Page (Score 3). The same score results via User: Rootbeer. Access is only through two user pages; it's an orphaned orphan!
7. Nursing <- Health science <- Main Page (Score 2)
8. Vsevolod I, Prince of Kiev <- Kievan Rus' <- History of Russia (=Russian history) <- History <- Main Page (Score 4). A score of 3 was possible through the page [[User: H. Jonat]]. (I swear this was random; I didn't ask for THAT user to appear.)
9. Morrisville <- Wikipedia:Links to disambiguating pages <- Wikipedia: utilities <- Main Page (Score 3)
10. Celestial sphere <- Astronomy and astrophysics <- Main Page (Score 2)
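The manual tracing above amounts to a breadth-first search outward from the Main Page. Here is a minimal sketch in Python, assuming a hypothetical `LINKS` table that maps each page to the pages it links to. The table holds only a few of the sample paths, and the function name `scores` is made up for illustration; it is not part of any Wikipedia software.

```python
from collections import deque

# Hypothetical link table: page -> pages it links to (a small subset of the samples).
LINKS = {
    "Main Page": ["Statistics", "Sport", "Geography", "User list"],
    "Statistics": ["Analysis of variance"],
    "Sport": ["American football"],
    "American football": ["National Football League"],
    "National Football League": ["Indianapolis Colts"],
    "Geography": ["List of countries"],
    "List of countries": ["United States"],
    "United States": ["Indiana"],
    "Indiana": ["Indianapolis Colts"],
    "User list": ["User: Branko"],
    "User: Branko": ["Hitchhiking"],
    "Abu Zubaydah": ["Ibn al-Shaykh al-Libi"],
    "Ibn al-Shaykh al-Libi": ["Abu Zubaydah"],
}

def scores(links, start="Main Page"):
    """Breadth-first search: minimum number of links from start to each reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in dist:  # first visit is always the shortest path
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist

dist = scores(LINKS)
print(dist["Analysis of variance"])  # 2
print(dist["Indianapolis Colts"])    # 4 (the Sport path beats the Geography path)
print(dist["Hitchhiking"])           # 3 (only via a user page)
print("Abu Zubaydah" in dist)        # False: never reached, i.e. Score 0 above
```

Pages the search never reaches are simply missing from the result, which corresponds to the "Score 0" case in the data.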
Observations:
1. In the samples the longest minimum path to the Main Page was only 4 steps. Any article linked from a user page would be 3 steps away from the Main Page (via the user list), but this should not be considered a meaningful path.
2. Two kinds of effectively orphan pages became evident, but these would never appear on the special page listing of orphans. In the first example two pages link to each other but nothing else links to them. In example 6 the only links to the article are on user pages. Who would ever think to look there for a reference to an article?
3. [[List of countries]] and [[United States]] should probably be linked from the Main Page. The number of paths through these is enormous.
4. Many of the links to [[United States]] are excessive. Many of them occur in passing, where more information about the United States is unlikely to be needed. I think we can always assume a very basic level of understanding about what is meant by "United States". What would surprise me most about those who don't have that very basic level of understanding is how they managed to find Wikipedia in the first place.
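The two kinds of effective orphans in observation 2 could be found with the same kind of search, provided user pages are not followed. This is only a rough sketch: the `LINKS` table is a hypothetical subset of the sample data, and both the function name `effective_orphans` and the idea of filtering on a "User:" title prefix are my assumptions, not an existing feature.

```python
from collections import deque

def effective_orphans(links, start="Main Page", skip_prefixes=("User:",)):
    """Return pages that cannot be reached from start without passing through a skipped page."""
    reached = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # Do not follow links into user pages, so paths through them do not count.
            if target in reached or target.startswith(skip_prefixes):
                continue
            reached.add(target)
            queue.append(target)
    # Every page that appears anywhere in the table, as a source or as a target.
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(p for p in all_pages
                  if p not in reached and not p.startswith(skip_prefixes))

# Hypothetical link table: page -> pages it links to.
LINKS = {
    "Main Page": ["Statistics", "User list"],
    "Statistics": ["Analysis of variance"],
    "User list": ["User: Branko", "User: Rootbeer"],
    "User: Branko": ["Hitchhiking"],
    "User: Rootbeer": ["Hitchhiking"],
    "Abu Zubaydah": ["Ibn al-Shaykh al-Libi"],
    "Ibn al-Shaykh al-Libi": ["Abu Zubaydah"],
}

print(effective_orphans(LINKS))
# ['Abu Zubaydah', 'Hitchhiking', 'Ibn al-Shaykh al-Libi']
```

Both the circular pair and the article linked only from user pages show up, even though each has incoming links and so would not appear on the ordinary orphans listing.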
Eclecticology