- Build a graph of Wikipedia articles in the main namespace, with
articles as vertices and wikilinks as edges. Since some pages are not reachable from other pages, this is actually N disconnected graphs.
- Remove all edges which refer to disambiguation pages, date pages, or
lists
- Remove the graph which contains the main page
- Produce a list of all remaining graphs.
Is that roughly correct?
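The staged filtering described in the question can be sketched with plain Python and a breadth-first search. The data model here is an assumption for illustration (`links` maps each article title to the set of titles it wikilinks to), not Golem's actual storage format:

```python
from collections import deque

def isolated_components(links, disambig, main_page):
    """Sketch: filter edges, then list connected components that do not
    contain the main page. `links`, `disambig`, `main_page` are hypothetical
    names for the adjacency dict, the excluded pages, and the main page."""
    # Stages 1-2: drop edges pointing at disambiguation/date/list pages.
    graph = {a: {b for b in tgts if b not in disambig}
             for a, tgts in links.items()}
    # Treat wikilinks as undirected for the connectivity check.
    undirected = {a: set(t) for a, t in graph.items()}
    for a, tgts in graph.items():
        for b in tgts:
            undirected.setdefault(b, set()).add(a)
    # Stages 3-4: find components, discard the one holding the main page.
    seen, components = set(), []
    for start in undirected:
        if start in seen:
            continue
        comp, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            if node in comp:
                continue
            comp.add(node)
            queue.extend(undirected[node] - comp)
        seen |= comp
        if main_page not in comp:
            components.append(comp)
    return components
```

Each returned component that is small (often a single article) corresponds to an isolated article or cluster.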
That is a roughly correct description of one of Golem's processing stages.
It performs this operation for both the main namespace and the category tree; the latter is analyzed to get information on cycles in it.
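A minimal sketch of the cycle check on the category tree, assuming a hypothetical dict mapping each category to its subcategories (Golem's actual implementation works on the database and is not shown here):

```python
def find_cycles(children):
    """Depth-first search with node coloring: a 'gray' node reached again
    while still on the current path closes a cycle. `children` is a
    hypothetical dict: category -> list of subcategories."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    cycles = []

    def dfs(node, path):
        color[node] = GRAY
        path.append(node)
        for sub in children.get(node, ()):
            state = color.get(sub, WHITE)
            if state == GRAY:           # back edge closes a cycle
                cycles.append(path[path.index(sub):] + [sub])
            elif state == WHITE:
                dfs(sub, path)
        path.pop()
        color[node] = BLACK

    for cat in children:
        if color.get(cat, WHITE) == WHITE:
            dfs(cat, [])
    return cycles
```

Note the recursive DFS assumes the tree depth stays within Python's recursion limit; an iterative version would be needed for very deep category chains.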
It also monitors a number of categories with isolated articles of various types and outputs files containing new isolates for each type. These files are then used with AWB to template the articles (adding them to categories of isolated articles).
It also generates lists of pages containing links to disambiguation pages, as well as a list of the most-linked disambiguation pages.
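Both disambiguation reports can be derived from the same adjacency data; a sketch under the same assumed data model as above:

```python
from collections import Counter

def disambig_reports(links, disambig):
    """Sketch: pages that link to disambiguation pages, and disambiguation
    pages ranked by how many pages link to them. `links` and `disambig`
    are hypothetical names, as before."""
    # Pages containing at least one link to a disambiguation page.
    linking_pages = {a: tgts & disambig
                     for a, tgts in links.items() if tgts & disambig}
    # Disambiguation pages ranked by incoming-link count.
    counts = Counter(d for tgts in linking_pages.values() for d in tgts)
    return linking_pages, counts.most_common()
```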
For each isolated article it tries to find linking suggestions of three types:
1. If an isolated article is linked from a disambiguation page, and that disambiguation page is itself linked from another article, it suggests checking whether the link should go directly to the isolated article.
2. If an isolated article has an interwiki link, its interwiki partner is linked from a page in the other language, and that linking page has a backlink to the home language, it suggests improving an existing article.
3. If, in the chain above, the linking page has no backlink to the home language, it suggests translating and linking.
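The first suggestion type can be sketched as a simple join over the link data; the function and parameter names are hypothetical, and the real tool works over the wiki database rather than in-memory dicts:

```python
def direct_link_suggestions(links, disambig, isolated):
    """Sketch of suggestion type 1: for each disambiguation page, pair the
    isolated articles it lists with the ordinary articles that link to it,
    so an editor can check whether the link should bypass the dab page."""
    suggestions = []
    for dab in disambig:
        # Isolated articles listed on this disambiguation page.
        targets = links.get(dab, set()) & isolated
        # Ordinary articles linking to the disambiguation page.
        linkers = {a for a, t in links.items()
                   if dab in t and a not in disambig}
        for art in targets:
            for src in linkers:
                suggestions.append((src, dab, art))
    return suggestions
```

Each `(source, dab, isolated)` triple is one suggestion: check whether `source`'s link to `dab` should point directly at `isolated`.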
All the suggestions are presented on the ts web page, and the templates mentioned above provide access from an article in the wiki to its suggestions list on the ts web server.
It also creates a list of users by the number of isolated articles they created (on the web page).
It also creates a list of isolated articles by creation date (old isolates have most probably lost their creator's attention).
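These two per-isolate reports amount to a count and a sort; a sketch assuming a hypothetical record shape of `(title, creator, creation_date)`:

```python
from collections import Counter

def isolate_reports(isolates):
    """Sketch: creators ranked by how many isolated articles they created,
    and isolates ordered oldest-first. The tuple layout is an assumption."""
    by_creator = Counter(creator for _, creator, _ in isolates).most_common()
    by_date = sorted(isolates, key=lambda rec: rec[2])  # oldest first
    return by_creator, by_date
```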
Depending on the existence of additional configuration, it also makes it possible to see which templates link to disambiguation pages.
mashiah