* Build a graph of Wikipedia articles in the main namespace, with
wikilinks as edges. Since some pages are not reachable from
others, this is actually N disconnected subgraphs.
* Remove all edges which refer to disambiguation pages, date pages, or
lists
* Remove the graph which contains the main page
* Produce a list of all remaining graphs.
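To make the question concrete, the pipeline above could be sketched
like this (a plain BFS over an in-memory link list; all names here
are illustrative, not Golem's actual code):

```python
from collections import defaultdict, deque

def connected_components(pages, links, excluded):
    """Split the wikilink graph into connected components.

    `pages` is a set of article titles, `links` a list of (src, dst)
    wikilinks, and `excluded` holds disambiguation/date/list pages
    whose edges are dropped (hypothetical names, not Golem's API).
    """
    adj = defaultdict(set)
    for src, dst in links:
        if src in excluded or dst in excluded:
            continue  # drop edges touching disambig/date/list pages
        adj[src].add(dst)
        adj[dst].add(src)  # treat links as undirected for reachability
    seen, components = set(), []
    for start in pages:
        if start in seen:
            continue
        comp, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            comp.add(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(comp)
    # discard the component that contains the main page
    return [c for c in components if "Main Page" not in c]
```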
Is that roughly correct?
That is a roughly correct description of one of Golem's processing
stages. It performs this operation for both the main namespace and
the category tree; for the category tree, the purpose is to get
information on cycles in it.
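The cycle check on the category tree could be sketched as a DFS with
gray/black coloring (hypothetical data shapes, not Golem's real
structures):

```python
def find_cycles(children):
    """Report cycles in a category tree given as {parent: [subcats]}.

    A back edge to a node still on the DFS path closes a cycle.
    Names are illustrative.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    cycles = []

    def dfs(node, path):
        color[node] = GRAY
        for sub in children.get(node, []):
            c = color.get(sub, WHITE)
            if c == GRAY:  # back edge: a cycle
                cycles.append(path[path.index(sub):] + [sub])
            elif c == WHITE:
                dfs(sub, path + [sub])
        color[node] = BLACK

    for node in children:
        if color.get(node, WHITE) == WHITE:
            dfs(node, [node])
    return cycles
```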
It also monitors a number of categories with isolated articles of
various types and outputs files containing the new isolates for each
type. Those files are then used with AWB to template the articles
(adding them to the categories containing isolated articles).
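The "new isolates per type" output amounts to a diff between two
runs; a minimal sketch (the type names and field layout are
assumptions):

```python
def new_isolates(previous, current):
    """Per-type diff of isolated articles between two runs.

    `previous` and `current` map an isolate type (e.g. "orphan") to
    the set of titles currently in that state; only titles that are
    new since the previous run are reported, sorted for stable files.
    """
    return {
        kind: sorted(current[kind] - previous.get(kind, set()))
        for kind in current
    }
```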
It also generates lists of pages that contain links to
disambiguation pages, and a list of the most-linked disambiguation
pages.
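Both disambiguation reports can be derived from the same link list;
roughly (illustrative names, not Golem's code):

```python
from collections import Counter

def disambig_reports(links, disambigs):
    """From (src, dst) wikilinks, build both reports: pages linking
    to disambiguation pages, and disambiguation pages ranked by
    incoming-link count."""
    hits = [(src, dst) for src, dst in links if dst in disambigs]
    linking_pages = sorted({src for src, _ in hits})
    most_linked = Counter(dst for _, dst in hits).most_common()
    return linking_pages, most_linked
```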
For each isolated article it tries to find linking suggestions of
three types:
1. if an isolated article is linked from a disambiguation page, and
that disambiguation page is linked from another article, it suggests
checking whether the link should go directly to the isolated article
2. if an isolated article has an interwiki link, its interwiki
partner is linked from a page in the other language, and that
linking page has a backward link to the mother language, it suggests
improving the existing article.
3. if, in the above chain, there is no backward link to the mother
language, it suggests translating and linking.
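The three rules above could be sketched as one classifier over
boolean facts about an isolated article (a hedged sketch with
invented parameter names, not Golem's actual code):

```python
def classify_suggestion(via_disambig, partner_is_linked, has_backlink):
    """Map the three suggestion types onto boolean facts:

    via_disambig      - a disambiguation page links the article and
                        is itself linked from another article
    partner_is_linked - the article's interwiki partner is linked
                        from some page in the other language
    has_backlink      - that linking page has an interwiki link back
                        to the mother language
    """
    if via_disambig:
        return 1  # check whether the link should go to the article directly
    if partner_is_linked and has_backlink:
        return 2  # improve the existing mother-language article
    if partner_is_linked:
        return 3  # translate the foreign page and link
    return None   # nothing to suggest
```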
All the suggestions are presented on the ts web page, and the
templates mentioned above provide access from an article in the wiki
to its suggestions list on the ts web server.
It also creates a list of users ranked by the number of isolated
articles they created (on the web page).
It also creates a list of isolated articles by creation date (old
isolates have most probably lost their creator's attention).
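Both of those web-page lists can be derived from one mapping of
isolate to its creator and creation time (a sketch; the field layout
is an assumption):

```python
from collections import Counter

def isolate_rankings(isolates):
    """Build both lists from {title: (creator, created_timestamp)}:
    users ranked by how many isolates they created, and isolates
    ordered oldest-first."""
    by_user = Counter(creator for creator, _ in isolates.values())
    by_age = sorted(isolates, key=lambda t: isolates[t][1])
    return by_user.most_common(), by_age
```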
Depending on the presence of additional configuration, it also shows
which templates link to disambiguation pages.
mashiah