-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

River Tarnell:
Could you (or someone) describe exactly what Golem does, and provide an example of its output format?
Okay, so based on lvova's presentation, it seems like it does this:
* Build a graph of Wikipedia articles in the main namespace, with articles as vertices and wikilinks as edges. Since some pages are not reachable from other pages, this is actually N disconnected subgraphs.
* Remove all edges which refer to disambiguation pages, date pages, or lists.
* Remove the subgraph which contains the main page.
* Produce a list of all remaining subgraphs.
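To make sure I'm describing the same thing, here is a minimal Python sketch of those steps as I understand them. It is not Golem's actual code. The function name `isolated_clusters` and its arguments are made up for illustration, and I'm assuming that links are treated as undirected when deciding whether pages are connected, and that "edges which refer to" an excluded page means links pointing at it.

```python
def isolated_clusters(articles, links, excluded, main_page):
    """Return the groups of connected articles that do not include main_page.

    articles:  iterable of main-namespace article titles (the vertices)
    links:     iterable of (source, target) title pairs (the wikilinks)
    excluded:  set of disambiguation, date and list page titles;
               links pointing at these are dropped (assumption)
    main_page: title of the main page
    """
    neighbours = {title: set() for title in articles}
    for source, target in links:
        if target in excluded:
            continue  # step 2: drop edges referring to excluded pages
        if source not in neighbours or target not in neighbours:
            continue  # ignore links to or from pages outside the article set
        # Assumption: direction is ignored when testing connectivity.
        neighbours[source].add(target)
        neighbours[target].add(source)

    seen = set()
    clusters = []
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first walk to collect everything connected to `start`.
        component = set()
        stack = [start]
        seen.add(start)
        while stack:
            page = stack.pop()
            component.add(page)
            for other in neighbours[page]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        if main_page not in component:  # step 3: discard the main page's graph
            clusters.append(sorted(component))
    return sorted(clusters)  # step 4: list of all remaining graphs


articles = ["Main Page", "A", "B", "C", "D", "E", "List of things"]
links = [
    ("Main Page", "A"),
    ("A", "B"),
    ("B", "List of things"),
    ("C", "D"),
    ("E", "List of things"),
]
print(isolated_clusters(articles, links, {"List of things"}, "Main Page"))
```

Under this reading, C and D form one cluster and E sits alone, because its only link went to a list page. The list page itself also ends up as its own one-page cluster, which may or may not be what the real tool reports.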
Is that roughly correct?
- river.