[Toolserver-l] Golem issues
River Tarnell
river.tarnell at wikimedia.de
Wed Mar 31 15:25:26 UTC 2010
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
River Tarnell:
> Could you (or someone) describe exactly what Golem does, and provide an
> example of its output format?
Okay, so based on lvova's presentation, it seems like it does this:
* Build a graph of Wikipedia articles in the main namespace, with
articles as vertices and wikilinks as edges. Since some pages are not
reachable from other pages, this is actually N disconnected graphs.
* Remove all edges which refer to disambiguation pages, date pages, or
lists
* Remove the graph which contains the main page
* Produce a list of all remaining graphs.
Is that roughly correct?
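
To make the steps above concrete, here is a minimal sketch in Python. It
only illustrates the description in this message and is not Golem's actual
code or output format. The function name, its parameters, and the choice to
treat wikilinks as undirected when deciding reachability are all assumptions
made for illustration. Pages in the excluded set stay in the graph as
vertices, so they may show up as single-page graphs; whether Golem drops
them entirely is not stated here.

```python
from collections import deque


def remaining_components(articles, links, excluded, main_page):
    """Return the disconnected graphs that do not contain the main page.

    articles:  iterable of article titles (main namespace only)
    links:     iterable of (source, target) wikilink pairs
    excluded:  set of titles for disambiguation, date and list pages
    main_page: title of the main page
    All four parameters are hypothetical inputs for this sketch.
    """
    # Step 1: one vertex per article, one edge per wikilink.
    adjacency = {title: set() for title in articles}
    for source, target in links:
        # Step 2: drop edges that refer to an excluded page.
        if target in excluded:
            continue
        # Assumption: links count in both directions for reachability.
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            page = queue.popleft()
            component.add(page)
            for neighbour in adjacency[page]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        # Step 3: discard the graph that contains the main page.
        if main_page not in component:
            components.append(component)
    # Step 4: everything left is the list of remaining graphs.
    return components
```

If the links were meant to be followed in one direction only, the
traversal would need to change (for example, to strongly connected
components), so treat the above as one possible reading.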
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (HP-UX)
iEYEARECAAYFAkuzaWYACgkQIXd7fCuc5vKQ6QCfQuzTrBIUk4XRqKzwhqrzC9Tg
5DgAn0kLgC/0gkDiVS0M66WMiMSlhnvT
=VrXF
-----END PGP SIGNATURE-----