[Toolserver-l] Golem issues

River Tarnell river.tarnell at wikimedia.de
Wed Mar 31 15:25:26 UTC 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

River Tarnell:
> Could you (or someone) describe exactly what Golem does, and provide an
> example of its output format?

Okay, so based on lvova's presentation, it seems like it does this:

* Build a graph of Wikipedia articles in the main namespace, with
  articles as vertices and wikilinks as edges.  Since some pages are
  not reachable from other pages, this is actually N disconnected
  graphs.
* Remove all edges which refer to disambiguation pages, date pages, or
  lists.
* Remove the graph which contains the main page.
* Produce a list of all remaining graphs.

Is that roughly correct?
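
To make the steps above concrete, here is a rough Python sketch of how
I read them.  It is not Golem's actual code.  The function name, its
parameters and the sample titles are all made up for illustration, and
it rests on two assumptions of mine: links are treated as undirected
when grouping pages, and an excluded page (disambiguation, date, list)
is dropped together with every link that touches it.

```python
from collections import defaultdict, deque


def isolated_clusters(articles, links, excluded, main_page):
    """Return the groups of articles not connected to main_page.

    articles  -- iterable of main-namespace titles (hypothetical input)
    links     -- iterable of (source, target) wikilink pairs
    excluded  -- titles of disambiguation pages, date pages and lists
    main_page -- title of the main page
    """
    # Assumption: excluded pages are dropped as vertices too.
    keep = set(articles) - set(excluded)

    # Assumption: links are treated as undirected for grouping.
    neighbours = defaultdict(set)
    for source, target in links:
        if source in keep and target in keep:
            neighbours[source].add(target)
            neighbours[target].add(source)

    seen = set()
    clusters = []
    for start in sorted(keep):
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            page = queue.popleft()
            for other in neighbours[page]:
                if other not in seen:
                    seen.add(other)
                    component.add(other)
                    queue.append(other)
        # Skip the component that contains the main page.
        if main_page not in component:
            clusters.append(sorted(component))
    return clusters


if __name__ == "__main__":
    sample_articles = ["Main Page", "A", "B", "C", "D", "2010", "List of X"]
    sample_links = [
        ("Main Page", "A"),
        ("A", "2010"),
        ("2010", "B"),
        ("B", "C"),
        ("D", "List of X"),
    ]
    print(isolated_clusters(sample_articles, sample_links,
                            {"2010", "List of X"}, "Main Page"))
```

In the sample data, B and C hang together only through the date page
"2010", so once that page is excluded they form their own group, and D
is left alone once its link to the list page is dropped.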

	- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (HP-UX)

iEYEARECAAYFAkuzaWYACgkQIXd7fCuc5vKQ6QCfQuzTrBIUk4XRqKzwhqrzC9Tg
5DgAn0kLgC/0gkDiVS0M66WMiMSlhnvT
=VrXF
-----END PGP SIGNATURE-----


