Very cool, nice work, especially with the interwiki integration. I've messed around a bit with viewing MediaWiki link graphs using Cytoscape, software designed for visualizing molecular interactions. But only a _very_ modern machine (I'm talking about a maxed-out Mac Pro here :) could handle the English Wikipedia, and certainly not the entire interwiki link graph as you have done. Still, you can do lots of different kinds of visualizations with it. Here's a good description:
http://www.mkbergman.com/?p=415
I also attached a Cytoscape file that will let you visualize Scholarpedia. In case the list scrubs it: http://filebin.ca/zmtoy/scholarpedia.cys
(not to hijack your thread - I thought you would find it interesting :)
Cheers, Brian
On Mon, Mar 17, 2008 at 4:46 PM, Lukasz Bolikowski bolo@icm.edu.pl wrote:
Hi,
I've written a visual tool for analyzing the graph of interlanguage links between all 256 editions of Wikipedia.
Its main advantages, compared to bots, are:
- it analyzes the whole inconsistent component at once, while bots tend to work "locally" (in some neighborhood of an article);
- cool (IMHO) graph visualization;
- concrete recommendations: remove a link, split an article, merge articles, remove redirects.
To stress the advantage of "global" vs. "local" analysis of a component: the largest connected component in the graph contains over 48'000 articles, mixing over 2'500 different subjects. Some of the sources of semantic drift in such components are not visible "locally".
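To make the "global" vs. "local" point concrete, here is a minimal sketch in Python. It is not the tool's actual code. It assumes a component counts as inconsistent when it contains more than one article from the same language edition; the function names and the sample links are made up for illustration.

```python
from collections import Counter, defaultdict


def connected_components(links):
    """Group articles into connected components, treating links as undirected."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    components = []
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first walk over everything reachable from `start`.
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


def duplicated_languages(component):
    """Return the language codes that occur more than once in a component."""
    counts = Counter(lang for lang, _title in component)
    return sorted(lang for lang, n in counts.items() if n > 1)


# Hypothetical sample data: each article is a (language, title) pair.
links = [
    (("en", "Gauge"), ("de", "Spurweite")),
    (("de", "Spurweite"), ("pl", "Rozstaw szyn")),
    (("pl", "Rozstaw szyn"), ("en", "Track gauge")),
    (("en", "Moon"), ("pl", "Księżyc")),
]

for component in connected_components(links):
    dupes = duplicated_languages(component)
    status = "inconsistent" if dupes else "consistent"
    print(status, sorted(component), dupes)
```

In this made-up sample no single link looks wrong when inspected alone. Only after the whole component is collected does it show up that two "en" articles ended up in the same group, which is the kind of drift a purely local check can miss.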
Main disadvantages:
- works on preprocessed dumps, instead of "live" Wikipedia, so the recommendations may be outdated;
- (for the moment) does not recognize some of the redirects, due to poor quality of redirect dumps. Apparently I'm not the only one affected by the problem, and the guys at wikitech-l are aware of the issue;
- requires Java 6, eats a lot of resources (512M seems to be enough even for the largest case);
- doesn't change anything (points to the possible sources of problems instead).
The tool is far from complete; "prototype" would be a more appropriate name (its original purpose was to help me evaluate some ideas for my PhD). Please try it and send me your feedback; I'd like to make it more useful for the community.
You can find the tool here: http://wikitools.icm.edu.pl/
Regards, Bolo1729
WikiEN-l mailing list
WikiEN-l@lists.wikimedia.org
To unsubscribe from this mailing list, visit: https://lists.wikimedia.org/mailman/listinfo/wikien-l