This is excellent. We desperately need this on Wikisource and Wikibooks. You might have saved us a lot of trouble.
On Tue, Mar 18, 2008 at 12:46 AM, Lukasz Bolikowski bolo@icm.edu.pl wrote:
Hi,
I've written a visual tool for analyzing the graph of interlanguage links between all 256 editions of Wikipedia.
Its main advantages, compared to bots, are:
- it analyzes the whole inconsistent component at once, while bots tend to work "locally" (in some neighborhood of an article);
- cool (IMHO) graph visualization;
- concrete recommendations: remove a link, split an article, merge articles, remove redirects.
To stress the advantage of "global" vs. "local" analysis of a component: the largest connected component in the graph contains over 48'000 articles, mixing over 2'500 different subjects. Some of the sources of semantic drift in such components are not visible "locally".
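To give a rough idea of what I mean by "global" analysis, here is a tiny sketch in plain Java (not the actual tool code, and the article titles are made up): build the undirected graph of interlanguage links, compute its connected components with a breadth-first search, and flag any component that contains two or more articles from the same language edition, since a single subject should have at most one article per edition.

import java.util.*;

// Rough sketch (not the actual tool code) of the "global" analysis:
// build the undirected graph of interlanguage links, find its connected
// components, and flag components that contain two or more articles
// from the same language edition -- a sign that distinct subjects
// have drifted into one component.
public class ComponentSketch {

    public static void main(String[] args) {
        // Toy data; the last link illustrates semantic drift.
        Map<String, Set<String>> graph = new HashMap<String, Set<String>>();
        link(graph, "en:Mercury (planet)", "de:Merkur (Planet)");
        link(graph, "de:Merkur (Planet)", "pl:Merkury");
        link(graph, "pl:Merkury", "en:Mercury (element)");

        for (Set<String> component : components(graph)) {
            // Count articles per language edition; more than one article
            // in the same edition means the component mixes subjects.
            Map<String, Integer> perLang = new HashMap<String, Integer>();
            for (String article : component) {
                String lang = article.substring(0, article.indexOf(':'));
                Integer n = perLang.get(lang);
                perLang.put(lang, n == null ? 1 : n + 1);
            }
            boolean inconsistent = false;
            for (int n : perLang.values())
                if (n > 1) inconsistent = true;
            System.out.println((inconsistent ? "INCONSISTENT " : "ok ") + component);
        }
    }

    // Adds an undirected edge between two articles ("lang:Title" strings).
    static void link(Map<String, Set<String>> g, String a, String b) {
        if (!g.containsKey(a)) g.put(a, new HashSet<String>());
        if (!g.containsKey(b)) g.put(b, new HashSet<String>());
        g.get(a).add(b);
        g.get(b).add(a);
    }

    // Connected components via breadth-first search.
    static List<Set<String>> components(Map<String, Set<String>> g) {
        List<Set<String>> result = new ArrayList<Set<String>>();
        Set<String> seen = new HashSet<String>();
        for (String start : g.keySet()) {
            if (!seen.add(start)) continue;
            Set<String> comp = new HashSet<String>();
            Deque<String> queue = new ArrayDeque<String>();
            queue.add(start);
            comp.add(start);
            while (!queue.isEmpty()) {
                String cur = queue.poll();
                for (String next : g.get(cur)) {
                    if (seen.add(next)) { comp.add(next); queue.add(next); }
                }
            }
            result.add(comp);
        }
        return result;
    }
}

A bot looking only at the neighborhood of one article would never see both "en" articles in that component at once; walking the whole component makes the conflict obvious.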
Main disadvantages:
- works on preprocessed dumps, instead of "live" Wikipedia, so the recommendations may be outdated;
- (for the moment) does not recognize some of the redirects, due to poor quality of redirect dumps. Apparently I'm not the only one affected by the problem, and the guys at wikitech-l are aware of the issue;
- requires Java 6 and eats a lot of memory (512M seems to be enough even for the largest case);
- doesn't change anything itself (it points to the possible sources of problems instead).
The tool is far from complete; "prototype" would be a more appropriate name (its original purpose was to help me evaluate some ideas for my PhD). Please try it and send me your feedback, as I'd like to make it more useful for the community.
You can find the tool here: http://wikitools.icm.edu.pl/
Regards, Bolo1729