On Mon, Jul 22, 2002 at 11:56:58PM +0100, Neil Harris wrote:
Tomasz Wegrzanowski wrote:
The topology of Wikipedias is very interesting.
The first question is: what is the distribution of the number of hops needed to
reach an article from the Main page?
The attached script gives an approximate answer to this question.
It requires the PHP database and libmysql-ruby.
Data for Polish Wikipedia:
-1 602 (12.75964392%)
0 1 (0.02119542179%)
1 113 (2.395082662%)
2 886 (18.7791437%)
3 2367 (50.16956337%)
4 600 (12.71725307%)
5 126 (2.670623145%)
6 16 (0.3391267486%)
7 5 (0.1059771089%)
8 2 (0.04239084358%)
Total 4718
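
For readers without the attachment: a distribution like the one above can be
computed with a breadth-first search starting at the Main page. What follows is
only a minimal sketch in Python, not the attached Ruby script. The `links`
dict and the function name `hop_distribution` are hypothetical stand-ins for
whatever the link data in the database looks like, and treating -1 as "never
reached from the Main page" is an assumption about how the first row of the
table should be read.

```python
from collections import Counter, deque

def hop_distribution(links, root):
    """Count pages by number of hops from root; -1 means never reached (assumed)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            # Skip links to pages that do not exist and pages already visited.
            if target in links and target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return Counter(dist.get(page, -1) for page in links)

# Hypothetical toy graph: page title -> titles it links to.
links = {
    "Main": ["A", "B"],
    "A": ["C"],
    "B": ["C"],
    "C": [],
    "X": ["Y"],  # X and Y link only to each other
    "Y": ["X"],
}

hist = hop_distribution(links, "Main")
total = sum(hist.values())
for hops in sorted(hist):
    print(f"{hops} {hist[hops]} ({100 * hist[hops] / total:.4g}%)")
print("Total", total)
```

Breadth-first order guarantees that the first time a page is seen, it is seen
at its minimum hop count, so no page needs to be revisited.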
Interesting. The English-language Wikipedia claims only 313 orphans
(< 1%) out of 34457 articles, not counting redirects or non-comma articles.
Maybe there is a 'closure' effect as the encyclopedia gets bigger? Or
maybe 'real' articles are more likely to be linked?
The orphan count is different: it is 175 on the Polish Wikipedia.
The orphan count doesn't include redirects or empty, user, and talk pages.
That's good.
But if some group of articles link to each other but are not linked
from any article outside of the group, then the orphan count doesn't
include them. They're also not accessible, though, so it should.
Oh, I see. They're disconnected sub-graphs not reachable from the root.
That's interesting.
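
To make the difference concrete, here is a small Python sketch that computes
both sets on a toy graph. The `links` dict and the function names are
hypothetical, and "orphan" is taken to mean "no inbound link from any other
page", which is an assumption about how the orphan count is defined.

```python
def orphans(links):
    """Pages that no other page links to (assumed definition of 'orphan')."""
    linked = {t for page, targets in links.items() for t in targets if t != page}
    return set(links) - linked

def unreachable(links, root):
    """Pages that cannot be reached by following links from root."""
    seen = {root}
    stack = [root]
    while stack:
        for target in links.get(stack.pop(), ()):
            if target in links and target not in seen:
                seen.add(target)
                stack.append(target)
    return set(links) - seen

# Hypothetical toy graph: page title -> titles it links to.
links = {
    "Main": ["A"],
    "A": ["Main"],
    "X": ["Y"],  # X and Y link to each other, but nothing else links to them
    "Y": ["X"],
    "Z": [],     # nothing links to Z at all
}

print(sorted(orphans(links)))
print(sorted(unreachable(links, "Main")))
print(sorted(unreachable(links, "Main") - orphans(links)))
```

In this toy graph, X and Y each have an inbound link, so they are not orphans,
yet neither can be reached from Main.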
I wonder what the equivalent figures are for the English-language Wikipedia?
Neil