Neta Livneh, 18/01/2015 19:57:
I think this is a better version.
Thanks. I think the way to read this graph is that it's naturally darker below the diagonal line and lighter above it. In fact, position (x, y) is the percentage of articles in wiki x which also exist in wiki y. If y > x we can't reach 100 %; for y >> x, we approach zero. So the things worth noting are mostly the dark areas above the line and the white areas below it. Well-known botpedias (ceb and war) clearly stand out, and to a lesser extent also nl and sv. If you ordered the wikis by pageviews (as per the www.wikipedia.org top 10) the shading would look more natural (but we'd lose information, unless you redefined the colouring). A non-mystery is the strong correlation between sh and sr: that's basically the same language and they have a similar size. A weird thing is the status of "min": you'd expect it to have a stronger correlation with zh; I'd call that a gap to fill. The horizontal lines for ja and vi also stand out: we rarely see users from those wikis, they're more isolated. The vertical lines above (uz, vo) often come with surprises: probably some common bulk of bot-created articles. The dark spots in the vertical line above pms are an anthology of secessionist/regional/nostalgic languages; not a surprise given the interests of the core editors.
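To make the definition of position (x, y) concrete, here is a minimal sketch on toy data. It is not how the graph was actually produced: the wiki codes "aa", "bb", "cc", the integer item ids and the function name overlap_matrix are all made up for illustration, and I'm assuming the axes are ordered by decreasing article count, which is my reading of "y > x" rather than something stated.

```python
def overlap_matrix(articles):
    """Return (ordered wiki codes, {(x, y): percentage}).

    `articles` maps a wiki code to the set of language-independent ids
    of its articles (hypothetical toy ids here). Cell (x, y) is the
    percentage of articles in wiki x that also exist in wiki y.
    """
    # Assumption: wikis are ordered from largest to smallest.
    wikis = sorted(articles, key=lambda w: len(articles[w]), reverse=True)
    matrix = {}
    for x in wikis:
        for y in wikis:
            if not articles[x]:
                matrix[(x, y)] = 0.0  # avoid dividing by zero for an empty wiki
                continue
            shared = len(articles[x] & articles[y])
            matrix[(x, y)] = 100.0 * shared / len(articles[x])
    return wikis, matrix


# Toy data: three made-up wikis with 10, 5 and 2 articles.
toy = {
    "aa": set(range(1, 11)),
    "bb": {1, 2, 3, 4, 20},
    "cc": {1, 2},
}
wikis, matrix = overlap_matrix(toy)

for x in wikis:
    print(x, [round(matrix[(x, y)]) for y in wikis])
```

Under that ordering, cell (x, y) can never exceed 100 * size(y) / size(x), so a much smaller y forces the value towards zero whatever the content, while the opposite direction can reach 100 %. That is why only the exceptions to this pattern are informative.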
Nemo