Cebuano and Waray are definitely outliers because they are bot Wikipedias: about 70 % and 95 % of their articles, respectively, were created by bots. http://stats.wikimedia.org/EN/BotActivityMatrixCreates.htm The sv Wikipedia should soon reach about 75 % bot creations and nl is fairly stable at around 50-60 %, which explains most of the weird clusters. For your left-to-right ordering "by size", you should use "Usage" rather than number of articles, because when the two differ too much something is wrong. http://stats.wikimedia.org/EN/Sitemap.htm For instance, of the Wikipedias ranked 11-20 by number of articles, only 2 are in the official www.wikipedia.org top 20 (which is by usage).
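To make the ordering suggestion concrete, here is a rough sketch of what I mean: sort by usage, and flag any wiki whose rank by article count is far from its rank by usage. The wiki codes, the numbers, the field names and the `max_rank_gap` threshold are all made-up placeholders for illustration, not real statistics and not anything the stats pages define.

```python
# Hypothetical placeholder data: codes and figures are invented, NOT real stats.
wikis = [
    {"code": "w1", "articles": 900, "usage": 100},  # many articles, little usage
    {"code": "w2", "articles": 800, "usage": 900},
    {"code": "w3", "articles": 700, "usage": 800},
    {"code": "w4", "articles": 600, "usage": 700},
    {"code": "w5", "articles": 500, "usage": 600},
]


def rank(items, key):
    """Map each wiki code to its 1-based rank by `key`, largest value first."""
    ordered = sorted(items, key=lambda w: w[key], reverse=True)
    return {w["code"]: i + 1 for i, w in enumerate(ordered)}


def order_by_usage(items, max_rank_gap=2):
    """Return (code, suspicious) pairs ordered by usage, largest first.

    A wiki is marked suspicious when its article-count rank and its usage
    rank differ by more than `max_rank_gap` (an arbitrary, assumed threshold).
    """
    by_articles = rank(items, "articles")
    by_usage = rank(items, "usage")
    ordered = sorted(items, key=lambda w: w["usage"], reverse=True)
    return [
        (w["code"], abs(by_articles[w["code"]] - by_usage[w["code"]]) > max_rank_gap)
        for w in ordered
    ]


result = order_by_usage(wikis)
for code, suspicious in result:
    print(code, "<- check for bot creations" if suspicious else "")
```

With the placeholder data only w1 is flagged, since it is first by articles but last by usage. In practice you would fill in the real numbers from the stats pages and pick the threshold by eye.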
Nemo