Thanks for the reply Jimmy.
I can imagine how it is very important to get the figures right, and with all the different data we are seeing, the maxim "lies, damn lies and statistics" never seemed truer!
1) I absolutely agree that I shouldn't have concentrated only on en. Overall growth is higher than en growth.
2) I would like to be able to trust the Webalizer stats more. The figures do seem to be a bit all over the place. The following annualized figures show the different growth rates for the different statistics from February to July:
By hits: 999% annually
By files: 1095% annually
By pages: 257% annually
By visits: 9% annually
The same figures from March to July:
By hits: 1049% annually
By files: 996% annually
By pages: 860% annually
By visits: -6% annually
I do not know how to reconcile the hits/files/pages with the visits figures. I guess you covered that with your chat with the developers yesterday?
Over the same period of time, the number of active wikipedians (i.e. wikipedians making 5 edits or more) is, as we agree, roughly stable (since the huge jump at the end of February). ''If'' we make the assumption that the no. of wikipedians is constantly proportional to the number of visitors, then these growth rates (>800%) cannot be true. Now it stands to reason that over time, as the encyclopedias mature, the visitor to editor ratio should increase. But by that much? My instincts say no.
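To put a rough number on that instinct, here is a minimal sketch of how much the visitor to editor ratio would have to rise if editor numbers stayed flat while traffic grew at an annualized 800%. It assumes smooth compounding and treats February to July as five months; both are simplifications, and the function name is made up for illustration.

```python
def growth_factor(annual_growth_pct: float, months: float) -> float:
    """Multiple reached after `months` at a given annualized growth percentage."""
    annual_multiple = 1 + annual_growth_pct / 100
    return annual_multiple ** (months / 12)

# Assumption: 800% annualized traffic growth sustained for 5 months,
# with the number of active editors held constant over that time.
ratio_increase = growth_factor(800, 5)
print(f"visitor to editor ratio would rise about {ratio_increase:.1f}x")
```

Under those assumptions the ratio would need to rise roughly two and a half times in five months.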
Alexa has problems, to be sure. Looking at a spread of sites, it appears that virtually all English-language sites are going down, at the expense of a mix of Sino-Japanese sites. This is likely to reflect how the Alexa toolbar is distributed more than anything else. In that sense WP may be doing extremely well by staying more or less constant for the last three or four months. (That does appear to be the case even taking the logarithmic scale into account: the +86 appears to be the change in the last three months' average against the previous three months, i.e. comparing Jan-March to April-June, which seems OK.) WP is "swimming against the tide" and doing well.
Key conclusion: We still have growth, it is rapid, and it is still limited more by server capacity than by any limit to natural demand. However, the 90% per quarter (1300% per year) estimate seems excessive.
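For reference, a quick sketch of how a quarterly rate compounds into an annual one. It assumes the same 90% growth in each of four consecutive quarters; the function name is only illustrative.

```python
def compound(rate_per_period: float, periods: int) -> float:
    """Total growth (as a fraction) after compounding a per-period rate."""
    return (1 + rate_per_period) ** periods - 1

annual_growth = compound(0.90, 4)      # 90% per quarter, four quarters
annual_multiple = 1 + annual_growth    # end-of-year traffic / start-of-year traffic

print(f"multiple: {annual_multiple:.1f}x, growth: {annual_growth:.0%}")
# prints "multiple: 13.0x, growth: 1203%"
```

So 90% per quarter works out to about a 13-fold increase over a year, which is roughly where the 1300% figure comes from (strictly, a 13x multiple is about 1200% growth).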
Pete
p.s. I agree with the other posters that it is great (in one sense) that the mirrors are "shouldering the burden". Would "360 million pageviews per month, probably at least half a billion when other users of our free content are taken into account, extremely confident it will be a billion by Christmas" be snappy enough for journalists?
Jimmy (Jimbo) Wales wrote:
Pete, I think you're not reading the evidence correctly, but I think that we need to make sure we get this right. Forecasting the future is tricky enough, and we can only do it well if we understand the past.
So I invite real scrutiny of all these numbers, because I really need to know the right answer.
- The Webalizer pages: (http://wikimedia.org/stats/en.wikipedia.org/)
Look at pageviews per day:
February 2004: 3.01 million
June 2004: 5.28 million
For the intervening months, there are significant problems with the averages due to some missing data. (For example, several days in April are in the dataset and part of the denominator but clearly have missing data, for example showing 94 pageviews in a day.)
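As a rough check on those two endpoints, the sketch below turns the February and June daily averages into an implied monthly and annualized growth rate. It assumes smooth compounding over the four months between them and covers en.wikipedia.org only; the variable names are illustrative.

```python
feb_daily = 3.01e6   # pageviews per day, February 2004 (en only)
jun_daily = 5.28e6   # pageviews per day, June 2004 (en only)
months = 4           # assumption: February to June counted as 4 months

ratio = jun_daily / feb_daily
monthly_growth = ratio ** (1 / months) - 1      # average growth per month
annualized_growth = ratio ** (12 / months) - 1  # same pace held for 12 months

print(f"growth over the period: {ratio - 1:.0%}")
print(f"implied monthly growth: {monthly_growth:.1%}")
print(f"implied annualized growth: {annualized_growth:.0%}")
```

On these assumptions en alone comes out at roughly 15% per month, or somewhere around 440% annualized.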
Additionally, it is a very big mistake when talking about growth to focus solely on en. En is not the fastest growing wikipedia, in terms of traffic percentages.
You can see the growth better here: http://www.wikipedia.org/wikistats/EN/TablesUsagePageRequest.htm
Notice in particular the dramatic spike in June which is continuing unabated into July. This spike, I believe, corresponds with the acquisition of new servers.
- The Wikipedia Statistics pages
(http://www.wikipedia.org/wikistats/EN/TablesWikipediaEN.htm)
A better page (to get the global perspective) is here: http://www.wikipedia.org/wikistats/EN/TablesWikipediansNew.htm
It does seem true that the number of new wikipedians peaked in March, but notice the big spike in de.wikipedia which distorts the statistic to some degree. en has experienced similar spikes before, and the subsequent decline after a spike was not indicative of the long term trend, which continues generally to be strongly positive.
- Alexa
- see the following graph
http://www.alexa.com/data/details/traffic_details?&range=6m&size=lar... which shows Wikipedia has not grown in the last four/five months.
Given that we know traffic has more than doubled in that time frame, this bears taking a closer look.
I think this picture is more informative: http://www.alexa.com/data/details/traffic_details?&range=2y&size=lar...
Here, you have to remember that the chart is logarithmic, *and* that increases in rank require different increases in traffic at different scales. That is, to go from #10,000 to #1,000 is less of an achievement than to go from #1,000 to #100. (Almost any website can spike into the top 1,000 if it gets press coverage of the right sort.)
Look at that picture and recognize that if we blipped up to #1, it would look like a minor blip, due to the logarithmic nature of the chart.
Alexa shows our 3 mos change as being +86 (in rank), and since we are already in the ballpark of 600, that represents a substantial traffic increase.
And finally, Alexa numbers are fun to look at, but they do not accurately represent real traffic numbers. If you look up Bomis on there, you'll see a precipitous decline -- but from our perspective (at Bomis), traffic was stable for that same time period. (Of course being stable while everyone else is growing is one possible explanation, but another explanation is that Alexa stats are questionable anyway.)
After a chat with the developers yesterday, I'm comfortable (for now) with the number of 360 million pageviews per month. If that's wrong, I need to know soon, because I'm going to say that number in public.
--Jimbo