Lars Aronsson wrote:
When I asked in what way, if any, the Swedish chapter
could help
the toolserver project, the short answer I got was that a new
server would cost 8000 euro and a year's salary for an admin would
cost even more. If we had 8000 euro to spend (we don't) and
nobody asked us what we use the money for, maybe we could get away
with that. But when we ask people for donations, they like to
know what it's good for. So what is the toolserver good for,
really? I think I know this, but how can I explain it to others?
Find some tools people are using a lot. Wiki MiniAtlas, CatScan, CheckUsage,
SulTool, etc.
The annual report (Tätigkeitsbericht) for 2007 for
Wikimedia
Deutschland mentions that 60,000 euro was used to double squid
capacity and 10,000 euro was used for the toolserver cluster.
The report for 2008 should be due soon, I guess.
It's due soon, yes. I wrote it. It doesn't have many technical details -
basically, it says that the investment in squids paid off, and we won't need
new squids until the end of the year. About the toolserver it says that we now
have one DB server per wiki cluster, have extended web server capacity to match
demand, and have moved batch jobs and bots to a separate server (which is
nearly overloaded already, by the way).
I think it could be useful if the toolserver project
could find a
way to explain its current capacity, what kind of services this
capacity is used for, and how much this capacity could be
increased by a certain donation. We could make a little folder or
poster that explains this, and distribute among our fans.
Who would? This is a lot of work. It would certainly be nice to have, and maybe
it would also lead people to donate. But I don't know if dead wood is really
that useful. Perhaps some of our tools could tell people more directly that
they are using the toolserver.
The easiest would perhaps be to count the HTTP
requests. How many
requests (or page views) does the toolserver receive per year,
day, or second? How are they distributed over the various
services? Is it possible to count how many unique visitors (IP
addresses) each service has? Divided by language or referer
language of Wikipedia? Could the toolserver produce something
similar to Domas' hourly visitor statistics for Wikipedia?
Access logs for the last 7 days are in /var/log/http on wolfsbane. Webalizer
reports for this should be at <http://toolserver.org/~daniel/stats/> but they
appear to be broken - webalizer seems to have problems with very large files.
Stats for the old apache server are in
<http://toolserver.org/~daniel/stats-old/>. I only now remembered to change the
stats to look at the ZWS logs - damn, I was sure I had done that months ago. Sorry.
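For illustration, counting requests and unique IPs per tool from such access
logs could be sketched roughly like this. This is a minimal sketch, not what
webalizer or tstoc actually do: it assumes Apache/ZWS combined log format and
assumes tool URLs follow the /~username/ convention.

```python
import re
from collections import defaultdict

# Matches the client IP and request path of a combined-format log line.
# Assumption: logs are in Apache combined format; adjust for ZWS if needed.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (\S+)')

def tally(lines):
    """Return {tool: (request_count, unique_ip_count)} for an iterable of log lines."""
    hits = defaultdict(int)
    ips = defaultdict(set)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        ip, path = m.groups()
        # Group by tool owner, e.g. /~interiot/cgi-bin/tstoc -> ~interiot
        tool = path.split("/")[1] if path.startswith("/~") else "(other)"
        hits[tool] += 1
        ips[tool].add(ip)
    return {tool: (hits[tool], len(ips[tool])) for tool in hits}
```

Something like this, run over the files in /var/log/http, would give the
per-tool breakdown Lars is asking about, without depending on webalizer
coping with very large files.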
There's already
http://toolserver.org/%7Einteriot/cgi-bin/tstoc
where the right-most column indicates the number of individual IP
addresses in the last 7 days. Is there a log of this data? It
must be wrong (surely?) that only 8 tools had any accesses this
week. It could be correct that "21933 unique IP's visited the
toolserver this week", but that number sounds rather low.
I suspect it's looking at the Apache logs instead of the ZWS logs. Or it's
simply broken, who knows.
The toolserver hopefully spends far more processor
cycles for each
HTTP request than a squid cache. But just how many more? Every
server has an economic life of maybe 3 years, while its purchasing
price is written off, and it serves a number of HTTP requests in
that time. So what is the hardware (and hosting) cost of each
request?
Hard to tell, depends on what you include. Do you include resources used for
bots in the cost of web requests?
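The back-of-the-envelope arithmetic Lars suggests could look like this, using
the 8000 euro server price and 3-year write-off period from the thread. The
request rate here is a made-up placeholder, not a measured figure, and hosting
and admin costs are left out entirely.

```python
# Hardware cost per HTTP request, as a rough sketch.
server_price_eur = 8000.0      # price quoted in the thread
lifetime_years = 3             # economic life / write-off period
requests_per_second = 10.0     # placeholder assumption, NOT a measurement

seconds = lifetime_years * 365 * 24 * 3600
total_requests = requests_per_second * seconds
cost_per_request = server_price_eur / total_requests
print(f"{cost_per_request * 100:.6f} euro-cents per request")
```

Of course the real difficulty, as noted above, is deciding what to include:
if bot and batch resources are folded into the web-request cost, the number
changes substantially.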
Of course, HTTP requests is not everything. Is there
a way we
could measure other uses of the toolserver?
We could look at the number of bot jobs running on rosemary -
http://ganglia.toolserver.org/ is intended to provide this information, but
it's broken because of some technical nastiness.
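While ganglia is broken, one crude stand-in would be tallying processes per
user from `ps` output. This is just a sketch: the function takes the text as
an argument to stay self-contained, but in practice it would be fed the output
of something like subprocess.check_output(["ps", "-eo", "user="]).

```python
from collections import Counter

def jobs_per_user(ps_output):
    """Count processes per user from the output of `ps -eo user=`.

    Each non-empty line of the input is assumed to be one username.
    """
    users = [line.strip() for line in ps_output.splitlines() if line.strip()]
    return Counter(users)
```

It wouldn't distinguish bots from shells, but it would give a first
approximation of who is running how much on the batch server.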
Getting back to my favorite: The geographic
applications. Are the
map tiles for the WikiMiniAtlas served from the toolserver?
They are served from the stable cache on willow, aka
stable.toolserver.org.
How
much power does this currently use for the Swedish Wikipedia?
Hard to tell - how would we know if something is loaded "for" the Swedish
Wikipedia? But you can look at the Apache logs on stable, I suppose.
What
if the maps were shown inline for every visitor, instead of as a
pop-up that very few people click on? How much extra would that
cost?
We can only guess. But note that just today, a cooperation project with
OpenStreetMap was launched to make dynamic, zoomable inline maps in Wikipedia
happen. They will get extra hardware for that.
Could such a service still run on the toolserver, or
would
it need to be moved to the servers in Florida?
It doesn't really matter whether it's in Florida or Amsterdam. But it would
need dedicated hardware. When this goes full scale, I expect to see the tiles
stored in both places, probably on our big file servers that also contain all
other images.
-- daniel