Hi all,
Sorry if this has been brought up before (I don't read wikitech-l regularly), but I just had an idea for reducing the server load at Wikipedia without messing around with distributed databases and the like: why not distribute static images and media files for starters?
http://de.wikipedia.org/upload/9/9a/WikiReader_Internet.pdf alone will be responsible for more than 5 GB of transfer volume this month (a guess based on my experience with http://tkarcher.gmxhome.de/WikiReader_Schweden.pdf ). With more and more images and large media files on the rise, we could noticeably reduce the server load by distributing these files.
How? Well, first we'd need some (5 to 10?) mirrors willing to sync their data on a daily basis. Once these servers are up and running, we could implement simple, PHP-driven load balancing by changing the image (and media file) source locations dynamically. (Script-driven as opposed to server-driven, because we'd have to check whether the file was added within the last 24 hours. You wouldn't want to wait a whole day for your newly added image to appear in your article, would you?)
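To make the idea concrete, here is a rough sketch of the selection logic, written in Python rather than PHP just to keep it short. The mirror host names, the function names, and the use of a hash to pick a mirror are all assumptions for illustration; only the 24-hour check and the daily sync come from the proposal above.

```python
import hashlib
from datetime import datetime, timedelta, timezone

# Hypothetical mirror hosts; none of these exist, they are placeholders.
MIRRORS = ["mirror1.example.org", "mirror2.example.org", "mirror3.example.org"]
ORIGIN = "de.wikipedia.org"
SYNC_INTERVAL = timedelta(hours=24)  # mirrors are assumed to sync once a day


def pick_host(path, uploaded_at, now):
    """Return the host that should serve the file at `path`."""
    # A file younger than one sync interval may not have reached the
    # mirrors yet, so it is served from the origin server.
    if now - uploaded_at < SYNC_INTERVAL:
        return ORIGIN
    # Hash the path so that the same file always maps to the same mirror.
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return MIRRORS[digest[0] % len(MIRRORS)]


def image_url(path, uploaded_at, now=None):
    """Build the URL to put into the image or media link."""
    if now is None:
        now = datetime.now(timezone.utc)
    return "http://" + pick_host(path, uploaded_at, now) + path
```

Mapping each file to a fixed mirror (instead of choosing one at random) would keep the URL stable between page views, so browsers and proxies can still cache it. Note that the script needs the upload time of every file it links to.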
What do you think about it?
Thomas
T_Karcher@gmx.de wrote:
With more and more images and large media files on the rise, we could noticeably reduce the server load by distributing these files.
How? Well, first we'd need some (5 to 10?) mirrors willing to sync their data on a daily basis.
I thought we had discussed this before. It was decided that it would make way more sense to use a ready-made solution (Akamai) than to make one's own, but it was also decided that a change to the software would first be necessary. Currently, a tag like this:
[[Image:File_name.jpg]]
produces an <img> tag directly from just the filename:
<img src="/upload/archive/e/2/File_name.jpg">
where the "e" and the "2" depend only on the filename.
The problem with this is that these paths do not change if the image changes. This means that after someone overwrites an image by uploading a new version to Wikipedia, Akamai would keep serving the old version until its copy is updated.
Hence, a change to the source code is necessary that makes the file path depend not just on the filename, but also on either the file itself or the time it was uploaded. However, this increases the load on the database, because it means an extra query for every pageview that contains an image.
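A small sketch of the difference, in Python for brevity. The directory layout, the use of MD5 to derive the two directory characters, and the timestamp format are assumptions for illustration only; the real code may well derive them differently.

```python
import hashlib


def current_path(filename):
    """Path that depends only on the filename (today's behaviour)."""
    # Assumption: the two directory characters come from an MD5 hash
    # of the filename. All that matters here is that they depend on
    # nothing but the name.
    h = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return "/upload/" + h[0] + "/" + h[1] + "/" + filename


def versioned_path(filename, uploaded_at):
    """Path that also depends on the upload time (the proposed change).

    `uploaded_at` is a timestamp string such as "20040110120000"
    (hypothetical format). It has to be looked up for each image,
    which is the extra database query mentioned above.
    """
    h = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return "/upload/" + h[0] + "/" + h[1] + "/" + uploaded_at + "/" + filename
```

With current_path, a re-upload of File_name.jpg yields the same URL, so a cache cannot tell that the file changed. With versioned_path, every upload gets a new URL, and the cache simply fetches it as a new file.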
Greetings, Timwi
wikitech-l@lists.wikimedia.org