On Sun, Dec 13, 2009 at 10:30 AM, Nikola Smolenski smolensk@eunet.rs wrote:
On Saturday 12 December 2009 17:41:44 jamesmikedupont@googlemail.com wrote:
On Sat, Dec 12, 2009 at 5:32 PM, Teofilo teofilowiki@gmail.com wrote:
Do we have an idea of the energy consumption related to online access to a Wikipedia article? Some people say that a few minutes of searching on a search engine costs as much energy as boiling water for a cup of tea: is that story true in the case of Wikipedia (4)?
My 2 cents: this PHP is cooking more cups of tea than an optimized program written in C would.
But think of all the coffee developers would have to cook while coding and optimizing in C!
But that is a one-off expense. That is why we programmers can earn a living: we can work on many projects. And we drink coffee while playing UrbanTerror as well.
1. PHP is very hard to optimize.
2. MediaWiki has a pretty nonstandard syntax. The best parser I have seen is the Python implementation of the wikibook parser. But given that each plugin can change the syntax as it likes, it will only get more complex.
3. Even Python is easier to optimize than PHP.
4. The other question is whether it makes sense to have such a centralized client-server architecture at all. We have been talking about using a distributed VCS for MediaWiki.
5. Even if MediaWiki were fully distributed, it would still cost CPU, but that cost would be distributed. Each edit that has to be copied causes work to be done; in a distributed system, even more work in total.
6. I have also been wondering who the beneficiary of all these millions spent on bandwidth is, and where that money goes. What about building a Wikipedia network and having the people who want to access it pay, instead of us paying to give it away? With those millions you can buy a lot of routers and cables.
7. Now, back to optimization. Let's say we were able to optimize the program: we would identify the major CPU burners and optimize them away. That still does not solve the problem, because I think the PHP program is only a small part of the issue. The real waste is that the data flows in a wasteful way, not the program itself. Even if the program were much more efficient at moving around data that is not needed, the data is still not needed.
In an optimal world this would eventually lead to updates not being broadcast at all. Not all changes have to be centralized. Let's say there is one editor who pulls the changes from the others and makes a public version; only they would need to hold all the data for that one topic. I think you could optimize Wikipedia along the lines of data travelling only to the people who need it (editors versus viewers). The first step would be a way to route edits into special-interest groups, creating smaller virtual subnetworks of the editors' machines working together in a direct peer-to-peer network, as in the sketch below.
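To make that routing idea concrete, here is a minimal sketch in plain Python. All the names (EditRouter, Editor, the "Energy" topic) are made up for illustration and have nothing to do with the actual MediaWiki code base; the point is only that an edit travels to the peers subscribed to its topic, not to everyone.

    from dataclasses import dataclass, field
    from collections import defaultdict

    @dataclass
    class Edit:
        topic: str
        author: str
        text: str

    @dataclass
    class Editor:
        name: str
        inbox: list = field(default_factory=list)

        def receive(self, edit: Edit) -> None:
            self.inbox.append(edit)

    class EditRouter:
        """Routes edits to the special-interest group for a topic,
        so viewers and uninterested editors never see the traffic."""

        def __init__(self) -> None:
            # topic -> list of subscribed editors
            self.groups = defaultdict(list)

        def subscribe(self, topic: str, editor: Editor) -> None:
            self.groups[topic].append(editor)

        def publish(self, edit: Edit) -> None:
            # Only peers working on this topic get the edit.
            for peer in self.groups[edit.topic]:
                if peer.name != edit.author:
                    peer.receive(edit)

    # Three editors form a subnetwork for one topic.
    router = EditRouter()
    alice, bob, carol = Editor("alice"), Editor("bob"), Editor("carol")
    for e in (alice, bob, carol):
        router.subscribe("Energy", e)

    router.publish(Edit("Energy", "alice", "add section on data centres"))
    assert len(bob.inbox) == 1 and len(carol.inbox) == 1 and not alice.inbox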
So if you have 10 people collaborating on a topic, only the result of that work would be checked into the central server. The decentralized communication would happen between fewer parties and so reduce the resources used; see the second sketch below.
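And a companion sketch for that second point, again with invented names (CentralServer, TopicGroup): the group exchanges its intermediate edits only among its own peers and checks a single consolidated revision into the central server, so the centre sees one push instead of every edit.

    class CentralServer:
        """Stand-in for the central wiki: stores one revision per check-in."""

        def __init__(self) -> None:
            self.revisions = []

        def check_in(self, revision: str) -> None:
            self.revisions.append(revision)

    class TopicGroup:
        """A peer group of editors collaborating on one topic.
        Intermediate edits stay inside the group."""

        def __init__(self, topic: str, central: CentralServer) -> None:
            self.topic = topic
            self.central = central
            self.local_edits = []

        def edit(self, change: str) -> None:
            # Exchanged only among the group's peers, never sent to the centre.
            self.local_edits.append(change)

        def publish(self) -> None:
            # One consolidated revision reaches the central server.
            merged = "\n".join(self.local_edits)
            self.central.check_in(merged)
            self.local_edits.clear()

    central = CentralServer()
    group = TopicGroup("Energy", central)
    for change in ("draft intro", "fix figures", "copy-edit"):
        group.edit(change)        # many edits, all kept local
    group.publish()               # central traffic: a single check-in
    assert len(central.revisions) == 1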
See also: http://strategy.wikimedia.org/wiki/Proposal:A_MediaWiki_Parser_in_C
mike