Hi all. On the Arabic Wikipedia (as on other Wikipedias) we suffer from poor speed/performance and a feature-stripped MediaWiki (search disabled, and so on). Anyhow, one of the members of the Arabic Wikipedia contacted me, telling me that he can get server hardware and cover the yearly costs for bandwidth and placement in a data center, but he wants that server/bandwidth to be dedicated to the Arabic Wikipedia.
I wanted to know whether that is "technically" possible. Of course, everything would remain under the Wikimedia Foundation umbrella, and full access to the server would be only for the MediaWiki developers in charge of such things.
If that is absolutely not possible, would it be possible to have a new server added to the existing server farm, with its cost covered, but with all the nice features enabled for the Arabic Wikipedia (after all, we have only 650 articles or so) and a certain priority given to the Arabic Wikipedia, while the server otherwise does what the other servers do?
The current performance/speed of the Arabic Wikipedia (and other Wikipedias) is not very good, and sooner or later language-limited donations will come and will have to be dealt with. After all, full control is, and will always be, in the hands of the MediaWiki developers/admins; the only difference is that the server layer would be organized differently, in a way that allows resources to be dedicated to particular Wikipedias.
I hope I will get a clear response, as my previous attempts to get one did not succeed.
Yours,
Isam Bayazidi
Arabic Wikipedia
Amman - Jordan
To reassure you, the French Wikipedia (and all Wikipedias, I think) is also running a "feature-stripped MediaWiki" ;o)
I really hope all those features (search, SQL access, maintenance pages, etc.) will be re-enabled when the new servers arrive (when?).
By the way, a friend tells me that Google (a read-only site) started with about 5000 machines (and now has more than 10 times that). What is our near-/far-future plan?
Aoineko
----- Original Message ----- From: "Isam Bayazidi" isam@bayazidi.net To: wikipedia-l@Wikimedia.org Sent: Saturday, May 08, 2004 7:57 AM Subject: [Wikipedia-l] language limited funding/donation
On Saturday 08 May 2004 04:46, Guillaume Blanchard wrote:
By the way, a friend tells me that Google (a read-only site) started with about 5000 machines (and now has more than 10 times that). What is our near-/far-future plan?
Wikipedia's model is not a commercial one, and I assure you that if a "Powered by HP or IBM" were possible then we would get loads of servers, as Isam said.
I like the fact that Wikipedia is banner-free and I hope it will stay this way. That means Wikipedia will depend on the support of believers in the project.
Regards,
On Sat, May 08, 2004 at 10:46:31AM +0900, Guillaume Blanchard wrote:
By the way, a friend tells me that Google (a read-only site) started with about 5000 machines (and now has more than 10 times that). What is our near-/far-future plan?
Google is NOT read-only; after all, it crawls the whole world wide web. A search engine is naturally a quite different and much more distributable thing (see Grub) than a database-centric wiki.
ciao, tom
Hello!
On Sat, 8 May 2004 10:30:43 +0200, Thomas R. Koll wrote:
Google is NOT read-only; after all, it crawls the whole world wide web. A search engine is naturally a quite different and much more distributable thing (see Grub) than a database-centric wiki.
There is a company named Akamai out there, and as far as I know they distribute websites like CNN and Microsoft, which (especially the first) are updated frequently and have some feedback features etc. So theoretically it might be possible to use distributed servers for the WP. And in the long term it might be a good idea, too; all the Wikipedias are growing fast and have users from all over the world.
However, my guess is also that this would require *a lot* of rewriting of the software, so at best this can be a long-term option. But not one I'd dismiss right away.
Greetings from Cologne, Alex
Alex Regh wrote:
However, my guess is also that this would require *a lot* of rewriting of the software, so at best this can be a long-term option. But not one I'd dismiss right away.
Do I understand from this that the MediaWiki software has a limitation in that regard? I thought that there is already some distribution of the servers between the English and non-English Wikipedias, as sometimes English works while other languages don't, and at other times the non-English ones work fine while English does not.
Do all Wikipedias currently share the same resources with equal priorities? Or are there already certain setups made for large Wikipedias such as English and German?
Yours Isam Bayazidi
Isam Bayazidi wrote:
Do I understand from this that the MediaWiki software has a limitation in that regard? I thought that there is already some distribution of the servers between the English and non-English Wikipedias, as sometimes English works while other languages don't, and at other times the non-English ones work fine while English does not.
Do all Wikipedias currently share the same resources with equal priorities? Or are there already certain setups made for large Wikipedias such as English and German?
Everything shares the same servers. The only partial exception to this at present is the front-end squid cache machines, which are roughly divided between English Wikipedia (plus a couple of small wikis) and Everything Else (including the couple of small wikis on the other machine). Sometimes these are served by the same machine during maintenance, or which machine is which may swap.
The only reason there's a split right now is that there is something weird with our DNS service that doesn't let us declare both IP addresses for the *.wikipedia.org hosts. This should change when we move over to our own DNS server, which was supposed to happen months ago but is waiting on... something? I'm not sure of the details. I think the secondary DNS backup server has to get set up somewhere.
-- brion vibber (brion @ pobox.com)
Isam Bayazidi wrote:
Alex Regh wrote:
However, my guess is also that this would require *a lot* of rewriting of the software, so at best this can be a long-term option. But not one I'd dismiss right away.
Do I understand from this that the Mediawiki software have a limitation in that regard ?
Technically, we could set up every wiki on a separate server of its own. However this would be more difficult to administer, and would require 300 servers in its strictest interpretation. ;) Since our resources are limited, we want to provide as much as we can to all wikis rather than favoring some, though generally we've put a lot more restrictions on the English Wikipedia because it's the biggest and takes the most resources.
I don't know any details about Akamai (which is what Alex was specifically talking about), but I believe it's some sort of distributed caching system. Our biggest problems are on the database and apache servers that do the work of dynamic page generation; more front-end caches would be less helpful.
-- brion vibber (brion @ pobox.com)
Hi!
On Sat, 08 May 2004 04:05:14 -0700, Brion Vibber wrote:
I don't know any details about Akamai (which is what Alex was specifically talking about), but I believe it's some sort of distributed caching system. Our biggest problems are on the database and apache servers that do the work of dynamic page generation; more front-end caches would be less helpful.
I don't know how they work, but I guess they have to have some mechanism for the feedback, too. A distributed cache for reading would probably be helpful already, and in the long term it might make sense to distribute the databases, too. After all, why should Chinese and Arabic (for example) entries have to cross an ocean to get edited? Those WPs are probably read there a lot more than in America or Europe, too.
On the other hand, these are very long-term considerations; right now we should concentrate on getting the hardware we have and/or will soon get up and running.
Greetings from Cologne Alex
On Sat, 2004-05-08 at 14:47 +0200, Alex Regh wrote:
I don't know how they work, but I guess they have to have some mechanism for the feedback, too. A distributed cache for reading would probably be helpful already, and in the long term it might make sense to distribute the databases, too. After all, why should Chinese and Arabic (for example) entries have to cross an ocean to get edited? Those WPs are probably read there a lot more than in America or Europe, too.
More caching is planned for logged-in users, and for caching parts of a page using ESI (http://www.esi.org), which is the main technology used and partly developed by Akamai. Squid3 supports it, and the new XHTML is structured to simplify this; see the HTML comments at http://test.wikipedia.org for some info on this.
There's still heaps of work to be done to ESI-enable MediaWiki:
* providing an efficient way to do the partial views
* purging parts
* maybe translating templates to ESI snippets as well, to make them easily purgeable
* Squid3 stability testing with ESI/epoll
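To give a rough idea of what ESI-enabling means, here is an illustrative Python-flavoured sketch only; the fragment URLs and the purge call are invented for the example, not the real MediaWiki/Squid3 interface:

    # Illustration: the wiki emits a page *skeleton* with <esi:include>
    # placeholders, and Squid3 fetches and caches each fragment separately,
    # so changing one part only requires purging that fragment.

    def render_skeleton(title):
        # Fragment URLs below are made up for the example.
        return ('<html><body>\n'
                '  <esi:include src="/esi/sitenotice"/>\n'
                '  <esi:include src="/esi/article/{0}"/>\n'
                '  <esi:include src="/esi/footer"/>\n'
                '</body></html>').format(title)

    def on_sitenotice_change(cache):
        # Purge just the one fragment instead of every page that embeds it.
        cache.purge('/esi/sitenotice')   # hypothetical purge API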
Anybody wishing to help is welcome, of course; the more hands, the quicker this will happen.
Alex Regh wrote:
After all, why should Chinese and Arabic (for example) entries have to cross an ocean to get edited? Those WPs are probably read there a lot more than in America or Europe, too.
I don't understand this opposition to the ocean? ... It doesn't make it any slower, you know.
Timwi
Hi!
On Mon, 10 May 2004 01:46:16 +0100, Timwi wrote:
I don't understand this opposition to the ocean? ... It doesn't make it any slower, you know.
Depends a lot on the connection; some servers in the US are quite a lot slower from Europe, or from some ISPs in Europe, and my guess is that this is not all that different in other parts of the world.
However, I consider this a long-term question, anyway, because right now it's not the lines that are the bottleneck.
Greetings from Cologne Alex
Alex Regh wrote:
On Mon, 10 May 2004 01:46:16 +0100, Timwi wrote:
I don't understand this opposition to the ocean? ... It doesn't make it any slower, you know.
Depends a lot on the connection; some servers in the US are quite a lot slower from Europe, or from some ISPs in Europe, and my guess is that this is not all that different in other parts of the world.
I do not have the impression that European servers tend to be faster for me than American (or Asian or Australian) servers (haven't tried many African ones...). The number of hops is approximately the same no matter what. The only servers that are noticeably faster for me are those within the University, but that should be obvious.
Additionally, notice that Google seems to be entirely US-based too (judging from traceroutes). google.co.jp, google.de, google.fr, etc. all seem to have the same rotating set of IPs. And I'm pretty sure they have quite an order of magnitude more traffic than Wikipedia. :)
Timwi
Alex Regh wrote:
Hello!
On Sat, 8 May 2004 10:30:43 +0200, Thomas R. Koll wrote:
Google is NOT read-only; after all, it crawls the whole world wide web. A search engine is naturally a quite different and much more distributable thing (see Grub) than a database-centric wiki.
There is a company named Akamai out there, and as far as I know they distribute websites like CNN and Microsoft
and (part of) LiveJournal ;-)
, which (especially the first) are updated frequently and have some feedback features etc.
I can't claim any expertise on this, but as far as I understood it, Akamai specialises in serving static content. In the case of LiveJournal, this is mainly the user pictures - if you delete one and upload a new one, it will get a new URL, but if you edit an entry, it still has the same URL, so LiveJournal entries are saved on the central DB and not on Akamai.
On Wikipedia nothing is immutable under a given URL (not even images), except perhaps for small things like the stylesheet, but they are not worth it.
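A tiny sketch of the difference, with hypothetical URL schemes just to illustrate why the LiveJournal pictures cache well and wiki pages don't:

    import hashlib

    # Content-addressed: the URL changes whenever the bytes change, so an
    # edge cache like Akamai can keep a copy forever (LiveJournal-picture style).
    def immutable_url(data):
        return '/static/' + hashlib.sha1(data).hexdigest() + '.png'

    # Mutable: the URL stays the same across edits (wiki-article style), so
    # every view must either reach the origin or be explicitly purged on edit.
    def mutable_url(title):
        return '/wiki/' + title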
So theoretically it might be possible to use distributed servers for the WP.
No, I don't think that would be possible.
And in the long term it might be a good idea, too; all the Wikipedias are growing fast, and have users from all over the world.
Fast growth and international users are both no reason to employ distributed servers. They are, however, a reason to employ new and good hardware and to improve/optimise the software. :-)
Timwi
Timwi wrote:
Alex Regh wrote:
<snip>
There is a company named Akamai out there, and as far as I know they distribute websites like CNN and Microsoft
and (part of) LiveJournal ;-)
, which (especially the first) are updated frequently and have some feedback features etc.
I can't claim any expertise on this, but as far as I understood it, Akamai specialises in serving static content. In the case of LiveJournal, this is mainly the user pictures - if you delete one and upload a new one, it will get a new URL, but if you edit an entry, it still has the same URL, so LiveJournal entries are saved on the central DB and not on Akamai.
On Wikipedia nothing is immutable under a given URL (not even images), except perhaps for small things like the stylesheet, but they are not worth it.
<snip>
Hello,
Akamai owns servers in almost every single ISP's hosting rooms. A customer (Yahoo, Amazon, CNN ...) gives Akamai the content they want to be served, mainly pictures and video, and Akamai puts it on ALL of their servers.
Then the customer needs to change its image/video domain URLs to something like d304-a5ea-custname.akadns.net, or a custom domain name. When someone wants the picture, the DNS entry will be resolved by the Akamai DNS servers to the nearest available server. As Akamai has servers everywhere, most of the time the user will download heavy content from a server directly at their ISP instead of from overseas / far away.
This service does cost A LOT of money, and I don't think that's the point right now.
A simple example: pics.ebaystatic.com is a CNAME to a1654.g.akamai.net, which resolves for me to 193.45.10.72 (hosted at Telia, the main peer of my ISP) but resolves to 212.155.193.143 (hosted at UUNET FR) for the UUNET DNS server. So I download eBay pictures from Telia, while UUNET France customers download pictures from UUNET :o)
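If you want to see this from your own connection, a few lines of Python with the standard library will do; the hostname is just the example above, and the answer you get depends entirely on which resolver you ask:

    import socket

    # Resolve an Akamaized hostname; the canonical name and the returned
    # addresses depend on your local resolver, which is how users get routed
    # to a nearby edge server.
    name, aliases, addresses = socket.gethostbyname_ex('pics.ebaystatic.com')
    print('canonical name:', name)    # typically something under akamai.net
    print('aliases:', aliases)
    print('addresses:', addresses)    # will differ from ISP to ISP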
Sorry if it was too technical.
cheers,
Ashar Voultoiz wrote:
Akamai owns servers in almost every single ISP's hosting rooms. A customer (Yahoo, Amazon, CNN ...) gives Akamai the content they want to be served, mainly pictures and video, and Akamai puts it on ALL of their servers.
[etc.etc.etc.]
I know all of that. You missed the point. The question was whether Akamai can be used to serve constantly-changing dynamic content, not how Akamai works.
Timwi
Isam Bayazidi wrote:
On the Arabic Wikipedia (as on other Wikipedias) we suffer from poor speed/performance and a feature-stripped MediaWiki (search disabled, and so on).
It is my belief that the speed/performance/features for all the mediawikis are currently equivalent. The only exception to that may be some latency due to sheer distance and the speed of light around the globe, but I believe that this is relatively minor.
(However, I do suppose that if a particular nation or area is not well-connected to the net, a local squid would boost them a fair amount.)
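(For a rough sense of how much sheer distance costs, here is a back-of-the-envelope estimate, assuming roughly 20,000 km of fibre one way and light travelling at about two thirds of c in glass; the numbers are only ballpark:)

    # Ballpark propagation delay, ignoring routing and queueing.
    distance_km = 20000        # roughly half way around the globe
    speed_km_per_s = 200000    # light in fibre, about 2/3 of c
    one_way = distance_km / float(speed_km_per_s)
    print('%d ms one way, %d ms round trip' % (one_way * 1000, 2 * one_way * 1000))
    # -> 100 ms one way, 200 ms round trip: noticeable, but small compared
    #    with the seconds an overloaded database or apache can add.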
The best possibility, technologically, to do what your potential donor wants to do, is to host a squid in your area, on an experimental basis perhaps, so that we can see if it makes a positive difference.
The current performance/speed of the Arabic Wikipedia (and other Wikipedias) is not very good, and sooner or later language-limited donations will come and will have to be dealt with.
I think that this is likely, i.e. that there will be language-limited donation offers. And we do need to discuss what to do about that.
We are a global project, and I would like to encourage donors to think globally as well. From the point of view of global efficiency, then for the most part, language-specific donations aren't going to help much -- the most efficient architecture is one which clusters, not one which divides.
At the same time, of course, I do recognize that donors care about what *they* care about, and frequently of course this would be their own language wikipedia. That's not a bad thing, it's only natural. And I don't want to discourage donors who want to help -- particularly not if the alternative is to have them fund a fork out of frustration.
THIS PART IS IMPORTANT -- $20,000 of new hardware is ordered and will be here within 2 weeks or so. This includes a more powerful database server with faster CPUs and faster hard drives. Our existing database server will still be here, so we will go from one strong machine to two strong machines.
Additionally, I've ordered 4 1U servers which will be adapted and customized for roles as apaches, squids, fileservers, whatever we need. I also hope that 1 or 2 of them will be sitting idle as "universal hot spares" so that we can quickly configure them to take on a role while we work on any other machines that are down.
--Jimbo
Jimmy-
At the same time, of course, I do recognize that donors care about what *they* care about, and frequently of course this would be their own language wikipedia.
We may solve this problem simply by requiring local branches to share all proceeds with the mother organization, 50/50.
THIS PART IS IMPORTANT -- $20,000 of new hardware is ordered and will be here within 2 weeks or so.
That sounds good! Could you send an update to Mav about our current state of funds?
Also, are we still pursuing a refund or replacement for Geoffrin, or has that already been addressed?
Regards,
Erik
Jimmy Wales wrote:
(However, I do suppose that if a particular nation or area is not well-connected to the net, a local squid would boost them a fair amount.)
The best possibility, technologically, to do what your potential donor wants to do, is to host a squid in your area, on an experimental basis perhaps, so that we can see if it makes a positive difference.
Squid will help when there is more reading than writing, as for the Wikipedia. For small wikis, there is probably more writing than reading; with editing and writing, Squid won't have much effect, I guess (unless I don't fully understand the purpose of Squid).
At the same time, of course, I do recognize that donors care about what *they* care about, and frequently of course this would be their own language wikipedia. That's not a bad thing, it's only natural. And I don't want to discourage donors who want to help -- particularly not if the alternative is to have them fund a fork out of frustration.
If it is technically possible, and the funding covers all the costs, it would be great to open such a door, as Wikipedians would start to look for local donors who may be interested in supporting that language's Wikipedia, and this would probably also help make the "global" resources and donations more effective in terms of resource sharing.
THIS PART IS IMPORTANT -- $20,000 of new hardware is ordered and will be here within 2 weeks or so. This includes a more powerful database server with faster CPUs and faster hard drives. Our existing database server will still be here, so we will go from one strong machine to two strong machines.
Additionally, I've ordered 4 1U servers which will be adapted and customized for roles as apaches, squids, fileservers, whatever we need. I also hope that 1 or 2 of them will be sitting idle as "universal hot spares" so that we can quickly configure them to take on a role while we work on any other machines that are down.
I am no expert here, but in simple language, how much of a speed boost will Wikipedia get? 10%? 50%? More? And will those additions allow things like "Search" to be enabled again, or is "Search" so resource-hungry that we will not see it back any time soon?
Yours Isam Bayazidi
On Sat, 2004-05-08 at 23:04 +0300, Isam Bayazidi wrote:
Squid will help when there is more reading than writing, as for the Wikipedia. For small wikis, there is probably more writing than reading; with editing and writing, Squid won't have much effect, I guess (unless I don't fully understand the purpose of Squid).
There's always more reading than writing, but only anonymous requests are cached currently; it wouldn't help for editing right now.
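Very roughly, the decision looks something like this (a simplified sketch only; the cookie names are invented, not the real MediaWiki/Squid configuration):

    # Simplified illustration of "only anonymous requests are cached".
    def cacheable(method, cookies):
        if method != 'GET':
            return False   # edits and other POSTs always go to the origin
        if 'session' in cookies or 'UserID' in cookies:
            return False   # logged-in users get personalised, uncached pages
        return True        # anonymous page views can be served straight from squid

    print(cacheable('GET', {}))                 # True  - squid can answer
    print(cacheable('GET', {'UserID': '42'}))   # False - goes to apache/DB
    print(cacheable('POST', {}))                # False - an edit, never cached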
At the same time, of course, I do recognize that donors care about what *they* care about, and frequently of course this would be their own language wikipedia. That's not a bad thing, it's only natural. And I don't want to discourage donors who want to help -- particularly not if the alternative is to have them fund a fork out of frustration.
If it is technically possible, and the funding covers all the costs, it would be great to open such a door, as Wikipedians would start to look for local donors who may be interested in supporting that language's Wikipedia, and this would probably also help make the "global" resources and donations more effective in terms of resource sharing.
An additional apache or (even more useful) another DB would help of course.
However, I think that there's a lot of potential left in caching (roughly 1/3 of all requests still aren't cached, while the number of edits should be something like 2% of all requests). See my earlier post about this. Something like sponsored bounties for bigger tasks could motivate developers to spend more time on these things.
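Taking those rough numbers at face value, a quick back-of-the-envelope calculation shows how much headroom is left; the figures are only the estimates above, nothing measured:

    # Back-of-the-envelope, using the rough figures above.
    uncached = 1.0 / 3     # share of requests still hitting the apaches
    edits = 0.02           # share of requests that are edits and must hit the origin
    print('still cacheable: about %.0f%% of all requests' % ((uncached - edits) * 100))
    print('best-case backend reduction: about %.0fx' % (uncached / edits))
    # -> roughly 31% of requests could in principle still be moved to the caches,
    #    so backend traffic could in theory drop from ~33% to ~2% of requests.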
Also, the DB server needs a decent 64-bit kernel to use the full RAM, something that's impossible to do without a second DB.
I am no expert here, but in simple language, how much of a speed boost will Wikipedia get? 10%? 50%? More? And will those additions allow things like "Search" to be enabled again, or is "Search" so resource-hungry that we will not see it back any time soon?
It's expensive for the DB; it might be OK to enable it again when both DBs are set up, although sooner or later the growth will eat that performance as well. It seems to be a bit like highways: the more you build, the more are needed.