That will be good enough for some testing. Hopefully squids can be configured to understand that http://fr.paris.wikipedia.org/ is indeed http://fr.wikipedia.org/
French squids can be configured to use Florida squids as round-robin parents.
Anyway, I might volunteer to set the whole squid thing up. We're dealing with recursive squid topologies here and there, so it would be quite interesting to do that for France as well.
Domas
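[For illustration, a minimal squid.conf sketch of the round-robin parent setup Domas describes, using hypothetical hostnames in place of the Florida squids; this is not the actual Wikimedia configuration:]

# French squid: fetch cache misses from the Florida squids in round-robin order.
cache_peer sq1.florida.example.org parent 80 0 no-query round-robin
cache_peer sq2.florida.example.org parent 80 0 no-query round-robin
# Never contact the origin servers directly; always go through a parent.
never_direct allow all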
On Wed, 2005-01-05 at 08:44 +0200, Domas Mituzas wrote:
That will be good enough for some testing. Hopefully squids can be configured to understand that http://fr.paris.wikipedia.org/ is indeed http://fr.wikipedia.org/
French squids can be configured to use Florida squids as round-robin parents.
They should be configured that way already, though two of the current squids are missing. The purge is just more IPs in the array. If it's not yet done I can help with the setup this weekend.
Gabriel Wicke wrote in gmane.science.linguistics.wikipedia.technical at Friday 07 Jan 2005 07:44:
On Wed, 2005-01-05 at 08:44 +0200, Domas Mituzas wrote:
That will be good enough for some testing. Hopefully squids can be configured to understand that http://fr.paris.wikipedia.org/ is indeed http://fr.wikipedia.org/
French squids can be configured to use Florida squids as round-robin parents.
They should be configured that way already, though two of the current squids are missing. The purge is just more IPs in the array. If it's not yet done I can help with the setup this weekend.
I'm not sure this is the most scalable way to do it. I spoke with Domas about this briefly on IRC, and the best solution seems to be some kind of purge daemon which can handle purging without putting more strain on the apaches; or, failing that, some kind of recursive purge which would allow the Florida squids to purge the French squids themselves.
Doing it via MediaWiki is likely to cause slow edits & other problems if there are network issues between the two sites...
Kate.
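[For illustration, one possible shape of the purge daemon Kate mentions. This is a hypothetical PHP sketch, not an existing component; the port and squid addresses are invented. MediaWiki would write one URL per line to the daemon's local port, and the daemon would relay a PURGE request to every squid, so edits only pay for one fast local write:]

<?php
$squids = array( '10.0.0.1', '10.0.0.2' );   // hypothetical squid addresses
$server = stream_socket_server( 'tcp://127.0.0.1:4827', $errno, $errstr );
if ( !$server ) {
    die( "Cannot listen: $errstr ($errno)\n" );
}
while ( true ) {
    $conn = @stream_socket_accept( $server, 3600 );
    if ( !$conn ) {
        continue;   // accept timed out, keep waiting
    }
    // One URL per line from the client (e.g. MediaWiki).
    while ( ( $line = fgets( $conn ) ) !== false ) {
        $bits = parse_url( trim( $line ) );
        $path = $bits['path'] . ( isset( $bits['query'] ) ? '?' . $bits['query'] : '' );
        foreach ( $squids as $ip ) {
            // Short connect timeout so one dead squid can't stall the rest.
            $sock = @fsockopen( $ip, 80, $errno, $errstr, 1 );
            if ( $sock ) {
                fwrite( $sock, "PURGE $path HTTP/1.0\r\nHost: {$bits['host']}\r\n\r\n" );
                fclose( $sock );
            }
        }
    }
    fclose( $conn );
}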
Kate,
the problem is timing: the purge has to happen before the edited page is returned, otherwise the user might not see the new version.
The purge code writes to a socket and won't read from it until the next round, so with more squids in the list each squid has a tiny bit more time to process the request. If no socket can be established in the first place (a network problem, for example), the squid is taken out of the list for the main part of the function. The most important thing is that more squids increase the time needed for the purge only marginally. The code that reads from the socket:
while ( strlen( $res ) < 100 && $esc < 200 ) {
    // Poll the socket: stop once we have 100 bytes of reply,
    // or after 200 attempts of 20 microseconds each.
    $res .= @fread( $sockets[$s], 512 );
    $esc++;
    usleep( 20 );
}
This means that it won't try to read from the socket forever, and the waiting adds up to 0.004 seconds. I'm not sure about fread's timeout; IIRC it's fairly low if it waits at all. Come to think of it, usleep should probably wait a bit longer in fewer cycles; I'm not sure why I picked it that low. I think it also consumes some CPU time in PHP. If the server replies more slowly than one round of writing to the sockets, we usually get its replies in the next round (we read up to 512 bytes, while the normal header returned is about 150 bytes). The network buffer would allow us to do about 80 purges without any reading, so there's enough room for the usual two purges per edit (the page and its history).
Additional pages (depending on the changed one) are purged after returning the page to the user (deferred update).
A while ago we had the German squid (really cheap hoster, it was more down than up...) on the purge list without it causing any problems.
Overall I believe more squids aren't much of a problem, and if slow return times become a problem we can still decrease the number of times it tries to read from the socket (worth a try anyway).
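[For illustration, a simplified PHP sketch of the write-then-read purge round described above. This is not the actual MediaWiki code; purgeSquids() and the example host and URLs are invented:]

<?php
function purgeSquids( $squidServers, $urls ) {
    $sockets = array();
    foreach ( $squidServers as $server ) {
        // 1 second connect timeout; unreachable squids are skipped,
        // matching "taken out of the list" above.
        $sock = @fsockopen( $server, 80, $errno, $errstr, 1 );
        if ( $sock ) {
            stream_set_blocking( $sock, false );
            $sockets[] = $sock;
        }
    }
    // Round 1: write a PURGE request for every URL to every squid.
    foreach ( $sockets as $sock ) {
        foreach ( $urls as $url ) {
            $bits = parse_url( $url );
            $path = $bits['path'] . ( isset( $bits['query'] ) ? '?' . $bits['query'] : '' );
            fwrite( $sock, "PURGE $path HTTP/1.0\r\nHost: {$bits['host']}\r\n\r\n" );
        }
    }
    // Round 2: drain the replies with the bounded loop quoted above:
    // at most 200 polls of 20 microseconds per squid.
    foreach ( $sockets as $sock ) {
        $res = '';
        $esc = 0;
        while ( strlen( $res ) < 100 && $esc < 200 ) {
            $res .= @fread( $sock, 512 );
            $esc++;
            usleep( 20 );
        }
        fclose( $sock );
    }
}

// The usual two purges per edit: the page itself and its history view.
purgeSquids( array( 'sq1.example.org' ), array(
    'http://fr.wikipedia.org/wiki/Paris',
    'http://fr.wikipedia.org/w/index.php?title=Paris&action=history'
) );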
On Fri, 7 Jan 2005, Gabriel Wicke wrote:
The code that reads from the socket:
while ( strlen( $res ) < 100 && $esc < 200 ) {
    // Poll the socket: stop once we have 100 bytes of reply,
    // or after 200 attempts of 20 microseconds each.
    $res .= @fread( $sockets[$s], 512 );
    $esc++;
    usleep( 20 );
}
This means that it won't try to read from the socket forever, and the waiting adds up to 0.004 seconds. I'm not sure about fread's timeout; IIRC it's fairly low if it waits at all. Come to think of it, usleep should probably wait a bit longer in fewer cycles; I'm not sure why I picked it that low. I think it also consumes some CPU time in PHP.
In my experience, usleep() cannot be shorter than a single time slice from the Linux scheduler, which for 2.6 kernels should be 1 millisecond.
So that code is waiting up to 200 milliseconds, with negligible CPU overhead, unless of course PHP does things differently. But to sleep less than a timeslice you need to busy sleep, so the CPU would be taxed at 100% in that case.
Alfio
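[Alfio's point is easy to check empirically. A minimal PHP sketch that times 200 calls of usleep(20), the same parameters as the purge loop:]

<?php
$iterations = 200;
$start = microtime( true );
for ( $i = 0; $i < $iterations; $i++ ) {
    usleep( 20 );
}
$elapsed = microtime( true ) - $start;
// On a 2.6 kernel, expect roughly a millisecond per call rather than 20us,
// so the loop's worst case is about 200 ms, not 0.004 s.
printf( "%d calls took %.4f s, %.0f us per call\n",
    $iterations, $elapsed, 1e6 * $elapsed / $iterations );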
On Sat, 2005-01-08 at 00:37 +0100, Alfio Puglisi wrote:
In my experience, usleep() cannot be shorter than a single time slice from the Linux scheduler, which for 2.6 kernels should be 1 millisecond.
So that code is waiting up to 200 milliseconds, with negligible CPU overhead, unless of course PHP does things differently. But to sleep less than a timeslice you need to busy sleep, so the CPU would be taxed at 100% in that case.
Alfio,
thanks for clearing that up. I set up the squids Friday evening; purges have been enabled to all three French machines since then without making a difference. Right now the first real traffic is starting to hit them, with DNS managed by Mark using GeoDNS (it sends fr.wikipedia.org visitors to the French squids if they have a French IP).
Stats: http://bleuenn.wikimedia.org:8080/ganglia/?c=Paris%20cluster&m=&r=hour&s=descending&hc=4
On Sun, 9 Jan 2005, Gabriel Wicke wrote:
thanks for clearing that up. I set up the squids Friday evening; purges have been enabled to all three French machines since then without making a difference. Right now the first real traffic is starting to hit them, with DNS managed by Mark using GeoDNS (it sends fr.wikipedia.org visitors to the French squids if they have a French IP).
Stats: http://bleuenn.wikimedia.org:8080/ganglia/?c=Paris%20cluster&m=&r=hour&s=descending&hc=4
Great. What's the plan for those squids? Will they only be used for fr.wikipedia or also for de, it, pl and other European languages?
Alfio
On Sun, 2005-01-09 at 21:27 +0100, Alfio Puglisi wrote:
Great. What's the plan for those squids? Will they only be used for fr.wikipedia or also for de, it, pl and other European languages?
They are getting de as well currently; Kate and Mark are testing it. #mediawiki is the place to follow the progress as it happens ;-)
Gabriel Wicke wrote in gmane.science.linguistics.wikipedia.technical:
On Sun, 2005-01-09 at 21:27 +0100, Alfio Puglisi wrote:
Great. What's the plan for those squids? Will they only be used for fr.wikipedia or also for de, it, pl and other European languages?
They are getting de as well currently; Kate and Mark are testing it. #mediawiki is the place to follow the progress as it happens ;-)
The French squids are currently serving fr.wikt, fr.wp and de.wp for French and German IPs. The DNS for upload.wm and commons.wm has been changed, but due to a longer TTL (1 day) not much load from there is hitting them yet.
CPU load at the moment is around 80-90%, so it seems unlikely we'll be able to put en: there while maintaining a reasonable response time, but it may be feasible to move some of the other European languages. Current network usage is about 600 Kbytes/sec, so that doesn't seem to be an issue; however, a configuration issue with Ganglia makes accurate monitoring a little harder.
The DNS is currently hosted offsite by blitzed.org, but will move to our servers once we have it set up and working. Since we're down one squid in Florida, I wanted to move some traffic ASAP without having to set things up here first.
Kate.
(Update: We're about to move nl over, and redirect traffic from .be and .ch to the French servers.)
Alfio Puglisi wrote:
On Sun, 9 Jan 2005, Gabriel Wicke wrote:
thanks for clearing that up. I set up the squids Friday evening; purges have been enabled to all three French machines since then without making a difference. Right now the first real traffic is starting to hit them, with DNS managed by Mark using GeoDNS (it sends fr.wikipedia.org visitors to the French squids if they have a French IP).
Stats: http://bleuenn.wikimedia.org:8080/ganglia/?c=Paris%20cluster&m=&r=hour&s=descending&hc=4
Great. What's the plan for those squids? Will they only be used for fr.wikipedia or also for de, it, pl and other European languages?
Alfio
I have been told that Africa, with the exception of South Africa, is better served from France than from the USA. As the African projects are small, it would be cool to serve them from France. Thanks, GerardM