Hi,
I've written a summary of my system ideas based on squid & SuperSparrow: http://www.aulinx.de/oss/code/wikipedia/
Cheers
Gabriel Wicke
On Fri, 02 Jan 2004 13:35:33 +0100, Gabriel Wicke wrote:
Update: added an apachebench run. Current performance from Germany is 5.37 requests/second (http://en.wikipedia.org/, 50 requests in 9.314 seconds, concurrency 7).
What kind of caching is done at the moment? And what are the current timeouts?
Gabriel Wicke
On Jan 3, 2004, at 13:46, Gabriel Wicke wrote:
What kind of caching is done at the moment? And what are the current timeouts?
Every page view comes in through wiki.phtml as the entry point. This runs some setup code, defines functions/classes etc, connects to the database, normalizes the page name that's been given, and checks if a login session is active, loading user data if so.
Then the database is queried to see if the page exists and whether it's a redirect, and to get the last-touched timestamp.
If the client sent an If-Modified-Since header, we compare the given time against the last-touched timestamp (which is updated for cases where link rendering would change as well as direct edits). If it hasn't changed, we return a '304 Not Modified' code. This covers about 10% of page views.
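For illustration, the check boils down to something like this (a rough sketch in Python, not the actual PHP; the function and parameter names are invented):

    from email.utils import parsedate_to_datetime

    def can_send_304(if_modified_since, page_touched):
        """Return True if a '304 Not Modified' is enough.

        if_modified_since -- raw value of the client's If-Modified-Since header, or None
        page_touched      -- UTC datetime of the page's last-touched timestamp (bumped
                             on edits and on changes that affect link rendering)
        """
        if not if_modified_since:
            return False
        try:
            client_time = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparsable header: do the full page view instead
        return page_touched <= client_time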
If it's not a redirect, we're not looking at an old revision, diff, or "printable view", and we're not logged in, the file cache kicks in. This covers some 60% of page views. If saved HTML output is found for this page, its date is checked. If it's still valid, the file is dumped out and the script exits. The cache file is a complete gzipped HTML page; if the browser doesn't advertise understanding gzip, we decompress it on the fly. (Note that this may affect benchmarks in comparison to actual browsers in use, I don't know.)
If the cached page doesn't exist or is out of date, page rendering continues as normal, and the output is compressed and saved at the end. About 2% of page views involve saving a new cached page.
There's no timeout; pages are invalidated immediately by updating their last-touched timestamps. A global cache epoch can be set on the server to invalidate all old cached pages (server- or client-side), and individual user accounts also have a cache epoch which is reset on login, when user options are changed, and when talk page notification comes on/off.
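A rough sketch of that validity test (again Python rather than the real code; the field names and helper are assumptions):

    import os
    from datetime import datetime, timezone

    def file_cache_ok(cache_path, page_touched, site_epoch, user_epoch=None):
        """The cached HTML is usable only if it is newer than the page's
        last-touched time, the global cache epoch and, for logged-in users,
        their personal cache epoch (anonymous views pass None)."""
        if not os.path.exists(cache_path):
            return False
        cached_at = datetime.fromtimestamp(os.path.getmtime(cache_path), timezone.utc)
        limits = [page_touched, site_epoch]
        if user_epoch is not None:
            limits.append(user_epoch)
        return cached_at >= max(limits)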
If this is a redirect, old page view, diff, or printable view, or if the user is logged in, then we don't do any server-side caching (yet) and parse/render the whole page. Some speedups have been accomplished by precaching link lookup info in easily-loadable chunks. E23's been working on storage of the HTML-rendered wiki pages to be inserted into the overall layout, but this needs some more finalization (various user options may affect the rendering of the page).
Ideally we'd be putting cached data into memcached, which can run in-memory on the web server (or as a distributed cache over a web server cluster) without grinding down the disks. So far we use memcached just for some common data (localized messages, utf8 translation tables, interwiki prefix lookup) and login sessions.
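For example, caching the interwiki prefix table might look roughly like this (using the python-memcached client as a stand-in for the PHP client actually in use; the key name and TTL are invented):

    import memcache  # python-memcached client, standing in for the PHP one

    mc = memcache.Client(['127.0.0.1:11211'])  # local daemon, or a list of servers

    def get_interwiki_map(load_from_db):
        """Fetch the interwiki prefix table from memcached, falling back to the DB."""
        data = mc.get('interwiki-map')
        if data is None:
            data = load_from_db()                      # slow path: one DB query
            mc.set('interwiki-map', data, time=3600)   # keep it for an hour
        return data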
-- brion vibber (brion @ pobox.com)
On Sat, 03 Jan 2004 14:43:02 -0800, Brion Vibber wrote:
Hi Brion,
thanks for this in-depth explanation! I'll add some comments regarding a possible squid integration:
Then the database is queried to see if the page exists and whether it's a redirect, and to get the last-touched timestamp.
If the client sent an If-Modified-Since header, we compare the given time against the last-touched timestamp (which is updated for cases where link rendering would change as well as direct edits).
Ah, so this would be the point to hook in for downstream cache invalidation (PURGE requests). Great news; the timeout for anonymous views could then be really high on the squids, practically static serving. Really fast...
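A purge on save could be as simple as an HTTP PURGE request to each squid, roughly like this (a sketch in Python; the host names and path are placeholders, and the squids would need an acl allowing PURGE from the web servers):

    import http.client

    SQUIDS = ['squid1.example.org', 'squid2.example.org']  # placeholder host names

    def purge(path):
        """Send an HTTP PURGE for one URL path to every squid after a page is saved."""
        for host in SQUIDS:
            conn = http.client.HTTPConnection(host, 80, timeout=5)
            conn.request('PURGE', path)
            resp = conn.getresponse()   # 200 = purged, 404 = wasn't cached
            resp.read()
            conn.close()

    # e.g. after an edit:  purge('/wiki/Main_Page')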
if the browser doesn't advertise understanding gzip, we decompress it on the fly. (Note that this may affect benchmarks in comparison to actual browsers in use, I don't know.)
Squid would cache both the compressed and the decompressed page; Vary: Accept-Encoding, Cookie does this. It will most probably cache two or three variants of the Accept-Encoding value (browsers differ in what they advertise).
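Roughly, the backend would answer a gzip-capable anonymous request with headers along these lines (illustrative values only):

    HTTP/1.0 200 OK
    Last-Modified: Sat, 03 Jan 2004 12:00:00 GMT
    Content-Type: text/html; charset=utf-8
    Content-Encoding: gzip
    Vary: Accept-Encoding, Cookie

Squid then stores one variant per distinct Accept-Encoding (and Cookie) value it sees.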
A global cache epoch can be set on the server to invalidate all old cached pages (server- or client-side)
This would need a call to a wildcard purge for all languages.
and individual user accounts also have a cache epoch which is reset on login, when user options are changed, and when talk page notification comes on/off.
No squid caching here, no action.
If this is a redirect, old page view, diff, or printable view,
All of these could most probably be cached as well. Squid caches the headers too, so the redirect does what it's supposed to do. The printable view would need to be purged along with the page; the others probably never change, though I'm not sure.
Ideally we'd be putting cached data into memcached, which can run in-memory on the web server (or as a distributed cache over a web server cluster) without grinding down the disks. So far we use memcached just for some common data (localized messages, utf8 translation tables, interwiki prefix lookup) and login sessions.
Squid uses a freely configurable memory cache by default (config options include size, max object size, and different replacement algorithms). The bigger it is, the better for the disk and the better the speed, of course.
What is the reason for sending a session cookie for anonymous users? User tracking? This collides with downstream caching (Vary: Cookie), so if this could be switched off, the last remaining issue would be solved.
Gabriel Wicke
On Jan 3, 2004, at 17:33, Gabriel Wicke wrote:
If the client sent an If-Modified-Since header, we compare the given time against the last-touched timestamp (which is updated for cases where link rendering would change as well as direct edits).
Ah, so this would be the point to hook in for downstream cache invalidation (PURGE requests). Great news; the timeout for anonymous views could then be really high on the squids, practically static serving. Really fast...
In theory, yeah, that should rock. :)
Squid would cache both the compressed and the decompressed page; Vary: Accept-Encoding, Cookie does this. It will most probably cache two or three variants of the Accept-Encoding value (browsers differ in what they advertise).
I seem to recall some trouble with Vary headers other than "Vary: User-Agent" and Internet Explorer. This can probably be worked around in some way, I hope.
What is the reason for sending a session cookie for anonymous users? User tracking? This collides with downstream caching (Vary: Cookie), so if this could be switched off, the last remaining issue would be solved.
There's no real good reason for it, just laziness. Session tracking isn't actually used outside of a login session, so it's just annoying for people who browse in with their browser set to prompt about cookies.
In theory, we should be able to switch on the session system only after either a) checking that a session cookie is present or b) establishing a login session on Special:Userlogin.
Note however that we want to be able to display talk page notifications to individual anonymous users, as well; currently this requires a hit to the server to check if the IP is marked. It may be sufficient to start up the cookie session on edit; most casual visitors may never get there, and talk notifications aren't too likely to happen to somebody who's not editing at the time.
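The decision could be sketched like this (Python pseudologic, not the actual PHP; the request object and its attributes are invented for illustration):

    def needs_session(request):
        """Only start a cookie session when it is actually useful."""
        if 'session_id' in request.cookies:             # an existing session: keep it
            return True
        if request.path.endswith('Special:Userlogin'):  # logging in
            return True
        if request.params.get('action') == 'edit':      # start tracking on edit
            return True
        return False                                    # plain anonymous view: no cookie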
-- brion vibber (brion @ pobox.com)
On Sat, 03 Jan 2004 18:50:37 -0800, Brion Vibber wrote:
I seem to recall some trouble with Vary headers other than "Vary: User-Agent" and Internet Explorer. This can probably be worked around in some way, I hope.
I've tried Vary: * before; it created trouble in some IE version (I don't recall which one). Vary: Accept-Encoding, Cookie I have tested with everything between IE 5.0 and 6.0 SP2, plus Opera and Mozilla. In Plone, the system I'm working with, the users first browse the site anonymously and then log in for editing. After editing, only the Cookie is set, URLs stay the same. It's working fine for my users; I'm not sure if one of them by chance uses IE/Mac, but I haven't got any complaints. If it didn't work for them, they wouldn't get the edit options at all ;-)
Gabriel Wicke
On Sun, 04 Jan 2004 04:08:45 +0100, Gabriel Wicke wrote:
Obviously time to go to bed...
In Plone, the system I'm working with, the users first browse the site anonymously and then log in for editing. After
login,
only the Cookie is set, URLs stay the same. It's working fine for my users; I'm not sure if one of them by chance uses IE/Mac, but I haven't got any complaints. If it didn't work for them, they wouldn't get the edit options at all ;-)
Gabriel Wicke
On Sat, 03 Jan 2004 14:43:02 -0800, Brion Vibber wrote:
Hi Brion,
The cache file is a complete gzipped HTML page; if the browser doesn't advertise understanding gzip, we decompress it on the fly. (Note that this may affect benchmarks in comparison to actual browsers in use, I don't know.)
I've just re-run the test with Accept-Encoding: gzip. It's 9.2 pages per second now!
This is over 100 requests, concurrency still 7. Definitely better, but still not in the hundreds ;-)
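For reference, that run corresponds to roughly this ApacheBench invocation:

    ab -n 100 -c 7 -H "Accept-Encoding: gzip" http://en.wikipedia.org/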
Gabriel Wicke
I'm studying these ideas in depth today, particularly all the stuff associated with Linux Virtual Server.
I've been talking a bit with the people at LiveJournal, who have a lot more traffic than we do, and some of the same ideas that Gabriel is advocating come up in that discussion, too.
In particular, I think that Gabriel's "artist's impression" of load balancer in front of squid in front of apache in front of mysql is the right thing to do.
That's even before we talk about what he calls "our own Akamai", which I think is very interesting and possibly very useful. But it's also something that we don't need to do right this second -- maybe in the summer, after we get migrated to our new setup. But the nice thing about the architecture that he's proposing is that we will be well-poised to go global.
Gabriel Wicke wrote:
Hi,
I've written a summary of my system ideas based on squid & SuperSparrow: http://www.aulinx.de/oss/code/wikipedia/
Cheers
Gabriel Wicke
On Mon, 05 Jan 2004 08:40:14 -0800, Jimmy Wales wrote:
In particular, I think that Gabriel's "artist's impression" of load balancer in front of squid in front of apache in front of mysql is the right thing to do.
Hi Jimmy,
I've looked at Squid's failover and load balancing capabilities a bit. The short story: we could do without any separate load balancer, with slightly better performance and HA.
                     Internet
                         |
               Bind DNS round robin
               (multiple A records)
          _______________|_______________
         |               |               |
       Squid           Squid           Squid      Tier 1 (connected with heartbeat)
         |_______________|_______________|
          _______________|_______________
         |               |               |
       Apache          Apache          Apache
         |_______________|_______________|
                 ________|________
                |                 |
              MySQL             MySQL
The details about the Squid setup are at http://www.aulinx.de/oss/code/wikipedia/.
This would free resources to throw on MySQL.
Gabriel Wicke
Gabriel Wicke wrote:
Bind DNS round robin (multiple A records)
How well does DNS round robin work in practice as compared to a proper load balancer?
I've never tried DNS round robin because people say it sucks. But I have no actual knowledge from first hand experience. I do know that load balancing using the tools that linuxvirtualserver.org talks about works great, and gives a very good and predictable level of control.
But yeah, there'd be no problem in waiting on that and just using squid, as per the rest of your diagram.
--Jimbo
On Tue, 06 Jan 2004 15:09:01 -0800, Jimmy Wales wrote:
Gabriel Wicke wrote:
Bind DNS round robin (multiple A records)
How well does DNS round robin work in practice as compared to a proper load balancer?
I think it's the solution with the least control over how the traffic will actually be distributed. On the other hand, the squids might not be the bottleneck anytime soon, and by that time it would probably make sense to install a second, possibly distributed squid layer. It might as well save some time, because the work to install Heartbeat would only need to be done once, on the Squids. Once a second Squid layer is installed, any separate Linux Directors are not needed anymore (and the Squids will need Heartbeat then). Except maybe for balancing the database, but that's probably a different setup altogether.
I've never tried DNS round robin because people say it sucks. But I have no actual knowledge from first hand experience.
Me neither.
I do know that load balancing using the tools that linuxvirtualserver.org talks about works great, and gives a very good and predictable level of control.
I've asked the squid guys about the load balancing setup: if ICP is disabled (Apache doesn't support it), it does a dumb round-robin without weighting. With ICP enabled (as a dumb echo on port 7), it takes the weighting into account (I'm waiting for an answer on how this works out when the ICP round trips are all very quick, as they are with a plain port 7 echo). The best solution would be an ICP daemon that delays ICP responses depending on the system load; then the load would be exactly the same on all Apaches. If the Apaches are all similar in performance this is an academic question anyway. Linux Director also does round-robin with static weighting unless feedbackd is installed.
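Such a daemon could be tiny. A toy sketch in Python (the port number and scale factor are arbitrary choices; squid would be pointed at this port instead of the plain echo port, which needs root):

    import os
    import socket
    import time

    # Squid weights parents by the round-trip time of its ICP probes, so delaying
    # the reply in proportion to the local load average steers traffic towards
    # the less loaded Apaches.
    PORT = 4827
    DELAY_PER_LOAD = 0.010   # 10 ms extra per unit of 1-minute load average

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(('', PORT))

    while True:
        payload, peer = sock.recvfrom(4096)
        time.sleep(os.getloadavg()[0] * DELAY_PER_LOAD)
        sock.sendto(payload, peer)    # echo the probe back, late if we are busy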
Gabriel Wicke
Gabriel Wicke wrote:
On Tue, 06 Jan 2004 15:09:01 -0800, Jimmy Wales wrote:
Gabriel Wicke wrote:
Bind DNS round robin (multiple A records)
How well does DNS round robin work in practice as compared to a proper load balancer?
I think it's the solution with the least control over how the traffic will actually be distributed. On the other hand, the squids might not be the bottleneck anytime soon, and by that time it would probably make sense to install a second, possibly distributed squid layer. It might as well save some time, because the work to install Heartbeat would only need to be done once, on the Squids. Once a second Squid layer is installed, any separate Linux Directors are not needed anymore (and the Squids will need Heartbeat then). Except maybe for balancing the database, but that's probably a different setup altogether.
I've never tried DNS round robin because people say it sucks. But I have no actual knowledge from first hand experience.
Me neither.
I do know that load balancing using the tools that linuxvirtualserver.org talks about works great, and gives a very good and predictable level of control.
I've asked the squid guys about the load balancing setup: if ICP is disabled (Apache doesn't support it), it does a dumb round-robin without weighting. With ICP enabled (as a dumb echo on port 7), it takes the weighting into account (I'm waiting for an answer on how this works out when the ICP round trips are all very quick, as they are with a plain port 7 echo). The best solution would be an ICP daemon that delays ICP responses depending on the system load; then the load would be exactly the same on all Apaches. If the Apaches are all similar in performance this is an academic question anyway. Linux Director also does round-robin with static weighting unless feedbackd is installed.
Gabriel Wicke
Here's another way of doing it without the load balancers: have a pool of say 36 IP addresses (chosen because lots of numbers exactly or approximately divide into it: other numbers would do fine), and do simple DNS round-robin to balance user traffic across these. Browsers and DNS caches will cache DNS lookups, but we don't care, so long as traffic is distributed roughly evenly across the IP addresses.
Now, have N << 36 Squid servers, and allocate the IP addresses _dynamically_ to the servers. If they are all on the same Ethernet, this can be done using ARP.
Very roughly: First set every machine to serve requests on all IP addresses on each Ethernet card in the shared pool. Requests will be seen so long as that logical interface is configured "up" on that machine.
Now start a polling process:
* every machine checks whether there is a live machine at each IP address in the pool at some reasonable pseudo-randomized interval, by doing an ARP request
* if it sees that another machine as well as itself is serving an IP address from the pool, it will take down that logical interface, to prevent allocation collisions
* when it sees there isn't, it looks at the number of addresses assigned to each machine (which it knows from its ARP scans)
* am I the machine with the lowest MAC address from the set of least-assigned machines? --> take over the IP by calling 'ifconfig' for that interface
The interval is pseudo-random to break symmetry to prevent synchronization by entrainment and thus make races very unlikely. (Note that races are self-fixing, see below).
Also, excessively heavily loaded machines could also _take down_ logical interfaces, to allow other machines to take up the load. This would also be done on the basis of exceeding a "fair share": a damping mechanism would be needed to stop excess re-balancing and to aid convergence. (Margin too small to contain details of formula for stable damping thresholds and timeouts).
Upside: simple and distributed. Race conditions will be very rare (p<0.001), and rapidly corrected. If we poll each IP address once a second from each machine, a new server will be allocated to a fallen IP address within substantially less than a minute (it gets less as N gets bigger). Overhead on each machine: user-space doing an ARP probe once a second, the kernel listening to N ARP requests per second.
Downside: when an IP address is transferred from one machine to another, any HTTP transfer in progress will time out. However, as this should only occur when nodes fail, or when new or rebooted nodes are brought up within the cluster, this is probably acceptable.
Doing things this way could possibly eliminate the need for the load-balancing boxes completely, and the scripts are no more complex than the heartbeat script.
Possible problem: partial failure of a system, so it responds to network requests but not application-level requests. Two possible resolutions:
* a watchdog-like process which performs local tests, and in the event of failure, shuts down all logical interfaces in the pool on that machine, allowing them to be picked up by other machines
* remote monitoring, and remote operator intervention via SSH or the power strip, if the above fails (automated STONITH does not make sense...)
The code can be written entirely in Python, and can simply use standard UNIX commands to do the low-level stuff.
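A rough sketch of the core polling loop (the IP pool, interface name and tools are placeholders, and the election rule and collision handling described above are left out):

    import random
    import subprocess
    import time

    POOL  = ['10.0.0.%d' % i for i in range(1, 37)]   # the 36 shared addresses
    IFACE = 'eth0'

    def is_claimed(ip):
        """ARP-probe one pool address; True if some machine answers."""
        return subprocess.call(['arping', '-q', '-c', '1', '-I', IFACE, ip]) == 0

    def claim(ip, idx):
        """Bring up a logical interface for an unclaimed address."""
        subprocess.call(['ifconfig', '%s:%d' % (IFACE, idx), ip, 'up'])

    while True:
        for idx, ip in enumerate(POOL):
            if not is_claimed(ip):
                # real code would apply the election rule first (lowest MAC among
                # the least-assigned machines wins) and would take the interface
                # down again if a collision is detected
                claim(ip, idx)
        time.sleep(1.0 + random.random())   # pseudo-random interval to break symmetry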
-- Neil
On Wed, 07 Jan 2004 01:34:02 +0000, Neil Harris wrote:
Here's another way of doing it without the load balancers: have a pool of say 36 IP addresses (chosen because lots of numbers exactly or approximately divide into it: other numbers would do fine), and do simple DNS round-robin to balance user traffic across these. Browsers and DNS caches will cache DNS lookups, but we don't care, so long as traffic is distributed roughly evenly across the IP addresses.
I'd expect a decent Squid box (with a lot of RAM) to do well over 1000 requests per second. I have no detailed information about the current traffic, but from my benchmarks I would guess the current machines do probably 100 requests/second all together (most likely less). So even if the simple DNS load balancing failed completely (which is unlikely), there would be no problem in using one Squid box alone. I see no point in a Heartbeat setup if one Squid couldn't do it ;-)
Gabriel Wicke
On Tue, Jan 06, 2004 at 03:09:01PM -0800, Jimmy Wales wrote:
Gabriel Wicke wrote:
Bind DNS round robin (multiple A records)
How well does DNS round robin work in practice as compared to a proper load balancer?
I've never tried DNS round robin because people say it sucks. But I have no actual knowledge from first hand experience. I do know that load balancing using the tools that linuxvirtualserver.org talks about works great, and gives a very good and predictable level of control.
DNS round robin sucks because it (i) doesn't provide a way to check for a server's outage and (ii) is not suited for a farm of differently sized servers.
Having a heartbeat between the two servers solves issue (i): in case of an outage of one server, the other server takes over its IP, and both round-robin IPs point to the same server.
Issue (ii) can be addressed by having similar boxes. Since those have to be purchased, this should be easy to achieve.
In the office, we ran a website using DNS RR. It was about 500,000 hits per day; 52% of the requests hit the "first" server and 48% the second. We changed to HW load balancers recently since the number of web servers increased.
==Sizing==
Regarding sizing of the squid boxes: We set up internal squids for web caching and noticed that squid does not support SMP. One squid process can utilize only one CPU. My proposed sizing:
1 CPU at about 2GHz
2 GB RAM
2*36G 15kRPM SCSI disks
Remote controller (rILO, eRIC) for remote "lights out" management
1 unit rack mountable
Redundant power supply
Being most familiar with Compaq's Intel servers, I configured the box in their webshop and it was about 4,000$
Regarding CPU size: http://web.archive.org/web/20030605225127/hermes.wwwcache.ja.net/servers/squ... states that a dual Ultra Sparc at 170 MHz was able to process 1.8 million requests per day. This is roughly the current load of en:. A 2 GHz x86 CPU beats the Sparc by far. If we want to spend more money on the box, add memory.
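(For scale: 1.8 million requests per day averages out to 1,800,000 / 86,400 ≈ 21 requests per second, though peaks will be several times that average.)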
Regards,
JeLuF
Jens Frank wrote:
1 CPU at about 2GHz
2 GB RAM
2*36G 15kRPM SCSI disks
Remote controller (rILO, eRIC) for remote "lights out" management
1 unit rack mountable
Redundant power supply
Being most familiar with Compaq's Intel servers, I configured the box in their webshop and it was about 4,000$
I just looked at a configuration at Silicon Mechanics. (But, I am also considering another vendor with similar prices, Penguin Computing. And I'm open to more suggestions, as always.)
SM-1151SATA, http://www.siliconmechanics.com/i1511/p4-server.php
1 CPU at 2.4GHZ P4. (They offer a cheaper Celeron, I'm not sure if the P4 is worth the extra money for webserving or not.)
2 GB Ram (2x1GB, which is a little more expensive than 4x512, but leaves slots open for future growth.)
2*80G 7.2kRPM, 8MB cache SATA drives
SATA RAID controller (we could run raid 0 or 1 for redundancy or speed - since a webserver ought not hit the disk all that much, I think redundancy is more important)
No CD-ROM, yes floppy, Red Hat 9
The price on this box is $1951. Going with a cheaper processor, and (4x512) RAM, I can get that down to $1579.
----
SATA vs. SCSI -- SCSI is theoretically faster, although many say that the practical difference is minimal. SCSI is theoretically more reliable, but with 2 SATA drives in RAID redundant configuration, this is a very minor issue?
Anyhow, for reference, the same exact configurations as above, but with SCSI instead of SATA (and only 36Gb disks instead of 80Gb since disk size on the webserver boxes is not so important):
$2521/$1749.
Just for clarity in case people dozed off during the boring bits above:
SATA 80GB RAID: $1951 or $1579 depending on ram/processor
SCSI 36GB RAID: $2521 or $1749 depending on ram/processor
----
I am thinking of buying several of these boxes, to be potentially used in these capacities:
load balancers (2)
squid proxies (3)
web servers (4)
backup db machine (1) (this one should be stuffed with extra ram and the fastest processor, I guess)
Many alternative configurations are possible, especially if, as the current trend seems to suggest, we eliminate the load balancers and simply count on squid+heartbeat.
Then we would have geoffrin, the big dog, as the database machine once it's fixed, and the other old hardware eventually moved to the new colo to be mailserver, etc.
Our budget is $20,000 but we have more money than that by far, donated by people who knew that we were already over $20,000. We don't want to go crazy and spend it because we have it -- better to take a wait-and-see attitude, because hardware will only get cheaper, and our needs will be better understood in a few months. However, going a little bit over or under should be no cause for concern, if the purpose is valid.
--Jimbo
On Thu, 8 Jan 2004, Jimmy Wales wrote:
SATA vs. SCSI -- SCSI is theoretically faster, although many say that the practical difference is minimal. SCSI is theoretically more reliable, but with 2 SATA drives in RAID redundant configuration, this is a very minor issue?
SCSI performance is better in practice too, especially when you have multiple hard disks. But the work those boxes will have to do (load balancers, squid, web serving, mail server) does not seem HD intensive, if they have enough RAM.
SCSI disks are usually more reliable, simply because disk manufacturers test them more and use higher quality components (hence the bigger price). If our boxes are redundant, and each box has a RAID 1 array, I don't think that going with IDE/SATA is a problem.
Alfio
On Thu, 08 Jan 2004 14:25:22 +0100, Alfio Puglisi wrote:
On Thu, 8 Jan 2004, Jimmy Wales wrote:
SATA vs. SCSI -- SCSI is theoretically faster, although many say that the practical difference is minimal. SCSI is theoretically more reliable, but with 2 SATA drives in RAID redundant configuration, this is a very minor issue?
SCSI performance is better in practice too, especially when you have multiple hard disks. But the work those boxes will have to do (load balancers, squid, web serving, mail server) does not seem HD intensive, if they have enough RAM.
I agree. IMO only the database servers and (if used) the fileservers for images/media need fancy HDs.
Gabriel Wicke
On Thu, Jan 08, 2004 at 04:37:58AM -0800, Jimmy Wales wrote:
Jens Frank wrote:
1 CPU at about 2GHz
2 GB RAM
2*36G 15kRPM SCSI disks
Remote controller (rILO, eRIC) for remote "lights out" management
1 unit rack mountable
Redundant power supply
Being most familiar with Compaq's Intel servers, I configured the box in their webshop and it was about 4,000$
I just looked at a configuration at Silicon Mechanics. (But, I am also considering another vendor with similar prices, Penguin Computing. And I'm open to more suggestions, as always.)
SM-1151SATA, http://www.siliconmechanics.com/i1511/p4-server.php
1 CPU at 2.4GHZ P4. (They offer a cheaper Celeron, I'm not sure if the P4 is worth the extra money for webserving or not.)
The 2.6 GHz is $1 cheaper. If you want to buy the same boxes for all purposes (squid+apache), I'd go for the faster CPU.
2 GB Ram (2x1GB, which is a little more expensive than 4x512, but leaves slots open for future growth.)
The SM-1151SATA has non-ECC memory. Being used to ECC and even hot spare memory, I'm not comfortable with this. 2*1GB is what I'd use, too.
2*80G 7.2kRPM, 8MB cache SATA drives
SATA RAID controller (we could run raid 0 or 1 for redundancy or speed - since a webserver ought not hit the disk all that much, I think redundancy is more important)
Looking at pliny and larousse, from the numbers I've been told, their disks are highly busy, bi and bo are very high in vmstat.
No CD-ROM, yes floppy, Red Hat 9
The price on this box is $1951. Going with a cheaper processor, and (4x512) RAM, I can get that down to $1579.
Compared with the above Compaq it looks cheap. It has no ECC, no remote management, no redundant power supply, and no 15kRPM disks, though.
Remote management cards are really useful for coping with crashed servers. It starts with being able to read the kernel panic message and being able to power the server off and on, and advanced ones are even able to insert a "virtual floppy" during booting (remember the crash we had while trying to remotely update the kernel?). http://www.techland.co.uk/index/eric is a vendor-independent RMB. Compaq's ILO boards are much more powerful, though.
SATA vs. SCSI -- SCSI is theoretically faster, although many say that the practical difference is minimal. SCSI is theoretically more reliable, but with 2 SATA drives in RAID redundant configuration, this is a very minor issue?
There probably isn't much difference between a SCSI and a SATA interface for the same disk. The difference is mainly in the disk seek time, since SCSI disks are available at 15kRPM while SATA are mostly 7.2kRPM.
Anyhow, for reference, the same exact configurations as above, but with SCSI instead of SATA (and only 36Gb disks instead of 80Gb since disk size on the webserver boxes is not so important):
$2521/$1749.
Just for clarity in case people dozed off during the boring bits above:
SATA 80GB RAID: $1951 or $1579 depending on ram/processor
SCSI 36GB RAID: $2521 or $1749 depending on ram/processor
Strange that the difference is 600$ for one configuration and 170$ for the other. Thinking of Squid, I'd rather spend the money for 2 Gigs of additional memory than for the faster disks. Memory should dramatically reduce I/O load on proxies. 2 additional GB are 762$.
I am thinking of buying several of these boxes, to be potentially used in these capacities:
load balancers (2)
squid proxies (3)
web servers (4)
backup db machine (1) (this one should be stuffed with extra ram and the fastest processor, I guess)
Many alternative configurations are possible, especially if, as the current trend seems to suggest, we eliminate the load balancers and simply count on squid+heartbeat.
You'd need heartbeat for the load balancers, anyway. Doing it on the squids would just save two boxes.
Why three squids? Two would be enough, and even one should be able to handle the load if the other one fails.
Should the DB backup machine really have a different sizing from the active machine? From what I've seen Brion say, geoffrin is rather idle, so perhaps yes. On the other hand, administration would be easier with two identical boxes, or at least the same CPU family. A small Altus 1000 E dual Opteron (single Opterons are not offered by penguincomputing?) with 2 GB RAM (ECC!), 2*80 disk, no CDROM, is at 2,810 $. We could probably get away with two of these for web serving instead of four, and save 2 units of rackspace.
2 squids, single CPU (Relion 1X, 2GB, 2.66 GHz P4, 2*80 GB): 2x 2,150$
2 webservers, dual Opteron (Altus 1000E, 2GB, 2*Opteron 240, 2*80GB): 2x 2,810$
1 DB backup box (same): 1x 2,810$
=========
Total: 12,730$
Using 4 webservers at $1951 each would cost 2,184$ more, plus 2RU space consumption.
Then we would have geoffrin, the big dog, as the database machine once it's fixed, and the other old hardware eventually moved to the new colo to be mailserver, etc.
Our budget is $20,000 but we have more money than that by far, donated by people who knew that we were already over $20,000. We don't want to go crazy and spend it because we have it -- better to take a wait-and-see attitude, because hardware will only get cheaper, and our needs will be better understood in a few months. However, going a little bit over or under should be no cause for concern, if the purpose is valid.
Very good point.
Regards,
JeLuF
PS: Sorry if this became too long.
On Thu, 08 Jan 2004 15:11:09 +0100, Jens Frank wrote:
2*80G 7.2kRPM, 8MB cache SATA drives
SATA RAID controller (we could run raid 0 or 1 for redundancy or speed - since a webserver ought not hit the disk all that much, I think redundancy is more important)
Looking at pliny and larousse, from the numbers I've been told, their disks are highly busy, bi and bo are very high in vmstat.
This should be the current cache; it could be switched off once the Squids are working. I don't even see a need for Apache logging; the logging is done on the Squids. The only potentially disk-accessing operation would be serving images, which will get picked up by the Squids as well.
My proposal for the Apaches is heaps of CPU, a single ordinary drive, and enough memory to prevent any swapping. And some more of them. We don't need high reliability as long as we have enough of them.
SATA vs. SCSI -- SCSI is theoretically faster, although many say that the practical difference is minimal. SCSI is theoretically more reliable, but with 2 SATA drives in RAID redundant configuration, this is a very minor issue?
I might be wrong, but the only gain in having a big disk might be a faster startup time ;-)
You'd need heartbeat for the load balancers, anyway. Doing it on the squids would just save two boxes.
Why three squids? Two would be enough, and even one should be able to handle the load if the other one fails.
+1
Should the DB backup machine really have a different sizing from the active machine? From what I've seen Brion say, geoffrin is rather idle, so perhaps yes. On the other hand, administration would be easier with two identical boxes, or at least the same CPU family. A small Altus 1000 E dual Opteron (single Opterons are not offered by penguincomputing?) with 2 GB RAM (ECC!), 2*80 disk, no CDROM, is at 2,810 $. We could probably get away with two of these for web serving instead of four, and save 2 units of rackspace.
+1 again ;-)
Gabriel Wicke
On Thu, Jan 08, 2004 at 03:31:00PM +0100, Gabriel Wicke wrote:
On Thu, 08 Jan 2004 15:11:09 +0100, Jens Frank wrote:
2*80G 7.2kRPM, 8MB cache SATA drives
Looking at pliny and larousse, from the numbers I've been told, their disks are highly busy, bi and bo are very high in vmstat.
This should be the current cache; it could be switched off once the Squids are working. I don't even see a need for Apache logging; the logging is done on the Squids. The only potentially disk-accessing operation would be serving images, which will get picked up by the Squids as well.
Agreed.
My proposal for the Apaches is heaps of CPU, a single ordinary drive, and enough memory to prevent any swapping. And some more of them. We don't need high reliability as long as we have enough of them.
Jimbo, how much rackspace can we use?
You'd need heartbeat for the load balancers, anyway. Doing it on the squids would just save two boxes.
Why three squids? Two would be enough, and even one should be able to handle the load if the other one fails.
+1
Plus one box or plus one who thinks the same?
JeLuF
Jens Frank wrote:
Jimbo, how much rackspace can we use?
Rackspace isn't an issue at all.
When I asked the world for $20,000, my intention was that this would see us through a pretty steep growth curve for a while. I'd like for us to set up a system now that's robust and has a very large amount of reserve capacity, so that as far as we can manage, the site will be highly responsive to users for some time to come *even in a case where we have had bad hardware luck*.
If two squids are good enough and we can survive on one, then I want three. If 3 webservers are handling the load just fine, and we can survive on 2, then I want 5. If geoffrin dies, I want to have another machine that can take the load without breaking a sweat.
--Jimbo
On Thu, Jan 08, 2004 at 01:52:13PM -0800, Jimmy Wales wrote:
Jens Frank wrote:
Jimbo, how much rackspace can we use?
Rackspace isn't an issue at all.
Good to hear. Knowing the prices for colo's in the US, I have to say "Thank you".
When I asked the world for $20,000, my intention was that this would see us through a pretty steep growth curve for a while. I'd like for us to set up a system now that's robust and has a very large amount of reserve capacity, so that as far as we can manage, the site will be highly responsive to users for some time to come *even in a case where we have had bad hardware luck*.
If two squids are good enough and we can survive on one, then I want three. If 3 webservers are handling the load just fine, and we can survive on 2, then I want 5. If geoffrin dies, I want to have another machine that can take the load without breaking a sweat.
You said something else earlier today, that we shouldn't spend the money just because we have it. Being tight-fisted, I somewhat have a problem with "overdelivery". Redundancy is fine, we need it. Having a spare server on site is probably cheaper than a service contract. But 3? Please, someone convince me that we won't find a better way to spend the money in half a year.
Regards,
JeLuF
Jens Frank wrote:
You said something else earlier today, that we shouldn't spend the money just because we have it. Being tight-fisted, I somewhat have a problem with "overdelivery". Redundancy is fine, we need it. Having a spare server on site is probably cheaper than a service contract. But 3? Please, someone convince me that we won't find a better way to spend the money in half a year.
I hear what you're saying. With the right scalable architecture, adding capacity in the future should be as easy as adding a new server or two, copying over some software, and editing a few config files. It seems obvious that buying too much hardware too early is not good, because cheaper and better hardware will be available 6 months from now.
Even so, I don't want us to be too stingy. What I'm hearing loud and clear is that people want wikipedia to be fast, highly responsive, and reliable. We don't want to be sketchy about reaching those goals, we want to embrace those goals wholeheartedly and make sure we aren't sitting on money in the bank while the site lags.
--Jimbo
On Thu, 08 Jan 2004 14:16:48 -0800, Jimmy Wales wrote:
Even so, I don't want us to be too stingy. What I'm hearing loud and clear is that people want wikipedia to be fast, highly responsive, and reliable. We don't want to be sketchy about reaching those goals, we want to embrace those goals wholeheartedly and make sure we aren't sitting on money in the bank while the site lags.
In the case of the Squids it wouldn't exactly be a case of 'survive on one'. A Pentium 2.4 or similar with 4 gigs of RAM will do at the very minimum 15 times the current traffic; I would bet it will do 25 times the current load. I fear Wikipedia won't be that big in three months' time...
From the current plans it seems that there will be two database servers; if we went for three Squids it would make sense to buy a third DB as well. Same for any load balancers (if we want any) and file servers. That would be very expensive. It's not hard to add another Squid if one goes down for a longer time (no data to transfer for once), but I would guess it's harder to do with the DB. The other thing is that three Squids might be harder to connect with Heartbeat; I'm pretty sure it's possible, but definitely more work. How much RAM is in larousse/pliny? One of them could be readily configured as a third (possibly standby) Squid and DNS/mailserver; the Squids are not demanding after all.
I'd propose to get going as soon as possible with at least 6 Apaches, 2 Squids, 2 DBs and 2 fileservers (probably one of them combined with a DB?), plus a mail/DNS box. Once that's running we can add a third Squid and more Apaches or DBs, depending on the load we observe.
And we can do some testing with mirrors (possibly including Squid3/ESI) until then. Somebody has to pay the bandwidth after all ;-)
Gabriel Wicke wrote:
In the case of the Squids it wouldn't exactly be a case of 'survive on one'. A Pentium 2.4 or similar with 4 gigs of RAM will do at the very minimum 15 times the current traffic; I would bet it will do 25 times the current load.
Oh, o.k., I'm convinced then. 1 squid with a heartbeat friend ready to take over (and/or load balanced as well) is good.
I'd propose to get going as soon as possible with at least 6 Apaches, 2 Squids, 2 DBs and 2 fileservers (probably one of them combined with a DB?), plus a mail/DNS box. Once that's running we can add a third Squid and more Apaches or DBs, depending on the load we observe.
I like the idea of 6 apaches.
I want to think more about fileservers today. I'm assuming that those are primarily for the images? Can/should they be more or less the same configuration as the 6 apaches?
I think there's enormous flexibility benefits of having lots of identical hardware, *when it makes sense* to do so. When it doesn't make sense, it doesn't. I.E. if we have 1 box that needs to be 'bigger' in some ways, there's no sense in spending money on 6 other boxes just to make them all identical.
But if the requirements are roughly similar for different roles, then we shouldn't try to overspecialize the hardware, I think.
On Fri, 09 Jan 2004 04:13:57 -0800, Jimmy Wales wrote:
I want to think more about fileservers today. I'm assuming that those are primarily for the images? Can/should they be more or less the same configuration as the 6 apaches?
Nooo, the opposite! Apaches don't use their disk but need a lot of CPU; the fileservers will need little CPU and a good RAID 1. It's only about 2 gigs of pics at the moment (are there sound files yet?), so let's just store them on the DB servers' filesystem. Those will be connected with heartbeat anyway, so no extra work for failover. If Wikipedia suddenly starts to use a lot of big media files we can easily move the files to separate servers.
I think there's enormous flexibility benefits of having lots of identical hardware, *when it makes sense* to do so. When it doesn't make sense, it doesn't. I.E. if we have 1 box that needs to be 'bigger' in some ways, there's no sense in spending money on 6 other boxes just to make them all identical.
Especially for the Apaches: they'd cost at least double the necessary price while still being unsuitable for DB or Squid.
But if the requirements are roughly similar for different roles, then we shouldn't try to overspecialize the hardware, I think.
The only real specialization for the Apaches is no RAID. Easy to add (what for, by the way?). We could stick more RAM into the Apaches and use them as Squids; I don't think it's feasible to use them for the main DB. Although... RAM does wonders.
We should really get information about the minimum ram needed for the Apaches!
Gabriel Wicke wrote:
Nooo, the opposite! Apaches don't use their disk but need a lot of CPU; the fileservers will need little CPU and a good RAID 1. It's only about 2 gigs of pics at the moment (are there sound files yet?), so let's just store them on the DB servers' filesystem. Those will be connected with heartbeat anyway, so no extra work for failover. If Wikipedia suddenly starts to use a lot of big media files we can easily move the files to separate servers.
I agree. For now I will forget about separate fileserver machines.
--Jimbo
Gabriel Wicke wrote:
We should really get information about the minimum ram needed for the Apaches!
Also, in the short term, if we get a ram configuration that's at least acceptable, this is one of the easiest and likely cheapest possible upgrades in the future. If we start with 1 gig each in many boxes and find that we're starting to strain a bit, then at that time, under real world conditions, we could study the situation to determine if more boxes (i.e. more cpu, etc.) is the answer, or just more ram.
--Jimbo
Jens Frank wrote:
On Thu, Jan 08, 2004 at 01:52:13PM -0800, Jimmy Wales wrote:
When I asked the world for $20,000, my intention was that this would see us through a pretty steep growth curve for a while. I'd like for us to set up a system now that's robust and has a very large amount of reserve capacity, so that as far as we can manage, the site will be highly responsive to users for some time to come *even in a case where we have had bad hardware luck*.
If two squids are good enough and we can survive on one, then I want three. If 3 webservers are handling the load just fine, and we can survive on 2, then I want 5. If geoffrin dies, I want to have another machine that can take the load without breaking a sweat.
You said something else earlier today, that we shouldn't spend the money just because we have it. Being tight-fisted, I somewhat have a problem with "overdelivery". Redundancy is fine, we need it. Having a spare server on site is probably cheaper than a service contract. But 3? Please, someone convince me that we won't find a better way to spend the money in half a year.
I agree with this wholeheartedly. The same money now used on "useless" redundancy or "growth capacity" will buy 50% more in 6 months.
I am advising everyone in my environment to always take "one step back" in hardware; this will be three steps back in price. Remember that the top of the line at any moment is out of date in two months' time anyway, so you can just as well buy two-month-old technology right now.
A similar argument holds for memory capacity. I have often seen that people including myself bought an "extensible" computer, but before you actually want to extend it the parts are no longer available or perform badly in your system. If you're really going to be growing memory to 16GB on a system in the future, buying the whole 16GB at that point may be more reliable and cheaper than going for very special modules right now.
ECC is slower and more expensive, but it will prevent the occasional memory error from propagating into disk files and thereby potentially corrupting data. This might be advisable for a project like Wikipedia, but only on database and/or file servers.
The only thing where I have personally seen that "expensive pays" is in SCSI vs. ATA (no experience with SATA): The SCSI protocol is so much smarter that it can be not 2 but 10 times faster than ATA if addressed in parallel by concurrent processes. An ftp server working with ATA that I have used would grind to a standstill whenever a backup was running....
Rob
On Thu, 8 Jan 2004, Rob Hooft wrote:
ECC is slower and more expensive, but it will prevent the occasional memory error from propagating into disk files and thereby potentially corrupting data. This might be advisable for a project like Wikipedia, but only on database and/or file servers.
ECC memory is just as fast as non-ECC memory. In fact, they are exactly the same. ECC modules have more bits to accommodate the error correction bits. The actual correction data is handled in hardware by the motherboard chipset and is thus "free". We aren't in the world of 3-chip fake parity memory of yesteryear.
Memory bit faults are more likely to cause the machine to fail than to corrupt data on disk. Having seen numerous ECC "corrected bit faults" from various machines, and other non-ECC equipped machines simply panic or just hard lock, I highly recommend the added expense.
(And depending on the quality of your server hardware, you may be REQUIRED to use registered ECC memory.)
The only thing where I have personally seen that "expensive pays" is in SCSI vs. ATA (no experience with SATA): The SCSI protocol is so much smarter that it can be not 2 but 10 times faster than ATA if addressed in parallel by concurrent processes. An ftp server working with ATA that I have used would grind to a standstill whenever a backup was running....
To be fair, SCSI and modern ATA are basically the same command set. The difference is the physical layer. A SCSI bus will beat an IDE bus EVERY SINGLE TIME. SATA is much more closely aligned to SCSI, but until there are "native" SATA drives -- vs. IDE drives with integrated SATA<->IDE bridges -- I'm staying away from them. (They do exist; it's nearly impossible to tell without using one.)
(FWIW, I can get 21MB/s from my maxtor IDE drives and 61MB/s from my seagate fibre channel drives. Seagate's latest 200GB drives get about 50MB/s, but I've not actually put one to task, yet.)
--Ricky
Jimbo,
will the new servers be installed directly to the new colo or go to California first?
JeLuF
Jens Frank wrote:
will the new servers be installed directly to the new colo or go to California first?
Directly in the new colo, because moving them once they are in service would be very difficult. One of the things that has persuaded me to move is that both Bomis and Wikimedia are about to undergo major hardware upgrades, thus providing an easy opportunity for migration to a new facility.
The main thing wrong with our current facility in San Diego is that it's not staffed 24/7. We have access 24/7, but only one person in my organization lives in San Diego anymore, and he sometimes goes on vacation.
--Jimbo
Jens Frank wrote:
1 CPU at 2.4GHZ P4. (They offer a cheaper Celeron, I'm not sure if the P4 is worth the extra money for webserving or not.)
The 2.6 GHz is $1 cheaper. If you want to buy the same boxes for all purposes (squid+apache), I'd go for the faster CPU.
To be sure that I understand. For essentially the same price, you recommend a 2.6Ghz Celeron over a 2.4Ghz Pentium? That sounds fine to me, I just don't know how the differences between Celeron and Pentium affect performance for webserving.
The SM-1151SATA has non-ECC memory. Being used to ECC and even hot spare memory, I'm not comfortable with this. 2*1GB is what I'd use, too.
Can you elaborate on this? All my recent server purchases have been from Penguin, and they are always ECC memory, which costs more and performs less, but which is supposed to be more reliable. But is it really? How much more reliable? Or have I misunderstood the difference completely?
My new G5 uses non-ECC ram, and it seems to work just fine. I have had plenty of bad sticks of ECC ram in my life.
Would you consider the use of non-ECC ram to be a "deal killer" on the SM-1151SATA, or just something that's not quite ideal?
Remote management cards are really useful for coping with crashed servers. It starts with being able to read the kernel panic message and being able to power the server off and on, and advanced ones are even able to insert a "virtual floppy" during booting (remember the crash we had while trying to remotely update the kernel?). http://www.techland.co.uk/index/eric is a vendor-independent RMB. Compaq's ILO boards are much more powerful, though.
*nod* I'll look into these.
How does a remote management card compare to serial console redirection?
I couldn't find a price for that Eric card, what do remote server management cards cost, generally?
Keep in mind that we are going to have easy hands-on access to the servers, and we're going to be in a facility that's staffed 24x7 with free "remote hands" service for things like pushing a reset button and typing a few things on the console, sticking in a floppy, etc.
Just for clarity in case people dozed off during the boring bits above:
SATA 80GB RAID: $1951 or $1579 depending on ram/processor
SCSI 36GB RAID: $2521 or $1749 depending on ram/processor
Strange that the difference is 600$ for one configuration and 170$ for the other.
I see a difference of $372 and $772. You're right, though, it is strange. (Oh, I see, you were differencing in the other direction, but either way, there is something odd here. I will study it to make sure I didn't make a mistake.)
Thinking of Squid, I'd rather spend the money for 2 Gigs of additional memory than for the faster disks. Memory should dramatically reduce I/O load on proxies. 2 additional GB are 762$.
And if this is non-ECC ram, then perhaps 2 additional GB are less than that from other vendors.
PS: Sorry if this became too long.
I want everyone to write things this long, it's the only way for us to really study this out. :-) I will read every word that everyone says, because I have a very strong responsibility to make wise purchases with other people's money.
--Jimbo
On Thu, 08 Jan 2004 14:03:19 -0800, Jimmy Wales wrote:
Jens Frank wrote:
1 CPU at 2.4GHZ P4. (They offer a cheaper Celeron, I'm not sure if the P4 is worth the extra money for webserving or not.)
The 2.6 GHz is $1 cheaper. If you want to buy the same boxes for all purposes (squid+apache), I'd go for the faster CPU.
To be sure that I understand. For essentially the same price, you recommend a 2.6Ghz Celeron over a 2.4Ghz Pentium? That sounds fine to me, I just don't know how the differences between Celeron and Pentium affect performance for webserving.
I think for the Apaches we should look for the most CPU we can get for the money: no remote control, no RAID, but enough RAM to not swap. It's very likely that we could get much more CPU power for the money by buying nine modest 2GHz machines rather than five 2.6GHz Pentium 4s with RAID. Prices are usually disproportionately higher for the latest hardware compared to the average stuff from four months ago. Double the price often doesn't buy double the performance. Rack space doesn't seem to be a problem, so the only drawback would be the installation work. This shouldn't be that bad; after configuring one machine we would just copy the image over to the others anyway.
The SM-1151SATA has non-ECC memory. Being used to ECC and even hot spare memory, I'm not comfortable with this. 2*1GB is what I'd use, too.
What's the current memory consumption on pliny/larousse?
(...)
Keep in mind that we are going to have easy hands-on access to the servers, and we're going to be in a facility that's staffed 24x7 with free "remote hands" service for things like pushing a reset button and typing a few things on the console, sticking in a floppy, etc.
For me, all these things just mean that having many cheap Apaches would bring more performance.
I wouldn't want to use a cheap machine for the DB or file servers, nor for the squids (they benefit a lot from RAM), but the Apaches are the place where the extra dollars for high-end hardware would buy very little.
What's Google based on? Are they even using desktop PCs for the CPU work?
Gabriel Wicke wrote:
It's very likely that we could get much more CPU power for the money by buying nine modest 2GHz machines rather than five 2.6GHz Pentium 4s with RAID.
Well, the lowest machines that I can find widely available are 2.4GHz... the state of the art these days is 3.2GHz. Upgrading from a 2.4GHz to a 2.6GHz is only $28 per CPU (for the Celerons).
Do you know of a vendor where I could buy 9 2Ghz machines for the price of 5 2.6Ghz machines? Or was that just a hypothetical example?
Someone offered to let me in on their deal with a vendor for last-generation Pentium III boxes, at $800 each. For simple webserving, does it make sense to look at those? Obviously we could get twice as many of those for the money...
At some point, of course, the management of a lot of servers becomes a hassle. But I don't suppose there's much difference between managing 6 identical webservers and 12.
--Jimbo
On Fri, 09 Jan 2004 04:22:05 -0800, Jimmy Wales wrote:
Gabriel Wicke wrote:
Well, the lowest machines that I can find widely available are 2.4GHz... the state of the art these days is 3.2GHz. Upgrading from a 2.4GHz to a 2.6GHz is only $28 per CPU (for the Celerons).
I've found the CPU to be much less of a price factor for the Apache-class servers than the form factor and RAM.
Do you know of a vendor where I could buy 9 2Ghz machines for the price of 5 2.6Ghz machines? Or was that just a hypothetical example?
Sure, Dell for one ;-) Tower PCs might be impossible in the colo, but they would save a lot of money while offering the same performance. They're simply produced in different quantities than those 19" boxes.
Someone offered to let me in on their deal with a vendor for last-generation Pentium III boxes, at $800 each. For simple webserving, does it make sense to look at those? Obviously we could get twice as many of those for the money...
See my post re the call for configurations; we can get a P4 2.4 for that.
At some point, of course, the management of a lot of servers becomes a hassle. But I don't suppose there's much difference between managing 6 identical webservers and 12.
If we have many and one goes down, does it really matter? We could wait for another one to fail before it's worth the hassle. Spare parts would be the same for all anyway (and commonly available).
I'm not a hardware expert: should we avoid companies like Dell? Is there a quality problem with their hardware? I've only looked at their website so far (because I don't know the really cheap companies in the US). In Germany there are other major hardware vendors that are cheaper than Dell with the same support/warranty level.
Gabriel Wicke wrote:
I'm not a hardware expert- should we avoid companies like Dell? Is there a quality problem with their hardware?
Dell is a bit expensive compared to smaller outfits that specialize in server hardware. I've used penguincomputing.com a lot in the past, with good results. I have 1 server from racksaver.com, it's a good machine, but on their website the other day, they seem to be specializing more in the cluster/beowulf world than commodity webservers. And finally, just recently the CEO of livejournal.com recommended siliconmechanics to me.
--Jimbo
On Fri, 9 Jan 2004, Gabriel Wicke wrote:
I'm not a hardware expert- should we avoid companies like Dell? Is there a quality problem with their hardware?
There's no reason to avoid Dell. Their machines work. And when they don't, Dell has a warehouse of parts to fix them. And their hardware isn't unreasonably priced (esp. if you shop at their "outlet" web store -- returns, refused orders, and other such dumb stuff that makes things "not new").
--Ricky
On Thu, Jan 08, 2004 at 02:03:19PM -0800, Jimmy Wales wrote:
Jens Frank wrote:
1 CPU at 2.4GHz P4. (They offer a cheaper Celeron, I'm not sure if the P4 is worth the extra money for webserving or not.)
The 2.6 GHz is $1 cheaper. If you want to buy the same boxes for all purposes (squid+apache), I'd go for the faster CPU.
To be sure that I understand: for essentially the same price, you recommend a 2.6GHz Celeron over a 2.4GHz Pentium? That sounds fine to me, I just don't know how the differences between Celeron and Pentium affect performance for webserving.
No, I recommend the 2.6 Pentium; according to their web shop, the 2.6GHz version is cheaper than the 2.4GHz Pentium.
The SM-1151SATA has non-ECC memory. Being used to ECC and even hot spare memory, I'm not comfortable with this. 2*1GB is what I'd use, too.
Can you elaborate on this? All my recent server purchases have been from Penguin, and they are always ECC memory, which costs more and performs less, but which is supposed to be more reliable. But is it really? How much more reliable? Or have I misunderstood the difference completely?
Without ECC, the first you learn of a memory fault is a kernel panic.
With ECC, you get a message in the log file and can call Penguin to send a new module before it's completely broken.
My new G5 uses non-ECC RAM, and it seems to work just fine. I have had plenty of bad sticks of ECC RAM in my life.
Except for "dead on arrival" I've never had a failure of an entire ECC stick. But might be that the price difference makes a little quality difference, too.
Would you consider the use of non-ECC ram to be a "deal killer" on the SM-1151SATA, or just something that's not quite ideal?
The latter. ECC is just like a redundant power supply.
Remote management cards are really useful for coping with crashed servers. It starts with being able to read the kernel panic message and to power the server off and on, and the advanced ones can even insert a "virtual floppy" during booting (remember the crash we had while trying to remotely update the kernel?). http://www.techland.co.uk/index/eric is a vendor-independent remote management board. Compaq's iLO boards are much more powerful, though.
*nod* I'll look into these.
How does a remote management card compare to serial console redirection?
It provides a way to hit reset and power off/power on. But with remote helping hands, we perhaps won't need this.
I couldn't find a price for that Eric card, what do remote server management cards cost, generally?
Compaq charges some US$600.
Keep in mind that we are going to have easy hands-on access to the servers, and we're going to be in a facility that's staffed 24x7, with free "remote hands" service for things like pushing a reset button, typing a few things on the console, sticking in a floppy, etc.
Yup. Didn't know this when proposing the remote cards.
Thinking of Squid, I'd rather spend the money on 2 GB of additional memory than on faster disks. Memory should dramatically reduce I/O load on the proxies. 2 additional GB are $762.
And if this is non-ECC RAM, then perhaps 2 additional GB cost less than that from other vendors.
Those were the prices from SM's web configuration tool.
JeLuF
Jens Frank wrote:
No, I recommend the 2.6 Pentium; according to their web shop, the 2.6GHz version is cheaper than the 2.4GHz Pentium.
Oh, I see what you are saying now, and you're right. That's curious, isn't it?
It provides a way to hit reset and power off/power on. But with remote helping hands, we perhaps won't need this.
*nod* We also have a remote power port now, donated.
--Jimbo
One last thought...
Jens Frank wrote:
1 CPU at about 2GHz
2 GB RAM
2*36G 15kRPM SCSI disks
Remote controller (rILO, eRIC) for remote "lights out" management
1 unit, rack mountable
Redundant power supply
Jens priced this configuration at $4000. I was able to replicate the same performance configuration for $2500 or less. I think that for $2000, or half the price, I could match the performance of the box he describes.
However, my proposal includes neither remote controllers nor redundant power supplies. Here is my thinking; please think with me about whether this is right or not.
1. If my server costs $2000 and Jens's costs $4000, then I don't need redundant power supplies, since I have redundant *servers*. With only a little bit of work to make sure that dead machines are taken out of service (a sketch of such a check follows this list), we have a lot more capacity for the same money.
2. A remote controller would be sweet in our current facility. But we are going to move to a facility with the following features: (1) 24-hour-a-day remote "hands on" service, (2) geographically close to me and my employees so that we can physically drive over (more on this later today or tomorrow after I tour one more facility).
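A minimal sketch of the "little bit of work" from point 1, assuming hypothetical hostnames and probe URL: a script, run from cron or a monitoring box, that checks each webserver over HTTP and reports the dead ones so they can be dropped from the DNS entry or the director's pool.

```python
#!/usr/bin/env python
# Hypothetical health check (not from the original mail): probe each
# webserver over HTTP and list the ones that should be pulled out of
# rotation. Hostnames, the probe path and the timeout are placeholders.
import socket
import urllib.request

WEBSERVERS = ["web1.example.org", "web2.example.org", "web3.example.org"]
PROBE_PATH = "/wiki/Main_Page"   # any cheap page that exercises the stack
TIMEOUT = 5                      # seconds before a server counts as dead

def is_alive(host, port=80):
    """Return True if the server answers the probe request in time."""
    url = "http://%s:%d%s" % (host, port, PROBE_PATH)
    try:
        resp = urllib.request.urlopen(url, timeout=TIMEOUT)
        return resp.getcode() < 500
    except (OSError, socket.timeout):
        return False

if __name__ == "__main__":
    for host in WEBSERVERS:
        if not is_alive(host):
            print("take out of service:", host)
```

Putting a recovered server back into rotation would be the mirror image of this check.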
--Jimbo
On Thu, 8 Jan 2004, Jimmy Wales wrote:
- If my server costs $2000, and Jens costs $4000, then I don't need
redundant power supplies since I have redundant *servers*. With only a little bit of work to make sure that dead machines are taken out of service, we have a lot more capacity for the same money.
You don't have to buy brand-name servers to get redundant power supplies. You can get fairly cheap generic rackmount boxes with this feature.
- A remote controller would be sweet in our current facility. But
we are going to move to a facility with the following features -- (1) 24 hour a day remote "hands on" service (2) geographically close to me and my employees so that we can physically drive over (more on this later today or tomorrow after I tour one more facility).
You can also get fairly cheap video adapter cards with serial output, which gives you access to the BIOS, the boot procedure, etc.
-- Daniel
On Thu, 8 Jan 2004, Hr. Daniel Mikkelsen wrote:
- A remote controller would be sweet in our current facility. But
we are going to move to a facility with the following features -- (1) 24 hour a day remote "hands on" service (2) geographically close to me and my employees so that we can physically drive over (more on this later today or tomorrow after I tour one more facility).
You can also get fairly cheap video adapter cards with serial output, which gives you access to the BIOS, the boot procedure, etc.
Any server-class motherboard BIOS will have "BIOS Console Redirection". It is rather stupid and crappy, but it *does* work and doesn't cost $1000.
I'm using this on Tyan S2721-533's right now (the onboard video is disabled), and on my S2460 dual Athlon "desktop" too.
--Ricky
On Thu, Jan 08, 2004 at 04:47:16AM -0800, Jimmy Wales wrote:
One last thought...
Jens Frank wrote:
1 CPU at about 2GHz
2 GB RAM
2*36G 15kRPM SCSI disks
Remote controller (rILO, eRIC) for remote "lights out" management
1 unit, rack mountable
Redundant power supply
Jens priced this configuration at $4000. I was able to replicate the same performance configuration for $2500 or less. I think that for $2000, or half the price, I could match the performance of the box he describes.
However, my proposal includes neither remote controllers nor redundant power supplies. Here is my thinking; please think with me about whether this is right or not.
- If my server costs $2000, and Jens costs $4000, then I don't need
redundant power supplies since I have redundant *servers*. With only a little bit of work to make sure that dead machines are taken out of service, we have a lot more capacity for the same money.
If rack space is available, this is true.
- A remote controller would be sweet in our current facility. But
we are going to move to a facility with the following features -- (1) 24 hour a day remote "hands on" service (2) geographically close to me and my employees so that we can physically drive over (more on this later today or tomorrow after I tour one more facility).
This sounds good. I appreciate Jason's help, but it's something else to have remote hands on site.
Regards,
JeLuF
On Thu, 08 Jan 2004 13:13:58 +0100, Jens Frank wrote:
In the office, we ran a website using DNS RR. It got about 500,000 hits per day, and 52% of the requests hit the "first" server, 48% the second. We changed to HW load balancers recently since the number of web servers increased.
The default configuration for using Linux Director would be no load balancing at all and one machine on standby with Heartbeat. We could do the same with the squids, but it makes more sense to use both machines' RAM.
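As a toy illustration of the DNS round robin numbers above, assuming two placeholder addresses: the nameserver rotates the order of the A records between answers and each client simply takes the first one, which is why the split comes out close to 50/50.

```python
# Toy simulation of DNS round robin: the nameserver rotates the order
# of the A records between answers, and each client takes the first one.
# The addresses are placeholders, not the real server IPs.
from collections import Counter, deque

A_RECORDS = ["10.0.0.1", "10.0.0.2"]

def first_record_per_query(records, queries):
    """Yield the address each querying client would pick, rotating the
    record order after every answer (cyclic rrset ordering)."""
    order = deque(records)
    for _ in range(queries):
        yield order[0]
        order.rotate(-1)

if __name__ == "__main__":
    picks = Counter(first_record_per_query(A_RECORDS, 500000))
    print(picks)  # ~250000 hits each, i.e. close to the 52/48 split above
```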
2*36G 15kRPM SCSI Disks
If money were short I'd propose buying 4GB of RAM and seeing how much money is left for a fancy HD. In theory the HD should be used only for booting and logging. It might make sense to go single-processor 64-bit for the Squids- not for CPU, but for future memory expansion. I think RAID isn't necessary for the Squids if the logging is done to a separate disk.
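A back-of-the-envelope sketch of the "RAM over fancy disks" point, under the assumption that an average gzipped page is roughly 15 KB (an illustration value, not a measurement):

```python
# Rough cache-sizing arithmetic for a Squid box. The average gzipped
# page size is an assumed illustration value, not a measurement.
AVG_OBJECT_KB = 15        # assumed average compressed page
CACHE_MEM_GB = 4          # RAM that could be devoted to hot objects

objects_in_ram = CACHE_MEM_GB * 1024 * 1024 // AVG_OBJECT_KB
print("hot objects held in RAM: ~%d" % objects_in_ram)   # roughly 280,000
```

With that many hot objects served straight from memory, the disk is left mostly for booting, logging and cold objects, which is why faster spindles buy comparatively little.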
Regarding CPU size: http://web.archive.org/web/20030605225127/hermes.wwwcache.ja.net/servers/squ... states that a dual UltraSPARC at 170 MHz was able to process 1.8 million requests per day. This is roughly the current load of en:. A 2 GHz x86 CPU beats the Sparc by far. If we want to spend more money on the box, add memory.
Agree completely. My 2GHz Celeron does about 700 HTML requests/second at 54% CPU (the siege load tester taking up the rest). That's 60.2 million requests per day.
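For reference, the arithmetic behind those figures, using the rounded 700 requests/second (the exact measured rate gives the 60.2 million quoted above) and the roughly 1.8 million requests/day quoted as en:'s current load:

```python
# Capacity vs. load arithmetic using the figures quoted above.
REQ_PER_SEC = 700                 # rounded siege result on the 2GHz Celeron
CURRENT_REQ_PER_DAY = 1800000     # roughly en:'s current daily request count

capacity_per_day = REQ_PER_SEC * 86400        # seconds per day
print("capacity: %.1f million req/day" % (capacity_per_day / 1e6))        # ~60.5
print("headroom: ~%.0fx current load" % (capacity_per_day / CURRENT_REQ_PER_DAY))  # ~34x
```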
Gabriel Wicke
Gabriel Wicke wrote:
The default configuration for using Linux Director would be no load balancing at all and one machine on standby with Heartbeat. We could do the same with the squids, but it makes more sense to use both machines' RAM.
I'm not sure I'm following you here. If we have 2 Linux Directors, one on standby, but they aren't doing load balancing, then what are they for? I thought the director would be in charge of (iptables style) load balancing the squids?
for cpu, but for future memory expansion. I think Raid isn't necessary for the squid if the logging is done to a separate disk.
*nod* But RAID could be used in mirroring mode to increase reliability, presumably...
I'm listening to your thought that maybe going with a 64 bit CPU might make more sense just from the point of view of more RAM potential in the future. It's just very hard today to predict whether that's the right path to take or not.
One can only assume that RAM will be a lot cheaper in the future, and it sure would be nice to spend $1600 per machine to upgrade to 16GB. :-)
--Jimbo
On Thu, 08 Jan 2004 14:06:57 -0800, Jimmy Wales wrote:
Gabriel Wicke wrote:
The default configuration for using Linux Director would be no load balancing at all and one machine on standby with Heartbeat. We could do the same with the squids, but it makes more sense to use both machines' RAM.
I'm not sure I'm following you here. If we have 2 Linux Directors, one on standby, but they aren't doing load balancing, then what are they for? I thought the director would be in charge of (iptables style) load balancing the squids?
Yes, but the two Linux Directors wouldn't be load balanced, because each of them can deal with the full traffic. Same for the Squids, but it wouldn't hurt to use them both.
for cpu, but for future memory expansion. I think Raid isn't necessary for the squid if the logging is done to a separate disk.
*nod* But RAID could be used in mirroring mode to increase reliability, presumably...
Yes, I'm not against RAID at all- I just wanted to stress that 4GB of RAM might be more important than the most advanced five-disc RAID. If we have enough money, then fine. The Squids will do all the logging, so the disk will be accessed regularly.
I'm listening to your thought that maybe going with a 64 bit CPU might make more sense just from the point of view of more RAM potential in the future. It's just very hard today to predict whether that's the right path to take or not.
I'm not sure either. 4GB will carry us a long way, I guess (especially with the 'join' configuration of the Squids), and accessing a few bigger files from disk won't be a performance problem. By the time a memory upgrade would be beneficial, a full machine with that amount of RAM (possibly of a different kind by then) might cost no more. So x86 with 4GB might be fine as well- CPU-wise anyway.
One can only assume that RAM will be a lot cheaper in the future, and it sure would be nice to spend $1600 per machine to upgrade to 16GB.
The question is whether this is really needed that soon- if the load really explodes we could start adding mirrors, which would also save some bandwidth. Once the envisioned setup is running, the next step is mainly a DNS thing like SuperSparrow's. Squid 3 is scheduled for release this spring- with ESI this can lower the load drastically again.
Gabriel Wicke
Gabriel Wicke wrote:
On Mon, 05 Jan 2004 08:40:14 -0800, Jimmy Wales wrote:
In particular, I think that Gabriel's "artist's impression" of load balancer in front of squid in front of apache in front of mysql is the right thing to do.
Hi Jimmy,
I've looked at Squid's failover and load balancing capabilities a bit. The short story: we could do without any separate load balancer, with slightly better performance and HA.
                Internet
                   |           Bind DNS round robin
            _______|_______    (multiple A records)
           |       |       |
         Squid   Squid   Squid      Tier 1 (connected with heartbeat)
           |_______|_______|
           |       |       |
        Apache  Apache  Apache
           |_______|_______|
               |       |
             MySQL   MySQL
The details about the Squid setup are at http://www.aulinx.de/oss/code/wikipedia/.
This would free resources to throw on MySQL.
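As a rough illustration of the miss path in the diagram above (not how Squid itself is configured; in a real deployment this role would be played by Squid's own parent/peer selection), each tier-1 Squid can hand a cache miss to any healthy Apache, which is why no separate load balancer layer is strictly required. The hostnames are placeholders.

```python
# Sketch of the miss path in the proposed setup: a front-end Squid picks
# one of several Apache backends in round-robin order, skipping dead
# ones. Hostnames are placeholders; a real deployment would express this
# in Squid's own peer configuration rather than in a script.
import socket

APACHES = ["apache1.internal", "apache2.internal", "apache3.internal"]
_next = 0   # rotating start index so misses spread across the backends

def tcp_check(host, port=80, timeout=2):
    """Crude liveness test: can we open a TCP connection to the backend?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_backend(backends=APACHES, is_up=tcp_check):
    """Return the next healthy backend in round-robin order, or None."""
    global _next
    n = len(backends)
    for i in range(n):
        host = backends[(_next + i) % n]
        if is_up(host):
            _next = (_next + i + 1) % n
            return host
    return None

if __name__ == "__main__":
    print("forwarding miss to", pick_backend())
```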
Gabriel Wicke
The Apache and MySQL systems could all exist on the same switched network (even though the data flow would remain as above), making the system a bit simpler to build and administer, while still keeping the advantage of having the Squid machines as a full proxy firewall for the back-end machines:
                Internet
                   |           Bind DNS round robin
            _______|_______    (multiple A records)
           |       |       |
         Squid   Squid   Squid      Tier 1 (with heartbeat)
           |_______|_______|_____________________________
           |       |       |          |        |
        Apache  Apache  Apache      MySQL    MySQL    ...
Switch speed should not be an issue: an Ethernet switched network will scale to Gbit speeds, at which time the Distributed Squid Borg would probably be a natural follow-on from Gabriel's design.
-- Neil