This request goes out _especially_ to the 'hands dirty' developers who are intricately familiar with where our hardware bottlenecks are most likely to be. But anyone with solid, non-speculative ideas is welcome to comment!
As Jason said, he bought another gig of ram -- we are planning to basically max out the machines if that will help. That hard drive that didn't work, well, we'll get the right one soon, and that'll help significantly I think. Also, if I'm not mistaken, one of these machines has a dual processor board, but only a single processor, is that right, Jason? So that's an easy upgrade, once we find and buy the right part.
But, imagine this: suppose I were shopping at http://www.penguincomputing.com/, as I always do, and I wanted to spend around $3000 on new hardware for wikipedia. Presumably, this could either be one new kickass machine, or 2 new very nice frontend webservers, or whatever else we think we could best use.
Given that budget, what should I buy that's most likely to give us the best bang for the buck?
And then, of course, once we have settled on the general type of hardware (i.e. 2 medium machines or 1 kickass machine) that will be most useful to us, we could also extend our shopping to other suppliers. I have used penguincomputing many times and I've been very happy, but I'd go elsewhere if there was good reason.
--Jimbo
On Tue, Aug 12, 2003 at 05:25:26AM -0700, Jimmy Wales wrote:
This request goes out _especially_ to the 'hands dirty' developers who are intricately familiar with where our hardware bottlenecks are most likely to be. But anyone with solid, non-speculative ideas is welcome to comment!
As Jason said, he bought another gig of ram -- we are planning to basically max out the machines if that will help. That hard drive that didn't work, well, we'll get the right one soon, and that'll help significantly I think. Also, if I'm not mistaken, one of these machines has a dual processor board, but only a single processor, is that right, Jason? So that's an easy upgrade, once we find and buy the right part.
But, imagine this: suppose I were shopping at http://www.penguincomputing.com/, as I always do, and I wanted to spend around $3000 on new hardware for wikipedia. Presumably, this could either be one new kickass machine, or 2 new very nice frontend webservers, or whatever else we think we could best use.
Given that budget, what should I buy that's most likely to give us the best bang for the buck?
And then, of course, once we have settled on the general type of hardware (i.e. 2 medium machines or 1 kickass machine) that will be most useful to us, we could also extend our shopping to other suppliers. I have used penguincomputing many times and I've been very happy, but I'd go elsewhere if there was good reason.
--Jimbo
Other than maxing the RAM in pliny, and adding a second processor to larousse, I don't think spending $3000 on hardware is a wise investment at this point. The hardware that it can buy would not provide a significant enough speed improvement for the cost.
However, if you are dead set on spending your money, this is how I see the greatest benefit:
Get a cheapish 1U dual CPU Athlon/Xeon system. It doesn't need to have a lot of memory or speedy disks. It does need reasonable processors, since it will be crunching on PHP all day. Make this our front end web server, and move the web serving off of both pliny and larousse. Don't bother upgrading larousse - instead, make larousse our mail server / backup server. Upgrade pliny (if possible) to 4GB of memory and some faster RAIDed hard disks. If within budget, you might want to upgrade pliny past the Athlon MP 1800s (I think) that it has now (provided this is possible in our servers).
Any counter-opinions out there?
Estimates:
4x 1GB PC2100 for pliny: $800
2x Athlon MP 2.08 (2800) Barton for pliny: $500
2x ?GB 10k RPM SCSI disks: ?
Dual rackmount 1U web server: ?
Nick Reinking wrote:
On Tue, Aug 12, 2003 at 05:25:26AM -0700, Jimmy Wales wrote:
This request goes out _especially_ to the 'hands dirty' developers who are intricately familiar with where our hardware bottlenecks are most likely to be. But anyone with solid, non-speculative ideas is welcome to comment!
If I remember it right it should be:
pliny => database and all wikis but en and de ...?
larousse => webserver (php) for en and de

If I'm wrong, adapt the following or correct me by writing to this list ;)
I think Nick is right about getting a fast 'php' (webserver) frontend for all the languages still running on pliny. And giving the database enough memory and a fast RAID (striping is usually the version with fast read access, if I remember, feel free to correct me) will help to get a database server that only does mysql, hopefully at its full speed and with all its memory.
We can in the future easily take a look at the load of the hosts and see what is slowing us down, the database server or the frontend hosts. And we can move languages from one frontend to the other to optimize the load (not all big wikis on one host etc.).
Thomas Corell wrote:
I think Nick is right about getting a fast 'php' (webserver) frontend for all the languages still running on pliny. And giving the database enough memory and a fast RAID (striping is usually the version with fast read access, if I remember, feel free to correct me) will help to get a database server that only does mysql, hopefully at its full speed and with all its memory.
How much priority should I give to RAID versus RAM for the db server, if there's a tradeoff? Does anyone really know the answer to that?
--Jimbo
On Tue, Aug 12, 2003 at 01:40:15PM -0700, Jimmy Wales wrote:
Thomas Corell wrote:
I think Nick is right about getting a fast 'php' (webserver) frontend for all the languages still running on pliny. And giving the database enough memory and a fast RAID (striping is usually the version with fast read access, if I remember, feel free to correct me) will help to get a database server that only does mysql, hopefully at its full speed and with all its memory.
How much priority should I give to RAID versus RAM for the db server, if there's a tradeoff? Does anyone really know the answer to that?
--Jimbo
RAM is more important than fast disks, but you really are going to want to have both. We're definitely not going to want to stripe these disks if we want reliability. Having two disks in RAID 0 is going to double our chances of losing everything in a disk failure. We're going to probably want RAID 5, or RAID 10 at the worst.
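The "double our chances" point can be checked with a little arithmetic. This is only a sketch: it assumes disks fail independently, and the 3% yearly failure rate is a made-up illustration, not a figure from this thread. The function names are hypothetical.

```python
def stripe_loss_probability(p_disk, n_disks):
    """RAID 0: the array is lost if ANY one of its disks fails."""
    return 1.0 - (1.0 - p_disk) ** n_disks


def mirror_loss_probability(p_disk, n_disks):
    """Simple mirror: data is lost only if ALL copies fail.

    Ignores the rebuild window, so treat it as a rough lower bound.
    """
    return p_disk ** n_disks


# Assumed per-disk failure probability over some period (illustrative only).
P_DISK = 0.03

single = P_DISK
striped = stripe_loss_probability(P_DISK, 2)
mirrored = mirror_loss_probability(P_DISK, 2)

print("single disk : %.4f" % single)
print("2-disk RAID 0: %.4f (%.2fx a single disk)" % (striped, striped / single))
print("2-disk mirror: %.4f" % mirrored)
```

For small failure rates the striped figure comes out just under twice the single-disk figure, which is where "double" comes from.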
Also, the reason why I gave my design is it is the cheapest way to get the most performance out of three machines. The other way (buying a machine to straight replace pliny) is certainly less work, but with considerably less performance benefit (unless you're willing to up the CPUs to dual Athlon 2800+ CPUs for an extra $314). RAID would also help, but that's going to cost an additional $1200 or so at the very least. In reality, we are likely going to need twice that $3000 to get a setup that is really going to perform appreciably better than our current setup.
Nick Reinking wrote:
In reality, we are likely going to need twice that $3000 to get a setup that is really going to perform appreciably better than our current setup.
I highly doubt that. But if that's true, then we should really be thinking of what we buy today as only an interim measure, to be followed in a month with more stuff bought by Paypal donations.
For $6,000 we could buy 3 very decent webserver-only machines and move everything off the existing DB machine and use load balancing. I *know* that would rock, if the DB machine was pumped up to 4 gig.
--Jimbo
Jimmy Wales wrote:
How much priority should I give to RAID versus RAM for the db server, if there's a tradeoff? Does anyone really know the answer to that?
From all I've heard about mysql, it likes to use RAM. So, the primary target is RAM and the secondary is RAID ( => 3 disks: 1 for root, logs etc. and 2 striping disks for DB-Data). That's the information some people gave after some discussion in a forum of a great German computer magazine (c't / heise).
So, if possible, the machine to be bought, if it will be the new DB-Server, should have at least 3 HD slots.
On Tue, 12 Aug 2003 23:08:15 +0200, Thomas Corell corell@corell.dyndns.org gave utterance to the following:
Jimmy Wales wrote:
How much priority should I give to RAID versus RAM for the db server, if there's a tradeoff? Does anyone really know the answer to that?
From all I've heard about mysql, it likes to use RAM. So, the primary target is RAM and the secondary is RAID ( => 3 disks: 1 for root, logs etc. and 2 striping disks for DB-Data). That's the information some people gave after some discussion in a forum of a great German computer magazine (c't / heise).
So, if possible, the machine to be bought, if it will be the new DB-Server, should have at least 3 HD slots.
One question to throw into the mix - If we cache static pages for anonymous (no skin preference) users (updating them when edited) rather than pulling them afresh from the database, on which machine should the static pages reside - the web server, the db box or a dedicated box?
On Wed, 13 Aug 2003, Richard Grevers wrote:
One question to throw into the mix - If we cache static pages for anonymous (no skin preference) users (updating them when edited) rather than pulling them afresh from the database, on which machine should the static pages reside - the web server, the db box or a dedicated box?
At present those are on the web server, in the local filesystem.
The (compressed) cache for the English Wikipedia presently weighs in at 790,348 kb.
-- brion vibber (brion @ pobox.com)
Richard Grevers wrote:
One question to throw into the mix - If we cache static pages for anonymous (no skin preference) users (updating them when edited) rather than pulling them afresh from the database, on which machine should the static pages
I tried a middle way for susning.nu. The original UseModWiki source code (0.92) has an option for HTML caching, but the way it is implemented, when any page is edited the entire cache is cleared, which doesn't really scale beyond 1000 pages.
So in my case, subroutine WikiToHTML, instead of calling PageOrEditLink for every occurrence of FreeLinkPattern, calls a new subroutine PageOrBracketLink, which returns an HTML <a href> link if the page exists but leaves the brackets in place if it doesn't exist. This half-cooked format is stored in the cache. The only wiki markup that remains to be evaluated on every access is the bracket links to pages that didn't exist when the page was last edited.
This assumes that pages are never deleted, which is true in my case.
This works fine, at least as fine as not using caching. I cannot really tell if it is any faster, but I don't think it's slower.
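The half-cooked idea above can be sketched roughly as follows. This is not the UseModWiki or susning.nu code (that is Perl); the `[[Page]]` link syntax, the `/wiki/` and `/edit/` URL layout, and the names `half_cook` and `finish` are all assumptions made for illustration.

```python
import html
import re
from urllib.parse import quote

# Assumed free-link syntax: [[Page Name]]
FREE_LINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def _page_link(title):
    """HTML link to an existing page (hypothetical URL layout)."""
    return '<a href="/wiki/%s">%s</a>' % (quote(title), html.escape(title))


def half_cook(text, existing_pages):
    """Run at save time; the result is what goes into the cache.

    Links to pages that exist become <a href>; the rest keep their brackets.
    """
    def resolve(match):
        title = match.group(1)
        return _page_link(title) if title in existing_pages else match.group(0)
    return FREE_LINK.sub(resolve, text)


def finish(cached, existing_pages):
    """Run on every view; only the leftover bracket links need a lookup."""
    def resolve(match):
        title = match.group(1)
        if title in existing_pages:
            return _page_link(title)
        # Page still missing: show the title plus a '?' edit link.
        return '%s<a href="/edit/%s">?</a>' % (html.escape(title), quote(title))
    return FREE_LINK.sub(resolve, cached)


pages = {"Sweden"}
cached = half_cook("[[Sweden]] and [[Norway]]", pages)
pages.add("Norway")           # someone creates the missing page later
print(finish(cached, pages))  # the old cache entry picks up the new page
```

Because resolved links are never re-checked, this only stays correct if pages are never deleted, which is the assumption stated above.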
On Tue, Aug 12, 2003 at 05:25:26AM -0700, Jimmy Wales wrote:
This request goes out _especially_ to the 'hands dirty' developers who are intricately familiar with where our hardware bottlenecks are most likely to be. But anyone with solid, non-speculative ideas is welcome to comment!
As Jason said, he bought another gig of ram -- we are planning to basically max out the machines if that will help. That hard drive that didn't work, well, we'll get the right one soon, and that'll help significantly I think. Also, if I'm not mistaken, one of these machines has a dual processor board, but only a single processor, is that right, Jason? So that's an easy upgrade, once we find and buy the right part.
But, imagine this: suppose I were shopping at http://www.penguincomputing.com/, as I always do, and I wanted to spend around $3000 on new hardware for wikipedia. Presumably, this could either be one new kickass machine, or 2 new very nice frontend webservers, or whatever else we think we could best use.
Given that budget, what should I buy that's most likely to give us the best bang for the buck?
And then, of course, once we have settled on the general type of hardware (i.e. 2 medium machines or 1 kickass machine) that will be most useful to us, we could also extend our shopping to other suppliers. I have used penguincomputing many times and I've been very happy, but I'd go elsewhere if there was good reason.
First a disclaimer, as nobody has done any serious measurement, what we can do is no more than an educated guess.
It seems that adding much more RAM (the fastest kind of RAM you can find) to all machines would help most. So that all the databases fit in RAM, and there's plenty space left for tcp/ip connection buffers, per-process PHP memory etc.
The current database design is very centralized, so a single kickass machine would probably help more.
Well, I do agree that a single kickass machine would probably help the most, but you really can't get something that kickass for $3000, especially in a rackmountable format. I think it will be much cheaper and more cost effective to upgrade pliny, if possible. $3000 will just not buy you a dual Athlon 2800/4GB DDR/2x 72GB SCSI disk rackmountable machine.
Nick Reinking wrote:
Well, I do agree that a single kickass machine would probably help the most, you really can't get something that kickass for $3000, especially in a rackmountable format. I think it will be much cheaper and more cost effective upgrading pliny, if possible. $3000 will just not buy you a dual Athlon 2800/4GB DDR/2x 72GB SCSI disk rackmountable machine.
No, but if you're willing to settle for a dual Athlon 2200/4GB DDR/2x 36GB SCSI disk rackmountable machine, the price is $3276 from PenguinComputing, and likely less elsewhere.
So, that's pretty good, huh?
On Tue, Aug 12, 2003 at 09:34:24AM -0700, Jimmy Wales wrote:
Nick Reinking wrote:
Well, I do agree that a single kickass machine would probably help the most, you really can't get something that kickass for $3000, especially in a rackmountable format. I think it will be much cheaper and more cost effective upgrading pliny, if possible. $3000 will just not buy you a dual Athlon 2800/4GB DDR/2x 72GB SCSI disk rackmountable machine.
No, but if you're willing to settle for a dual Athlon 2200/4GB DDR/2x 36GB SCSI disk rackmountable machine, the price is $3276 from PenguinComputing, and likely less elsewhere.
So, that's pretty good, huh?
It is reasonable, and you can't get much better for cheaper elsewhere, but this is what you're really looking at (I'm assuming you're talking about the Altus 140 server):

Dual Athlon 2200+ CPUs
+ 4GB PC2100 DDR memory
+ 2x 36GB SCA hard drives
- No warranty
- No rackmount kit
- No gigabit ethernet (this would be very nice...)
This runs $3280 for me. I believe there will also be sales tax added to this price (for being in CA). I'm not sure if it is worth that much money for what adds up to being 20% faster processors, 2GB of memory extra, and an extra 36GB hard drive.
Here is what I think the best way to go would be:

Relion 130 server with:
2x Xeon 2.4GHz
512MB PC2100 (2x256MB)
+ 2x 40GB IDE hard drives
+ Free gigabit ethernet
- No warranty
- No rackmount kit
This server runs $1954, and I think would make a great replacement for larousse. Now, assuming that pliny can be upgraded, I'd get (prices from NewEgg):
2x Athlon MP 2800 Barton: $500
4x 1GB PC2100 ECC: $1200
36GB 15k RPM SCSI hard disk: $300
Gigabit ethernet card: $50
Now, assuming pliny currently has ECC memory in it, we move the 2GB currently in it to the new machine.
That way, we end up with:
pliny (database server only):
  2x Athlon MP 2800
  4GB PC2100 DDR
  2x 36GB SCSI disks (1x 15000, 1x 10000)
  Gigabit ethernet

larousse (web server only):
  2x Xeon 2.4GHz
  2GB PC2100 DDR
  2x 40GB IDE disks
  Gigabit ethernet

newserver (mail + backup + misc):
  1x PIII 866MHz
  2GB PC133
  1x 36GB SCSI disk
  10/100 ethernet
I'd then directly connect pliny and larousse over gigabit ethernet with a crossover cable. It's more expensive than $4000, but I think this configuration would at least make a significant performance difference.
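For reference, the quoted prices in this proposal add up as follows. The sketch only uses figures from this message and leaves out tax, shipping, and the crossover cable; the variable names are just labels.

```python
# Prices quoted above (Penguin Computing for the server, NewEgg for the parts).
new_web_server = 1954  # Relion 130 as configured above

pliny_upgrades = {
    "2x Athlon MP 2800 Barton": 500,
    "4x 1GB PC2100 ECC": 1200,
    "36GB 15k RPM SCSI hard disk": 300,
    "Gigabit ethernet card": 50,
}

upgrade_total = sum(pliny_upgrades.values())
total = new_web_server + upgrade_total

print("pliny upgrades: $%d" % upgrade_total)
print("whole proposal: $%d" % total)
print("over the $3000 target by: $%d" % (total - 3000))
```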
Nick Reinking wrote:
[..]
newserver (mail + backup + misc):
  1x PIII 866MHz
  2GB PC133
  1x 36GB SCSI disk
  10/100 ethernet
I'd then directly connect pliny and larousse over gigabit ethernet with a crossover cable. It's more expensive than $4000, but I think this configuration would at least make a significant performance difference.
Looks very nice. And you can move some very small wikipedias to the 'newserver', which can handle them without problems, IMHO.
Somebody clued me in to http://www.pricewatch.com/ -- supposedly a nexus for OEM overstock liquidators -- they seem to have decent prices. (In case anybody didn't know.)
-S-
_______________________________________________
Wikitech-l mailing list Wikitech-l@wikipedia.org
http://mail.wikipedia.org/mailman/listinfo/wikitech-l
Jimmy Wales wrote:
As Jason said, he bought another gig of ram -- we are planning to basically max out the machines if that will help. That hard drive that didn't work, well, we'll get the right one soon, and that'll help significantly I think.
Oops... I forgot to tell everyone. I went back the next day and got the right hard drive. 36 Gig is installed now. I think Brion set up the partitions/filesystem(s) as well.
Jason, could you post for everyone all the current specs on both machines, as best as we know them, as well as what each machine is doing? This would probably help us to plan for a reconfiguration in the event of a new server (or two) purchase.
--Jimbo
Jimmy Wales wrote:
Jason, could you post for everyone all the current specs on both machines, as best as we know them, as well as what each machine is doing? This would probably help us to plan for a reconfiguration in the event of a new server (or two) purchase.
Pliny - Running MySQL, Apache 1.3.27 and kernel 2.4.20 and the non-English Wikis
Tyan S2462 motherboard
2 Gig RAM (4 512Meg modules of ECC Reg. 266 MHz PC2100 DDR RAM in 4 slots)
2 Processor AMD Athlon MP 1800+/266FSB
Adaptec AIC-7899P SCSI U160 controller
1 IBM 36 GB SCSI Ultra 160 / 10 K RPM
1 IBM 36 GB SCSI Ultra 320 / 10 K RPM (running as U160, of course)
Dual onboard 3Com 10/100 (3C982) adapters
Larousse - Running Apache 1.3.27 and kernel 2.4.20 and the English Wiki
TYAN S2518UGN motherboard
2Gig RAM (4 512Meg modules of ECC Reg. 133 MHz PC133 SDRAM in 4 slots)
Pentium III 866MHz processor w/256K cache
Adaptec AIC-7899 SCSI U160 controller
1 Quantum Atlas 18 GB SCSI U160 Drive, currently unused and known to be troublesome (It's awaiting some testing which I have put off due to the high load on the system)
1 IBM 36 GB SCSI U160 Drive
Dual onboard Intel 10/100 (eepro) adapters
Just a few notes:
Pliny crashes roughly once a month. This must stop.
Not only are the outage periods annoying, but every crash risks corrupting the database. Hardware problems have been vaguely suspected, but we've never really had the chance to swap pieces out until it stops crashing.
At this point I would tend to suggest that we:
* replace pliny as the database server with a slightly beefier machine (eg, the Altus 130 or 140, with the max of 4 gig ram)
* take the old hard drive _out_ of pliny and set it aside, as it's been suspected of being a problem
* reassign pliny as the new web server front end
* reassign larousse as the backup server; it could hold both a replicated database and take over web server duties in a crash emergency by taking over the other machine's IP address
The present larousse is overloaded (typical load average around 8-10 during daytime US, or up to 30 if some ass decides to spider the site for every printable page), and a lot of pliny's CPU is taken by its additional web duties (pushing load to around 4).
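One rough way to compare these load figures is to divide them by the number of processors in each box (one Pentium III in larousse, two Athlon MPs in pliny, per the specs posted earlier in the thread). This is a sketch only: load average also counts processes waiting on disk, so it is not a pure CPU measure, and `load_per_cpu` is a hypothetical helper name.

```python
def load_per_cpu(load_average, cpu_count):
    """Runnable processes per processor; well above 1.0 suggests a backlog."""
    return load_average / cpu_count


# (load average quoted above, processors from the posted specs)
observations = [
    ("larousse, daytime low", 8, 1),
    ("larousse, daytime high", 10, 1),
    ("larousse, being spidered", 30, 1),
    ("pliny, with web duties", 4, 2),
]

for label, load, cpus in observations:
    print("%-26s %5.1f per CPU" % (label, load_per_cpu(load, cpus)))
```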
More RAM for the database should help it.
More CPU for the web server would definitely help it.
More RAM for the web server should help it once we start making more use of in-memory caching, which should decrease load on the database (and decrease the amount of stuff that it has to lock, leading to faster response times on reads while other operations are running).
-- brion vibber (brion @ pobox.com)
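To illustrate why in-memory caching takes load off the database, here is a minimal read-through cache with invalidation on edit. It is a sketch under assumptions, not how the wiki software does it: `PageCache`, `render_from_db`, and the `db_hits` counter are hypothetical names, and a plain dict stands in for whatever shared cache would really be used.

```python
class PageCache:
    """Read-through cache: hit the database only when a page is not in RAM."""

    def __init__(self, render_from_db):
        self._render_from_db = render_from_db  # expensive call we want to avoid
        self._pages = {}                       # title -> rendered HTML
        self.db_hits = 0                       # how often we fell through

    def get(self, title):
        if title not in self._pages:
            self.db_hits += 1
            self._pages[title] = self._render_from_db(title)
        return self._pages[title]

    def invalidate(self, title):
        """Call when a page is edited so the next view re-renders it."""
        self._pages.pop(title, None)


# Stand-in for a database query plus wikitext rendering.
cache = PageCache(lambda title: "<h1>%s</h1>" % title)

cache.get("Pliny")         # first view goes to the database
cache.get("Pliny")         # second view is served from memory
cache.invalidate("Pliny")  # an edit drops the stale copy
cache.get("Pliny")         # next view goes to the database again

print("database hits:", cache.db_hits)
```

Every view served from memory is one less read the database has to handle while other operations are running.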
Brion Vibber wrote:
- replace pliny as the database server with a slightly beefier machine (eg, the Altus 130 or 140, with the max of 4 gig ram)
- take the old hard drive _out_ of pliny and set it aside, as it's been suspected of being a problem
- reassign pliny as the new web server front end
- reassign larousse as the backup server; it could hold both a replicated database and take over web server duties in a crash emergency by taking over the other machine's IP address
One thing I like about this plan is that most of it doesn't involve changing the way things are working.
pliny (dual Athlon 1800) can surely do more than larousse (Pentium III), so that will likely solve the overloaded webserver problem. Going to a slightly beefier db server will probably help a lot, too.
Under your setup, presumably the non-English wiki webserving could be handled by either pliny (dual Athlon 1800) or by the backup (Pentium III) server? If pliny can do it, that's great, but if not, well, the backup server should be plenty enough to handle that for now.
The present larousse is overloaded (typical load average around 8-10 during daytime US, or up to 30 if some ass decides to spider the site for every printable page), and a lot of pliny's CPU is taken by its additional web duties (pushing load to around 4).
So in addition to beefing up the db server, we also want it not to be doing additional web duties at all, right?
More RAM for the database should help it.
More CPU for the web server would definitely help it.
More RAM for the web server should help it once we start making more use of in-memory caching, which should decrease load on the database (and decrease the amount of stuff that it has to lock, leading to faster response times on reads while other operations are running).
Excellent.
I notice that the Penguin Dual MP Altus is $1700; with 4GB it is another $1500.

Pricewatch.com lists:
$233 Athlon MP 2800 ...x2=$466
$403 Dual MP server (rack)
$170 1GB DDR memory ...x4=$680
just an idea. -S-
I hear you. Once we settle on what's the best bang for the buck given the options at Penguin (which helps focus the discussion), then we can also shop around for the best possible deal.
Steve Vertigo wrote:
I notice that the Penguin Dual MP Altus is $1700; with 4GB it is another $1500.

Pricewatch.com lists:
$233 Athlon MP 2800 ...x2=$466
$403 Dual MP server (rack)
$170 1GB DDR memory ...x4=$680
just an idea. -S-
Brion Vibber wrote:
(eg, the Altus 130 or 140, with the max of 4 gig ram)
Altus 130 is Dual Channel ATA-100 (2 drive bays)
Altus 140 is U160 SCSI (3 drive bays)
I have *always* used SCSI for servers, but is this just superstition?
I'm narrowing it down to these three:
Altus 130 Dual Athlon 2800+, 4 gig DDR, 2 80 gig EIDE drives $3006
Altus 140 Dual Athlon 2800+, 4 gig DDR, 2 36 gig SCSI drives $3590
Altus 1000 Dual Opteron 240 (sweet!), 4 gig DDR, 2 80 gig EIDE drives $3526
As you can see, within the budget range of potential tradeoffs, it looks like SCSI costs $500 more for half the disk space.
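Working the SCSI versus EIDE tradeoff out from the two Athlon quotes above gives the figures below. This is only a sketch using the listed prices and drive sizes; it says nothing about speed or reliability, and the names are just labels.

```python
# (price in dollars, total disk space in GB) from the quotes above
altus_130_eide = (3006, 2 * 80)
altus_140_scsi = (3590, 2 * 36)

scsi_premium = altus_140_scsi[0] - altus_130_eide[0]
capacity_ratio = altus_140_scsi[1] / altus_130_eide[1]

print("SCSI premium: $%d" % scsi_premium)
print("SCSI capacity: %d GB vs %d GB (%.0f%% of the EIDE config)"
      % (altus_140_scsi[1], altus_130_eide[1], capacity_ratio * 100))
```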
On Tue, Aug 12, 2003 at 01:56:19PM -0700, Jimmy Wales wrote:
Brion Vibber wrote:
(eg, the Altus 130 or 140, with the max of 4 gig ram)
Altus 130 is Dual Channel ATA-100 (2 drive bays)
Altus 140 is U160 SCSI (3 drive bays)
I have *always* used SCSI for servers, but is this just superstition?
I'm narrowing it down to these three:
Altus 130 Dual Athlon 2800+, 4 gig DDR, 2 80 gig EIDE drives $3006
Altus 140 Dual Athlon 2800+, 4 gig DDR, 2 36 gig SCSI drives $3590
Altus 1000 Dual Opteron 240 (sweet!), 4 gig DDR, 2 80 gig EIDE drives $3526
As you can see, within the budget range of potential tradeoffs, it looks like SCSI costs $500 more for half the disk space.
No, it isn't superstition if you want to have good, reliable, fast RAID on a server. There are IDE solutions out there but the vast majority of them, well, suck. If you are willing to spend more than your $3000 target, it might be wise to get the 1000E w/ 4GB (4x1GB) + 2x 120GB IDE disks (these are newer disks than the 80GB). That gives us considerable upgrade potential in the future (as AMD phases out Athlon MPs), as well as allowing us to move up to 8GB of memory in the future. If worse comes to worst, you can always add a PCI SCSI card and get some external SCSI storage in the future.
Or, if you're willing to mess around, go to a single 40GB IDE disk on the 1000E and buy a couple of Western Digital 10k RPM Raptor hard drives from another vendor. They are SCSI size (36GB), SCSI speed (10k RPM), but for a lot lower price (only $136 at newegg.com); which is almost as cheap as the much slower, older disks from Penguin Computing.
Jimmy-
But, imagine this: suppose I were shopping at http://www.penguincomputing.com/, as I always do, and I wanted to spend around $3000 on new hardware for wikipedia.
Thank you for continuing to support Wiki[mp]edia with substantial amounts of your own money. I agree with Brion's proposal -- our current main problems are the slowness of Larousse and the unreliability of Pliny (the cause of the latter, however, remains unknown).
Beyond investments in hardware, I hope we will also consider hiring a crack MySQL hacker if/when we can afford it, to whack our database structure, DB server setup and SQL queries into shape. There's still room for optimization, and it would be nice to get some of the Special Pages back. On the PHP front, stuff like a new parser (much needed) or memcached integration (huge potential benefits) would come to fruition faster if we could compensate MediaWiki developers at least to a certain extent. This should be part of the Wikimedia planning.
Regards,
Erik
wikitech-l@lists.wikimedia.org