On Jan 12, 2004, at 16:07, Nick Hill wrote:
> From what I have gathered, the cost (the limiting factor to performance) is the delay of seeking fine-grained data. Either this seek load will need to be spread across many mechanical devices so that the work is not unduly duplicated, or the fine-grained data will have to be stored in solid-state storage so that it can be sought quickly.
As a reminder, Geoffrin (the opteron box) is *perfectly fine* at this and handles the database load admirably. It's just out of service and replaced by a box (Ursula) with a hideously slow drive at the moment because that's what was available to get back online with.
The medium-term plan is simply to get Geoffrin back online, and to get _some_ machine with decently fast drives to serve as a replicated hot backup.
The ideas about squid caches etc. are not about lightening the load on the database server (which they wouldn't really do, except insofar as they might cache more than our present on-web-server caching does), but about lightening and spreading out the load on the web servers. Squid caches will *not* help the immediate question here to any significant degree; in any case, no more than making slight alterations to the present caching code to avoid checking timestamps in some cases would.
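As a rough illustration of the timestamp-check alteration mentioned above, here is a minimal sketch (not our actual caching code; the names, the revalidation interval, and the `render`/`page_touched` callbacks are all invented for the example) of a page cache that skips asking the database for a page's last-modified timestamp when the cached copy was validated recently enough:

```python
import time

CHECK_INTERVAL = 300  # seconds between revalidations (hypothetical tuning value)

class Entry:
    def __init__(self, html, now):
        self.html = html
        self.stored_at = now   # when the page was rendered into the cache
        self.checked_at = now  # when we last compared it against the database

cache = {}

def fetch(key, render, page_touched):
    """render() rebuilds the page from the database; page_touched()
    asks the database when the page last changed -- the round trip
    we want to avoid making on every hit."""
    now = time.time()
    entry = cache.get(key)
    if entry is not None:
        if now - entry.checked_at < CHECK_INTERVAL:
            return entry.html            # serve without touching the DB at all
        if page_touched() <= entry.stored_at:
            entry.checked_at = now       # still current; remember the check
            return entry.html
    cache[key] = Entry(render(), now)    # (re)render and cache
    return cache[key].html
```

The point of the sketch is only that repeated hits within the interval never reach the database; a squid in front of the web servers would spare the web servers the same way, not the database.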
Our present alternatives for database duty are:
* Pliny, which has done it in the past. It is exhibiting intermittent errors on its primary drive, and crashed a couple of times when it ran the database again in late December, which is why we moved the load back off it onto Ursula.
* Carol (currently idle) with a SCSI drive that's too small for the whole database.
* Susan (currently idle) with another IDE drive that's likely not the fastest.
Unless somebody's got a clearer suggestion, I'll be following JeLuF's advice and in a few hours moving some of the more heavily trafficked European languages to one of these spare boxes to attempt to split the load on Ursula's poor drive.
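For the curious, the split amounts to routing each wiki's database to a particular server. A minimal sketch of the idea (the server assignments and database names here are invented examples, not the actual plan or our actual configuration mechanism):

```python
# Hypothetical mapping from wiki database name to database server.
# Anything not listed stays on the current server (Ursula); a few of
# the heavier European wikis are pointed at a spare box as examples.
DB_SERVERS = {
    "default": "ursula",
    "dewiki": "susan",
    "frwiki": "susan",
}

def server_for(dbname):
    """Return the database server that should handle this wiki."""
    return DB_SERVERS.get(dbname, DB_SERVERS["default"])
```

Each web server would consult such a map when opening its database connection, so the seek load lands on two spindles instead of one.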
-- brion vibber (brion @ pobox.com)