user_Jamesday wrote:
The Jeremy Zawodny book?:)
The very one :-)
Temporary table use by server, per second:
total on disk
Ariel 1.06 0.204
Suda 0.164 0.162
Bacon 0.13 0.13 (all)
So, temporary tables which use the disks aren't really that frequent.
Suda uses its smaller drive pair for temporary and log tables. The database servers have 6
hot swap bays. Taking one drive for temporary tables would cost two drives in the main
array. I don't think it's worth doing. I do recall some discussion with Silicon
Mechanics about an internal bay which can hold a SATA drive and adding a 36GB 10K RPM SATA
drive for temporary tables might be worth doing. Or for Suda, the 36GB drive not currently
in use. But they are infrequent enough that I'm not really worried.
OK.
All of the database servers have write buffering turned
on and it makes a significant difference. All are ordered with battery backup. Possible
exception is Bacon where I need to check to see if it has the battery and ask Jimbo to
order that module (about US$100) if not.
Cool.
The indexes are fair to good for most queries but not
good enough for the special pages. I'm deliberately waiting on suggesting big changes
there until after the next release of the software. Getting the schema changes for cur/old
is far more important and I don't want to distract people from them. See the meta
page. I plan to script the special page updates first to prove what is faster, then submit
those scripted versions for PHP implementation for the live site. This will involve
temporary tables and/or summary tables, to get perfect indexes and to avoid using long
duration read queries on cur.
A few queries force the right index. Generally MySQL gets it right - only one case where
it doesn't now. See the Querybane page for that one and for the queries which are
problematic enough to need special handling. The cur/old schema change and search query
deal with most.
We should of course wait for the new DB schema to be set up before
fiddling with the indices. :-)
What I meant was that currently, we have (for example) a (cur_title) and
a combined (cur_namespace,cur_title) index, last time I checked. A
single (cur_title,cur_namespace) index (note the order!) would be sorted
for cur_title and should suffice. That would save memory, for both RAM
and disk.
I just notice that we also have a
(cur_namespace,cur_title,inverse_timestamp) index. Triple indexing of
cur_title is a little overkill, IMHO.
One query which needs work is the category list. See
link below which explains the problems with it. Needs a periodically generated summary
table. Querycache can't currently do the job because it lacks a field for the initial
limit value.
Maybe we should just get rid of the alphabetical list display and
replace it with a tree-like display. That will get rid of the
LIMIT,OFFSET thing - we can just query for subcategories of a category,
using cur_namespace, cur_id , cl_from and cl_to.
Magnus