On Tue, Jul 1, 2008 at 12:02 PM, Roan Kattouw <roan.kattouw(a)home.nl> wrote:
I had the idea joining on integers would be faster, I
guess I
underestimated how much. I didn't know about the primary key thing, but
it makes sense (it's called *primary* key for a reason).
The primary key benefits are specific to InnoDB, since it clusters the
table data in the primary key (basically the table data is in the
leaves of the primary key B-tree, if I understand right). A primary
key lookup is therefore one B-tree lookup instead of two. In MyISAM,
and in many other DBMSes, the primary key isn't special.
With the schema I was backing, yes. However, schema #1
doesn't eliminate
the need to check for a category's redirect target when adding a page to
it, and since joining on cat_page=page_id is faster than joining on
cat_title=page_title (because of the int vs. varchar and primary vs.
non-primary issues), that would constitute a "specific performance
benefit" for adding cat_page, wouldn't it?
Not a big enough one. Adding an extra column means every row is that
much larger, reducing key buffer efficiency and thereby hurting
performance slightly for all queries on the table. And if you're
going to join on it you also may need an extra index (depending on
join direction), which takes time to maintain on every insert and
delete and also competes for the cache. Plus you get the headache of
denormalization.
In this case it's almost certainly not worth it. In the case of
cl_final, where you're avoiding cripplingly large filesorts, it's
definitely worth it, because otherwise the feature is completely
untenable.