On Sun, Nov 22, 2009 at 02:45, Aryeh Gregor
<Simetrical+wikilist(a)gmail.com> wrote:
Yeah, pretty much. If you had an index on
(page_is_redirect,
page_namespace, page_title) or such, that would speed it up
significantly (at least in MySQL). It would have to scan through the
whole index, but that only contains three columns plus a row id of
some kind, so it should be quite a lot faster than scanning the whole
table. But this isn't a likely query for optimization.
That's correct. We run this query once, and then we do
a SELECT for each title (which only takes a few millis,
because it uses an index) and work with the result.
Building the index would probably take longer than
5 or 10 minutes.
The DB access is fast compared to all the other stuff
we do, so there's not much need to optimize it. MySQL
with the indexes in tables.sql suits us fine so far, and I
have no reason to suspect any other DB would be faster.
The one thing that is slow is builiding the indexes after
the data has been imported (eight hours or so). Maybe
we could omit some indexes that are not used in our
application, but I haven't really looked into that.
Christopher