On Jul 18, 2008, at 16:03, Marc Cousin
<mcousin(a)sigma.fr>
EXPLAIN ANALYZE SELECT page_id, page_namespace,
page_title, old_text
AS page_text, ts_rank(titlevector, to_tsquery('default','postgres'))
AS rnk FROM page p, revision r,
pagecontent c WHERE p.page_latest = r.rev_id AND r.rev_text_id =
c.old_id AND textvector @@ to_tsquery('default','postgres') AND
page_is_redirect = '0' AND page_namespace IN
(0,9,11) ORDER BY rnk DESC, page_id DESC LIMIT 20 OFFSET 0;
Hmm; a fundamental problem here is that text.old_text is opaque to the
database -- it may contain compressed text, text in an alternate
encoding, or a reference to another table or database at the
application level.
This is part of why we use a separate table for the search index for
the MySQL search -- it's actually consistently readable to the db's
search engine as well as providing a summary table so the index only
has to cover live versions.
-- brion
Hi,
Thanks for the answer...
What do you mean by 'opaque' ?
Btw the separate table for mysql solves the problem I'm having with postgresql
(except that there is not the myisam problem, so the fulltext index could be
stored directly in the 'page' table with postgresql). For a wiki like mine,
with a lot of versions per document, text search performance becomes very
bad.
So can you explain me what you mean by opaque ?
Thanks a lot for your time.
Marc