Hey,
thanks for your answer.
I think the problem is that my hunspell dictionary is not working
correctly with compound words. I've written to
pgsql-general(a)postgresql.org and we are working on the problem there.
I will let you know when I've got it working.
Jens
2011/2/7 Greg Sabino Mullane <greg(a)endpoint.com>:
I've installed a fresh MediaWiki 1.16.2 on PostgreSQL 8.4.7.
...
Extract from the Main Page:
"Therapieempfehlungen" (German)
If I search for "Therapieempfehlungen", I get the Main Page as a result. But
if I search for "Therapie", MediaWiki cannot find it.
...
Postgres' full text system uses word stemming rather than exact matches.
The first thing you'd have to do is ensure that you are using 'german' as
the language, so Postgres knows how to split the words. The second problem
is that even "Therapie" won't work: the stemmer reduces
"Therapieempfehlungen" to the single root "therapieempfehl", and a search
term is compared against that whole root, not against parts of it.
It's extremely impractical and expensive to match against every single
substring of a word, so full text systems use stemming and other tricks.
Here's what it looks like under the hood:

# select to_tsquery('german', 'Therapieempfehlungen');
    to_tsquery
-------------------
 'therapieempfehl'

# select to_tsquery('german', 'Chemotherapie');
   to_tsquery
----------------
 'chemotherapi'

# select to_tsquery('english', 'Therapieempfehlungen');
       to_tsquery
------------------------
 'therapieempfehlungen'

# select to_tsquery('english', 'puppeteer');
 to_tsquery
------------
 'puppet'
(1 row)

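To see why the search for "Therapie" comes up empty, the same check can be
written as a match between a document vector and a query. This is only a
sketch: it assumes the built-in 'german' configuration, and the ':*' form
relies on the to_tsquery prefix matching available from PostgreSQL 8.4 on.
Whether MediaWiki's search code passes such a query through unchanged is an
open question that these statements do not answer.

```sql
-- Whole-lexeme comparison: the stemmed query term must equal a lexeme
-- stored for the document for this to be true.
SELECT to_tsvector('german', 'Therapieempfehlungen')
       @@ to_tsquery('german', 'Therapie') AS whole_word_match;

-- Prefix matching: ':*' accepts any document lexeme that starts with
-- the stemmed query term.
SELECT to_tsvector('german', 'Therapieempfehlungen')
       @@ to_tsquery('german', 'Therapie:*') AS prefix_match;
```

Given the stems shown above, the first query would be expected to return
false and the second true.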
To make sure tsearch is using German, change the default text search
configuration for the MediaWiki database user, which is usually mwuser.
This can be done like so:
ALTER USER mwuser SET default_text_search_config = 'german';
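A setting made with ALTER USER ... SET applies to sessions started after the
change, so it can be checked from a new connection as that user. A small
sketch, assuming the role really is named mwuser:

```sql
-- Run in a new session connected as mwuser.
SHOW default_text_search_config;

-- Without an explicit configuration argument, to_tsquery uses
-- default_text_search_config, so this should now stem the word the
-- same way as the 'german' examples.
SELECT to_tsquery('Therapieempfehlungen');
```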
--
Greg Sabino Mullane greg(a)endpoint.com
End Point Corporation
PGP Key: 0x14964AC8