On 13/04/06, Jakob Voss <jakob.voss(a)nichtich.de> wrote:
> Search engines don't update their search index
> live with every new item.
> The problem with Wikipedia is its size and the quick changes. Normally
> you would generate a new index every week or night - and to generate a
> search index for millions of records takes hours! A powerful MediaWiki
> search engine with a time lag of 1 to 2 days would also be fine for me -
> you could also think of a smart search engine that works on an old dump
> in the first run and checks on the live database in the second.
That would be more than fine. I gather the search db is currently
several months out of date? But that wasn't my major complaint.
> To get such a powerful search it's better to build it from scratch
> in an independent application instead of coding it into MediaWiki (but
> I'm no MediaWiki developer so I may be wrong) so you can optimize for
Well, it should be as easily accessible as the search box is now.
> SELECT page_id FROM page WHERE page_title RLIKE $regxp
That would be nice, but even a simple mechanism of exact matches would
be a start. You could then add fallbacks, such as all upper case, all
lower case, upper case on the first letter of each word, and so on, if
performance is the issue here.
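To make the idea concrete, here is a minimal sketch of such a fallback
chain. The function names are mine, not MediaWiki's, and the in-memory
title set stands in for the real page table; it only illustrates the
order of lookups, from exact match to progressively looser case variants:

```python
def title_fallbacks(query):
    """Yield candidate titles to try, from exact match to looser case variants."""
    seen = set()
    for candidate in (
        query,               # exact match first
        query.upper(),       # ALL UPPER CASE
        query.lower(),       # all lower case
        query.title(),       # Upper Case On The First Letter Of Each Word
        query.capitalize(),  # first word capitalized only
    ):
        if candidate not in seen:  # skip duplicates (e.g. query already lower case)
            seen.add(candidate)
            yield candidate

def lookup(query, titles):
    """Return the first stored title matched by any fallback, or None."""
    stored = set(titles)
    for candidate in title_fallbacks(query):
        if candidate in stored:
            return candidate
    return None
```

Each fallback is a cheap exact lookup against an index, so the whole
chain costs a handful of indexed queries rather than one expensive
regular-expression scan over every title.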