[Wikipedia-l] rate of change

Jimmy Wales jwales at bomis.com
Sat Jan 26 21:06:13 UTC 2002


Lars Aronsson wrote:
> Would it be reasonable to update the search index each time a new
> version of a page is saved?  In that case, the search would still be
> indexed (and fast), but it would always be up-to-date.

This is already the case, since the pages are in a true database with Magnus's new
software.  In the old version, all the data was just stored in text files
on disk.  I wrote a program to go through and analyze the keywords from all
the pages and titles, and construct a search index from that.
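
For illustration, a batch indexer of the kind described above might look roughly like this. It is a minimal sketch, not the original program: the names `keywords` and `build_index` are hypothetical, and a dict of title to body text stands in for the text files on disk.

```python
import re
from collections import defaultdict

def keywords(text):
    """Return the set of lowercase alphanumeric words in a title or body."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def build_index(pages):
    """Map each keyword to the set of page titles that contain it.

    `pages` is a dict of title -> body text (an assumed stand-in for
    one text file per page).
    """
    index = defaultdict(set)
    for title, body in pages.items():
        for word in keywords(title) | keywords(body):
            index[word].add(title)
    return index

# Example data, invented for this sketch.
pages = {
    "Rate of change": "How fast a quantity changes over time.",
    "Search index": "A table mapping keywords to pages.",
}
index = build_index(pages)
```

Every run rereads every page, so the cost grows with the size of the whole site rather than with the number of pages that changed.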

I always wanted to put it on a cron job to update nightly, but it was
so inefficient that I didn't feel comfortable letting it run without
supervision, and I didn't feel comfortable running it all that often.

Now that everything is in a real database, a little playing around and
tweaking should get us decent results that are fast and also always
instantly up to date.
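
One way to picture updating the index on every save is the sketch below. It is only an assumption about how such an update could work, not a description of Magnus's software; the class `LiveIndex` and its methods are made-up names.

```python
import re
from collections import defaultdict

class LiveIndex:
    """Keyword index that is adjusted whenever a page is saved."""

    def __init__(self):
        self.words_of = {}                   # title -> keywords currently indexed
        self.titles_for = defaultdict(set)   # keyword -> titles containing it

    def save_page(self, title, body):
        """Re-index only the saved page, touching just the words that changed."""
        new_words = set(re.findall(r"[a-z0-9]+", (title + " " + body).lower()))
        old_words = self.words_of.get(title, set())
        for word in old_words - new_words:
            self.titles_for[word].discard(title)
        for word in new_words - old_words:
            self.titles_for[word].add(title)
        self.words_of[title] = new_words

    def search(self, word):
        """Return the titles of pages containing `word`."""
        return set(self.titles_for.get(word.lower(), set()))
```

The work per save depends only on the size of the page being saved, which is why this kind of update can run unsupervised where a full nightly rebuild could not.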

The current version is a very simple SQL query.  It isn't very intelligent
about returning what you probably want.
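
To make that concrete, here is a rough sketch of what a very simple SQL search could look like, plus one small tweak. The table `page(title, body)`, the sample rows, and both function names are assumptions for illustration, and SQLite is used only so the example is self-contained; it is not the actual query.

```python
import sqlite3

# In-memory database with a hypothetical page table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (title TEXT PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO page VALUES (?, ?)", [
    ("Archive", "Old text files, once indexed by a cron job."),
    ("Cron", "A scheduler for nightly jobs."),
])

def simple_search(term):
    """Plain substring match on title or body, ordered alphabetically."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT title FROM page WHERE title LIKE ? OR body LIKE ? "
        "ORDER BY title",
        (pattern, pattern))
    return [row[0] for row in rows]

def title_first_search(term):
    """Same match, but pages whose title matches are listed first."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT title FROM page WHERE title LIKE ? OR body LIKE ? "
        "ORDER BY (title LIKE ?) DESC, title",
        (pattern, pattern, pattern))
    return [row[0] for row in rows]
```

Because the query reads the live table, results reflect every saved edit immediately. The plain version returns matches in no useful order; ordering title matches first is one example of the kind of tweaking that might help.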


