I'm setting up a more regular search index update for the Wikimedia sites. Here's a summary of how it's meant to work:
A process on maurus (the search build master) runs through the list of all wikis, dumping their text and piping it to the search index builder program.
As each wiki completes, the newly built index is moved from the build directory into the complete directory.
The Lucene search servers currently restart themselves hourly as a precaution against memory leaks; before each restart they will now also run an rsync to copy over any newly completed indexes from the master. [There may be some refinements to make here, such as keeping 'live' and 'update' copies and swapping them during the restart.]
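As a rough sketch of the two halves described above (not the actual scripts): the master promotes a finished index from the build directory to the complete directory with a rename, and each search server pulls the complete directory with rsync before restarting. The function names, the `.old` suffix, and every path passed in are hypothetical; only the host name maurus comes from the setup described here. The sketch also assumes the build and complete directories live on the same filesystem, so the rename never exposes a half-copied index.

```python
import os
import shutil
from pathlib import Path


def promote_index(wiki: str, build_root: Path, complete_root: Path) -> Path:
    """Move a freshly built index for `wiki` into the complete directory.

    Assumes build_root and complete_root are on the same filesystem, so
    os.rename() is a cheap rename rather than a copy.
    """
    new_index = build_root / wiki
    live = complete_root / wiki
    old = complete_root / (wiki + ".old")  # hypothetical naming scheme

    complete_root.mkdir(parents=True, exist_ok=True)

    # Clear out any leftover from an interrupted earlier run.
    if old.exists():
        shutil.rmtree(old)

    # Set the previous index aside, then rename the new one into place.
    if live.exists():
        os.rename(live, old)
    os.rename(new_index, live)

    # The previous index is no longer needed once the new one is in place.
    if old.exists():
        shutil.rmtree(old)
    return live


def rsync_command(master_host: str, remote_complete: str, local_dir: str) -> list:
    """Build the rsync invocation a search server would run before restarting.

    -a preserves the directory tree and file attributes; --delete removes
    local files that no longer exist on the master, so the local copy
    mirrors the master's complete directory.
    """
    return [
        "rsync", "-a", "--delete",
        f"{master_host}:{remote_complete}/",
        f"{local_dir}/",
    ]
```

On the master, `promote_index` would be called once per wiki as its build finishes. On a search server, the list returned by `rsync_command` could be handed to `subprocess.run(..., check=True)` just before the hourly restart, so a failed sync is noticed rather than silently ignored.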
This should get search index updates out within a day or two, rather than on the extremely long and irregular schedule we had before.
The build process has been running on maurus since last night; it's about 1/3 of the way through enwiki and doesn't appear to have spewed any errors. I'll check in on it again this evening, and if things look OK I'll set up the synchronization process.
-- brion vibber (brion @ pobox.com)