Uwe Baumbach wrote:
Robert Stojnic rainmansr@gmail.com wrote:
... Just a quick update, a couple of days ago all of this stuff got enabled on all WMF projects. ...
Is there complete and reliable documentation of the current search backend and frontend? In the past I have found a lot of pieces, sometimes not matching each other, so it is hard (for me) to see the current structure and which parts are really necessary to install it in our environment.
I'm not sure if we've got solid docs at this point, but here's a quick overview of the pieces:
The user interface front-end is built into MediaWiki's Special:Search.
The backend piece which speaks to the Java-based Lucene search server is the MWSearch extension: http://www.mediawiki.org/wiki/Extension:MWSearch
It provides a SearchEngine plugin class which fetches results from the separate server.
The current Java search server code is in this branch in SVN: http://svn.wikimedia.org/svnroot/mediawiki/branches/lucene-search-2.1/
You can browse the code here: http://svn.wikimedia.org/viewvc/mediawiki/branches/lucene-search-2.1/
In addition to the README file in the source there, there are some directions on the wiki: http://www.mediawiki.org/wiki/Extension:Lucene-search
I'm not sure how up to date those are.
The last I checked, the lucene-search package was still a little picky about how it runs, and it may not run on Windows. (I actually remember having trouble on Mac OS X; some of the scripts assumed Linux-specific command-line tools.) This may have improved a bit since then, though.
What about differential or incremental index update for lucene?
Incremental updates are currently being done by fetching lists of updated pages through the OAI-PMH interface provided in the OAIRepository extension: http://www.mediawiki.org/wiki/Extension:OAIRepository
Setup for this is also poorly documented (sorry!), but it should be pretty straightforward if you check the code comments and create the database tables.
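
To give a rough idea of what the incremental fetch looks like on the wire, here is a minimal Python sketch of an OAI-PMH ListRecords poll. It is not the code lucene-search itself uses. The verb, the "from" and "resumptionToken" arguments and the XML namespace come from the OAI-PMH 2.0 protocol; the function names, the base URL and the choice of the "oai_dc" metadata prefix are assumptions made for illustration, so check what your OAIRepository install actually exposes.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_list_records_url(base_url, since=None, resumption_token=None,
                           metadata_prefix="oai_dc"):
    """Build a ListRecords request URL (hypothetical helper).

    base_url is whatever URL your wiki serves the OAI interface from
    (an assumption here). since is a UTC timestamp string; only records
    changed at or after it are requested.
    """
    if resumption_token:
        # Per the protocol, resumptionToken is an exclusive argument:
        # it is sent with the verb and nothing else.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if since:
            params["from"] = since
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urllib.parse.urlencode(params)


def parse_list_records(xml_text):
    """Return (updated_ids, deleted_ids, next_token) from a response body."""
    root = ET.fromstring(xml_text)
    updated, deleted = [], []
    for header in root.iter(OAI_NS + "header"):
        identifier = header.findtext(OAI_NS + "identifier")
        if header.get("status") == "deleted":
            deleted.append(identifier)   # drop this page from the index
        else:
            updated.append(identifier)   # (re)index this page
    token = root.findtext(".//" + OAI_NS + "resumptionToken")
    # An empty or missing token means the list is complete.
    return updated, deleted, (token or None)
```

An updater would request the URL, feed the response body to parse_list_records, and keep going with the returned token until it comes back as None, remembering the latest timestamp for the next run.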
-- brion