On Thu, 24 Apr 2008, Robert Stojnic wrote:
Bryan Tong Minh wrote:
On Wed, Apr 23, 2008 at 11:06 PM, Robert Stojnic <rainmansr@gmail.com> wrote:
What remains unsolved, however, is keeping the index updated with the latest changes on the site. If one changes a template with a category in it, the thing goes on the job queue. I assume there would need to be some kind of hook that will either log the change somewhere or send data to lucene somehow. This is the part of the backend that needs thinking and solving.
LinksUpdateComplete hook?
Something like that, yes, but the hook probably couldn't just connect to the Lucene indexer and queue updates, since the indexer might be down for one reason or another. It might be a better solution to put updates into some table with a date attached and then let the indexer fetch updates.
Yes, that's exactly what I was saying: use the hook to write to a table. In core, the table is MyISAM and fulltext indexed. For big wikis, it's InnoDB, and Lucene pulls data out of it on a separately defined schedule to build/update the index.
How about a simple table like:
* pageid - what's this, an int?
* categories - a text field
* lastchange - a datetime (stamped by mysql on update)
Then, whenever the lucene thing runs, it just looks for records changed since the last index update (as you said, Robert).
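
To make the idea concrete, here is a rough sketch of the table and the two sides that touch it. It is only an illustration under several assumptions: it uses Python with SQLite as a stand-in for MySQL, the table name category_updates and the functions open_db, record_change and fetch_changes_since are made up for this example, categories are stored as one "|"-separated string, and the timestamp is passed in by the caller rather than stamped by the database.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) update table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS category_updates (
               pageid     INTEGER PRIMARY KEY,
               categories TEXT NOT NULL,
               lastchange TEXT NOT NULL  -- ISO-8601 string, sorts chronologically
           )"""
    )
    return conn


def record_change(conn, pageid, categories, now):
    """What the hook would do: store the page's current categories.

    One row per page, so a later change overwrites the earlier one.
    """
    conn.execute(
        "INSERT OR REPLACE INTO category_updates "
        "(pageid, categories, lastchange) VALUES (?, ?, ?)",
        (pageid, "|".join(categories), now),
    )
    conn.commit()


def fetch_changes_since(conn, last_indexed):
    """What the indexer would do: pull rows changed after its last run."""
    rows = conn.execute(
        "SELECT pageid, categories, lastchange FROM category_updates "
        "WHERE lastchange > ? ORDER BY lastchange, pageid",
        (last_indexed,),
    ).fetchall()
    # An empty string means the page currently has no categories.
    return [(pid, cats.split("|") if cats else [], ts) for pid, cats, ts in rows]
```

The hook only ever writes to the table, so it keeps working when the indexer is down, and the indexer catches up on its next run. One thing to watch in a real setup: with a strict "greater than" comparison, a row stamped in the same second as the previous run could be missed, so comparing with "greater than or equal" and re-indexing a few rows twice may be the safer choice.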
Aerik
wikitech-l@lists.wikimedia.org