Hi,
Nik Everett added a lot of information about future CirrusSearch changes to the next edition of Tech News (#26, due to be sent on Monday). (Thank you, Nik :)
We can't include all of it in the newsletter, so below is the original text with all the details. The newsletter will point to this email for further information.
_______________________________________________________________
- CirrusSearch updates (with 1.24wmf10)
  - Categories will now be considered in result ranking, which should improve results.
    - We took a shortcut to get this deployed (much) more quickly. The consequence is that the incategory operator won't work for up to 24 hours after the deployment. We'll make this time as short as possible. If this is going to be a horrible pain, file a bug against Cirrus that names the wiki you work on. We can either prioritize your wiki so the outage is very short or, if it's a big enough deal, come up with a workaround.
  - Text from the lead paragraph of the article will be given a boost when ranking results, which should also improve results.
    - This will take some time to roll out to the wikis after wmf10 because the index will have to be rebuilt. Days, likely.
    - I don't imagine this will have any impact on Wiktionary and Commons, but file a bug against Cirrus if it seems to have a negative impact on results.
  - We're on track to add support for searching in article source, including regular expressions. See [the documentation] for more.
    - Like the lead paragraph, the article source will take some time to roll into the index after the deployment.
    - Right now we haven't implemented snippet extraction for article source searches. You'll only get snippets back from the regular search terms. If you don't have any regular search terms, you'll get back a snippet from the beginning of the article. I know this isn't ideal at all, and it's on the list of things to fix.
  - We'll cut all wikis over to a new snippet extractor.
    - You should only notice improvements in the generated snippets, but if you see any trouble, file a bug against Cirrus.
--
Guillaume Paumier
Technical Communications Manager — Wikimedia Foundation