Just wanted to make sure this response made it to the public list. Looks like I was wrong to suggest this wasn't an API question. :-) -Sumana
-------- Original Message --------
Subject: Re: [Mediawiki-api] highlighting and stemming in wikipedia search results
Date: Thu, 15 Aug 2013 08:19:07 -0400
From: Nikolas Everett <neverett@wikimedia.org>
To: Sumana Harihareswara <sumanah@wikimedia.org>
CC: Ewa Szwed <ewaszynal@gmail.com>
I can totally help!
We use a home-grown Lucene wrapper at this point but are in the process of deploying Elasticsearch. Both of them power search, highlighting, and "did you mean" all at the same time. Both of them have per-language configuration/customization to return usefully stemmed results for the language being searched. For example, the new search we're working on flattens accented characters, but only when searching English wikis.
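To illustrate what "flattening accented characters" means in practice, here is a minimal Python sketch. It is an assumption-laden illustration only: it uses Unicode decomposition from the standard library, which is not necessarily how the Lucene wrapper or Elasticsearch does it, and the function name fold_accents is hypothetical.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining accent marks so accented and plain spellings compare equal.

    Illustration only: this is NOT the actual analyzer configuration used by
    Wikimedia search; it just shows the general idea of accent flattening.
    """
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep everything else unchanged.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(fold_accents("caf\u00e9"))  # prints "cafe"
```

A search engine would apply the same folding to both the indexed text and the query, so that a query without accents still matches accented words.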
Rather than reimplementing it, I suggest getting it from the API if you can. It looks like the simplest way to get our highlighting from the API is to ask for it like this: http://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=...
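A rough Python sketch of that approach follows. The URL in the message is truncated, so the parameters beyond action, list, and srsearch (srprop=snippet, srlimit, format=json) are assumptions based on the standard search module, and the function names and User-Agent string are hypothetical. Snippets are assumed to mark matches with <span class="searchmatch">...</span>.

```python
import json
import re
from html import unescape
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_ENDPOINT = "http://en.wikipedia.org/w/api.php"  # endpoint from the message

# Assumption: matched terms in a snippet are wrapped in this span.
MATCH_RE = re.compile(r'<span class="searchmatch">(.*?)</span>')


def build_search_url(term: str, limit: int = 5) -> str:
    """Build a list=search query URL that asks for highlighted snippets."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srprop": "snippet",  # assumption: request only the snippet property
        "srlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def highlighted_terms(snippet: str) -> list:
    """Return the words the search backend chose to highlight in one snippet."""
    return [unescape(match) for match in MATCH_RE.findall(snippet)]


def fetch_snippets(term: str) -> list:
    """Fetch (title, snippet) pairs. Needs network access; not run on import."""
    request = Request(
        build_search_url(term),
        headers={"User-Agent": "wikimirror-example/0.1"},  # hypothetical agent
    )
    with urlopen(request) as response:
        data = json.load(response)
    return [(hit["title"], hit["snippet"]) for hit in data["query"]["search"]]
```

Calling fetch_snippets("playing") and passing each snippet to highlighted_terms would show which word forms the search backend highlighted, without reimplementing the stemming locally.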
If there is something wrong with the API or our search/highlighting then I'd be happy to work with you on fixing it.
Nik
On Thu, Aug 15, 2013 at 1:05 AM, Sumana Harihareswara <sumanah@wikimedia.org> wrote:
On 08/12/2013 05:07 AM, Ewa Szwed wrote to MediaWiki API announcements & discussion <mediawiki-api@lists.wikimedia.org>:
Currently I am working on a part of a wikimirror project that involves doing a search in Wikipedia and retrieving a snippet of text with the searched terms highlighted. I would like to improve what I am doing at the moment by also highlighting different forms of a given word. For example, if a user types in 'playing', I would also highlight play, plays, played and so on. In Wikipedia this already works like this:
https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=defa...
I would like to ask what techniques (plugins, libraries, extensions) Wikipedia uses to do this type of highlighting. I would appreciate any answer.
Ewa Szwed, thanks for your question. In the future you should probably try the Wikimedia developers' and general MediaWiki list if you have a search-related question: wikitech-l@lists.wikimedia.org
I'm also cc'ing Nik Everett, the lead for Wikimedia search-related questions. I hope he can help. Thanks.
--
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation