Just wanted to make sure this response made it to the public list.
Looks like I was wrong to suggest this wasn't an API question. :-)
-------- Original Message --------
Subject: Re: [Mediawiki-api] highlighting and stemming in wikipedia
Date: Thu, 15 Aug 2013 08:19:07 -0400
From: Nikolas Everett <neverett(a)wikimedia.org>
To: Sumana Harihareswara <sumanah(a)wikimedia.org>
CC: Ewa Szwed <ewaszynal(a)gmail.com>
I can totally help!
So we use a home-grown Lucene wrapper at this point but are in the process
of deploying Elasticsearch. Both of them power search + highlighting +
"did you mean" all at the same time. Both of them have per-language
configuration/customization to return usefully stemmed results for the
language being searched. For example, the new search we're working on
flattens accented characters, but only when searching English wikis.
Rather than reimplement it, I suggest scraping it from the API if you can.
It looks like the simplest way to get our highlighting from the API is to
ask for it like this:
If there is something wrong with the API or our search/highlighting then
I'd be happy to work with you on fixing it.
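A minimal sketch of pulling highlighted snippets from the API, not necessarily the exact request suggested above. It assumes the standard MediaWiki search module (action=query, list=search, srprop=snippet) on English Wikipedia, and that matches in each snippet are wrapped in <span class="searchmatch"> markup. The function names, the User-Agent string, and the choice of endpoint are illustrative only.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumption: the usual api.php endpoint of English Wikipedia.
API_URL = "https://en.wikipedia.org/w/api.php"

# Assumption: the server marks each highlighted term in a snippet with this span.
_SEARCHMATCH = re.compile(r'<span class="searchmatch">(.*?)</span>')


def build_search_url(term, limit=5, api_url=API_URL):
    """Build a search request that asks only for the highlighted snippet."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srprop": "snippet",
        "srlimit": limit,
        "format": "json",
    }
    return api_url + "?" + urllib.parse.urlencode(params)


def highlighted_terms(snippet):
    """Return the terms the server chose to highlight in one snippet."""
    return _SEARCHMATCH.findall(snippet)


def fetch_snippets(term, limit=5):
    """Run the search (network access required); return (title, snippet) pairs."""
    request = urllib.request.Request(
        build_search_url(term, limit),
        headers={"User-Agent": "highlight-sketch/0.1 (illustrative example)"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return [(hit["title"], hit["snippet"]) for hit in data["query"]["search"]]
```

Calling fetch_snippets("playing") performs a live request; passing each returned snippet to highlighted_terms shows which word forms the server highlighted. Whether stemmed variants appear depends on the search configuration of the wiki being queried.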
On Thu, Aug 15, 2013 at 1:05 AM, Sumana Harihareswara <sumanah(a)wikimedia.org> wrote:
On 08/12/2013 05:07 AM, Ewa Szwed wrote to MediaWiki API announcements &
Currently I am working on a part of a wikimirror project that involves
doing a search in Wikipedia and retrieving a snippet of text with the
searched terms highlighted.
I would like to improve what I am doing at the moment by highlighting
different forms of a given word.
For example, if a user types in 'playing', I would also highlight 'play',
'played' and so on.
In Wikipedia this already works like this:
I would like to ask what techniques (plugins,
Wikipedia uses to do this type of highlighting.
I would appreciate any answer.
Ewa Szwed, thanks for your question. In the future you should probably
try the Wikimedia developers' and general MediaWiki list if you have a
search-related question: <wikitech-l(a)lists.wikimedia.org>
I'm also cc'ing Nik Everett, the lead for Wikimedia search-related
questions. I hope he can help. Thanks.
Engineering Community Manager