Hi Gautham,
I think mining Wiktionary is an interesting project. However, about the
more practical Lucene part: at some point I tried using WordNet to
expand queries, but I found that it introduces too many false
positives. The most challenging part, I think, is *context-based*
expansion. That is, a simple synonym-based expansion is of little use
because it introduces too many meanings that the user didn't have in
mind. However, if we could somehow use the other words in the query to
pick the intended meaning from the set of possible meanings, that could
be really helpful.
You can look at the existing lucene-search source to see how I used
WordNet. In the end I used it only for very obvious cases
(e.g. 11 = eleven, UK = United Kingdom, etc.).
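
To make the context-based idea concrete, here is a rough sketch in
Python. It is not what lucene-search does, and it does not use WordNet:
the SENSES table, its cue words, and the expand_query function are all
made up for illustration. It only expands a term when the term is
unambiguous, or when the other query words clearly point at one sense.

```python
# Hypothetical toy sense inventory (an assumption, not real WordNet data):
# term -> list of (context cue words, synonyms for that sense).
SENSES = {
    "bank": [
        ({"money", "loan", "account", "deposit"}, ["financial institution"]),
        ({"river", "water", "shore", "fishing"}, ["riverside", "embankment"]),
    ],
    # Unambiguous entries: a single sense, so no context is needed.
    "uk": [(set(), ["united kingdom"])],
    "11": [(set(), ["eleven"])],
}


def expand_query(query):
    """Return the query terms plus synonyms for senses the context supports."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        senses = SENSES.get(term, [])
        if len(senses) == 1:
            # Only one possible meaning: the "very obvious" case.
            expanded.extend(senses[0][1])
            continue
        # Score each sense by how many of its cue words appear
        # among the other words of the query.
        context = set(terms) - {term}
        scored = [(len(cues & context), syns) for cues, syns in senses]
        best = max((score for score, _ in scored), default=0)
        winners = [syns for score, syns in scored if score == best]
        # Expand only if exactly one sense wins with real evidence;
        # otherwise leave the term alone to avoid false positives.
        if best > 0 and len(winners) == 1:
            expanded.extend(winners[0])
    return expanded


if __name__ == "__main__":
    print(expand_query("river bank fishing"))
    print(expand_query("bank"))
    print(expand_query("UK 11"))
```

With this toy table, "river bank fishing" picks up the riverside sense,
while a bare "bank" is left unexpanded because nothing in the query
disambiguates it.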
Cheers, r.
On 06/04/12 01:58, Gautham Shankar wrote:
Hello,
Based on the feedback I received, I have updated my proposal page.
https://www.mediawiki.org/wiki/User:Gautham_shankar/Gsoc
There are about 20 hours left before the deadline, and any final
feedback would be useful.
I have also submitted the proposal on the GSoC page.
Regards,
Gautham Shankar
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l