Stuff we could do really, really easily:

1. Add URL parameters that override each of the options below, for easy experimenting.

These are the options we use for the more_like_this query:
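$wgCirrusSearchMoreLikeThisConfig = array(
    'min_doc_freq' => 2, // Minimum number of documents (per shard) that need a term for it to be considered
    'max_query_terms' => 25, // Maximum number of terms to be considered
    'min_term_freq' => 2, // Minimum number of times a term must appear in the input text to be considered
    'percent_terms_to_match' => 0.3, // Fraction of the selected terms that a document must match
    'min_word_len' => 0, // Minimum length for a word to be considered
    'max_word_len' => 0, // Maximum length for a word to be considered (0 = no limit)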
);

Here is the reference for what they mean and any more we might be able to set: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-mlt-query.html

We only use the "text" field of the articles - no weighting based on, well, anything. See the text field in https://en.wikipedia.org/wiki/Barack_Obama?action=cirrusdump for example.

2. Add URL parameters to use different fields like our weighted all field, the wikitext, or intro paragraphs (don't ask how we extract into paragraphs - it's a horrible hack), or the section headers, or the "secondary" text like the infoboxes and image subtitles.

These are seriously very little work. A couple of hours. A day if we're being really good about testing _and_ someone merges something to core that screws up the tests. If it enables lots of cool experimenting I'm all for doing it.

Nik

On Tue, Jun 2, 2015 at 9:54 AM, Dan Garry <dgarry@wikimedia.org> wrote:
> On 1 June 2015 at 23:07, Bernd Sitzmann <bernd@wikimedia.org> wrote:
>> For the few terms I've tried it on, the morelike: search prefix produced better Read more articles than our old way.
>
> Funny, I kind of found the opposite! So, I suggest running a test.
>
> You could increment the MobileWikiAppArticleSuggestions schema, removing the "version" field (since it's redundant now anyway) and adding a "suggestionsSource" field. Make a copy of SuggestionsTask which uses the new method to generate results. Bucket users 50/50, half of them getting the old method for suggestions and half of them getting the new method. Transmit which version they got in the "suggestionsSource" field. Run analysis to determine which gets users to engage more, then go with that way! This would make a nice quarterly goal for next quarter, I think. :-)
>
> Thanks,
> Dan
>
> --
> Dan Garry
> Product Manager, Search and Discovery
> Wikimedia Foundation
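
To make the two URL-parameter ideas (items 1 and 2) a bit more concrete, here is a rough sketch of how overrides could be layered on top of the defaults. It is written in Python purely for illustration and is not CirrusSearch code. The `mlt_` parameter prefix, the `mlt_settings` helper, and every field name other than "text" are assumptions standing in for the fields described in item 2.

```python
from urllib.parse import parse_qs

# Defaults copied from $wgCirrusSearchMoreLikeThisConfig.
DEFAULTS = {
    "min_doc_freq": 2,
    "max_query_terms": 25,
    "min_term_freq": 2,
    "percent_terms_to_match": 0.3,
    "min_word_len": 0,
    "max_word_len": 0,
}

# Assumed field names; only "text" is named in the thread.
ALLOWED_FIELDS = ("text", "all", "source_text", "opening_text",
                  "heading", "auxiliary_text")

PREFIX = "mlt_"  # hypothetical URL parameter prefix


def mlt_settings(query_string):
    """Return (options, fields) with URL overrides applied to the defaults."""
    params = parse_qs(query_string)
    options = dict(DEFAULTS)
    for name, default in DEFAULTS.items():
        values = params.get(PREFIX + name)
        if not values:
            continue
        try:
            # Coerce to the type of the default (int or float).
            value = type(default)(values[-1])
        except ValueError:
            continue  # ignore unparseable input, keep the default
        if value >= 0:
            options[name] = value
    # Comma-separated field list; unknown fields are dropped.
    requested = params.get(PREFIX + "fields", [""])[-1].split(",")
    fields = [f for f in requested if f in ALLOWED_FIELDS] or ["text"]
    return options, fields
```

For example, `mlt_settings("mlt_min_doc_freq=5&mlt_fields=all,heading")` would return the defaults with `min_doc_freq` set to 5 and the field list `["all", "heading"]`. Anything missing or malformed falls back to the current behaviour, which keeps experimenting safe.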
_______________________________________________
Wikimedia-search mailing list
Wikimedia-search@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimedia-search