Hi,
Sorry if this message doesn't show up in the original thread (I subscribed
to this mailing list after the original message was posted).
It's always extremely complex to debug relevancy problems and it depends
on many factors.
It looks like the word "share" is quite common in your corpus so it is
possible that your problem is due to the "all" field.
The all field is a performance hack that allows cirrus to query a single
field (see
) but it has some
drawbacks with common words.
Could you try to add &cirrusUseAllFields=no to the search results URL
and see if it affects the ranking?
The URL should be :
Note that if you use the all field:
$wgCirrusSearchAllFields = array( 'build' => true, 'use' => true );
and you change the weights in $wgCirrusSearchWeights, you'll have to
re-index your data.
Another test you could try is to disable rescore. Rescore is a feature
that reorders the top-N results (8192 by default). You can limit its
impact by adding the following URL parameters:
&cirrusFunctionWindow=1&cirrusPhraseWindow=1
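To compare the two tests side by side, here is a minimal sketch that builds the search URLs. The host name wiki.example.org, the index.php path, the fulltext=1 parameter and the search_url helper are assumptions for illustration only. The cirrus* parameters are the ones mentioned above.

```python
from urllib.parse import urlencode

# Hypothetical wiki endpoint (assumption); replace with your own index.php URL.
BASE_URL = "https://wiki.example.org/index.php"


def search_url(query, **debug_params):
    """Build a Special:Search URL, optionally with CirrusSearch debug parameters."""
    # fulltext=1 is assumed here so the wiki shows a result list
    # instead of jumping straight to a page whose title matches.
    params = {"title": "Special:Search", "search": query, "fulltext": "1"}
    params.update(debug_params)
    return BASE_URL + "?" + urlencode(params)


variants = {
    # Default behaviour, for reference.
    "baseline": search_url("share"),
    # Bypass the "all" field.
    "no_all_field": search_url("share", cirrusUseAllFields="no"),
    # Shrink both rescore windows to a single result.
    "minimal_rescore": search_url(
        "share", cirrusFunctionWindow=1, cirrusPhraseWindow=1
    ),
}

for name, url in variants.items():
    print(f"{name}: {url}")
```

Opening the three URLs in turn and noting where the expected pages rank should show whether the all field or the rescore step is responsible.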
If nothing helps could you share the result of
-
From: Daniel Barrett <danb(a)cimpress.com>
Date: Mon, Oct 12, 2015 at 9:57 AM
Subject: [MediaWiki-l] Help debugging CirrusSearch problems?
To: MediaWiki announcements and site admin list
<mediawiki-l(a)lists.wikimedia.org>
We installed CirrusSearch recently to replace Lucene, and the results
it returns from wiki searches seem wildly irrelevant at times.
For example, our wiki (200,000+ titles) has a number of pages that
include the word "share" in the title. But when I search for "share"
using CirrusSearch, none of these pages come up in the top 100 hits.
The first one is hit #120. The #1 hit has "share" only in the names of
two categories.
As a pathological example, I created a wiki page named "Share share
share share share" and filled it with the word "share" over and over.
When I search the wiki for "share", my page appears as hit number
750! (If I search for "share share", my page comes up first.)
We're using the default CirrusSearch.php configuration except for the
following overrides in LocalSettings.php:
$wgCirrusSearchWeights = array(
    'title' => 20, // default 20
    'redirect' => 15, // default 15
    'category' => 8, // default 8
    'heading' => 4, // default 5
    'opening_text' => 3, // default 3
    'text' => 1, // default 1
    'auxiliary_text' => 0.5, // default 0.5
    'file_text' => 0.5, // default 0.5
);
// Prevent some custom namespaces from showing in search results
$wgCirrusSearchNamespaceWeights = array(
    NS_VP_1 => 0,
    NS_VP_2 => 0,
    NS_VP_3 => 0,
);
$wgCirrusSearchDefaultNamespaceWeight = 0.5;
$wgCirrusSearchPowerSpecialRandom = false;
Any advice on how to debug these weird search results?
Thank you very much,
DanB