[Mediawiki-l] MediaWiki + Lucene-Search2 + MWSearch extension = ZERO search results

agent dale cooper agentdcooper at gmail.com
Tue Jul 1 14:34:12 UTC 2008


>
> Subject:
> Re: [Mediawiki-l] MediaWiki + Lucene-Search2 + MWSearch extension = 
> ZERO search results
> From:
> Tim Starling <tstarling at wikimedia.org>
> Date:
> Tue, 01 Jul 2008 08:02:05 +1000
>
>
> It sounds like you've isolated the problem to within a couple of 
> hundred lines of code. Maybe you should spend less time searching the 
> web for someone with your exact problem, and more time reading that code.
=)     I'd agree with ya, if I weren't so much of a PHP newbie... I'd 
consider myself more of a Perl and Bash type coder, but I definitely 
understand where you are coming from with your suggestion. Luckily, I 
found someone over @ MediaWiki.org's MWSearch Extension_talk page who 
helped me troubleshoot my issue!

>> Follow me here :: if I load up the URL from the debug log above (or
>> *every time* I search now and read the debug log) in a web browser
>> like 'lynx', I see this (or something similar):
>>
>> 1
>> 1.0 0 Main_Page
>
> Is this the same response text that MWSearch sees? If yes, where does 
> MWSearch go wrong in interpreting it? If no, what is different about 
> the way MWSearch requests pages compared to lynx? Is it timing out? 
> You can use tcpdump to snoop on the communication between MWSearch and 
> the search server. You can use telnet to generate requests manually 
> and see how the search daemon responds.
>
> -- Tim Starling 
Here's what "Brian" from the MWSearch Extension_talk page helped identify, 
summing up his last post and the results we found from some 
troubleshooting:

"we can conclude from this that: 1) PHP can connect to Lucene properly 
and 2) Your HTTP fetch capabilities are broken. I'm not sure what we can 
do about it. The proper way is of course to fix the HTTP functions, but 
I don't know how we can do that. The other option is to write a new HTTP 
layer which will surely work."

<(root@/var/www/htdocs/wiki-svn06252008)> cd /var/www/htdocs/wiki-svn06252008
<(root@/var/www/htdocs/wiki-svn06252008)> php maintenance/eval.php
> $sock = fsockopen('127.0.0.1', 8123); fwrite($sock, "GET /search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 HTTP/1.0\r\nHost: localhost\r\n\r\n"); print fread($sock, 8192);
HTTP/1.1 200 OK
Content-Type: text/plain

1
1.0 0 Main_Page


 > print Http::get('http://127.0.0.1:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
 >
 > print Http::get('http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
 >

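Both Http::get calls above seem to come back empty, even though the raw 
fsockopen fetch a few lines earlier returns the results just fine. Along 
the lines of Brian's "write a new HTTP layer" idea, here's a rough sketch 
of the kind of stopgap I had in mind: a tiny fetch function that goes over 
fsockopen exactly like the working eval.php test and just strips off the 
HTTP headers. The name lucene_http_get() is just something I made up for 
illustration; it isn't part of MWSearch or MediaWiki:

<?php
// Sketch of a fallback fetch over a raw socket, mirroring the eval.php
// test above. lucene_http_get() is a made-up name, not a MediaWiki API.
function lucene_http_get( $host, $port, $path, $timeout = 5 ) {
        $sock = fsockopen( $host, $port, $errno, $errstr, $timeout );
        if ( !$sock ) {
                return false; // could not connect to the search daemon
        }
        fwrite( $sock, "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n" );
        $response = '';
        while ( !feof( $sock ) ) {
                $response .= fread( $sock, 8192 );
        }
        fclose( $sock );
        // drop the HTTP headers and return only the body (the search results)
        $parts = explode( "\r\n\r\n", $response, 2 );
        return isset( $parts[1] ) ? $parts[1] : '';
}

print lucene_http_get( '127.0.0.1', 8123,
        '/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10' );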

What's the chance some kind soul on the Mediawiki-l mailing list knows, 
or can point me to where I can find more in-depth information about 
MediaWiki's HTTP get function, which may be causing my queries to my 
Lucene-Search-2 daemon on port 8123 to get stripped out?

When I use PHP to talk directly to my LuceneSearch2 daemon I get a valid 
response, and everything works great: the response comes back as search 
results. The problem comes into play within my MediaWiki site once I 
enable the MWSearch extension (ZERO search results), or, as seen above, 
when I start up MediaWiki's eval.php maintenance script and try to use 
Http::get to talk with the LuceneSearch2 daemon: I seem to get no 
response... but my LS2 daemon is definitely responding to the Http::get 
request! It sounds like MediaWiki is the culprit and MW's HTTP fetch 
function is somehow stripping the search results, as demonstrated above. 
I can also get the search results from my LS2 daemon with the 'lynx' web 
browser, telnet, or PHP.

I really hope someone can point me in the right direction, or help a 
fella' out with diagnosing the issue! Thanks for your time, peace -

agentdcooper at gmail.com

