I've set up Lucene following the instructions on
http://meta.wikimedia.org/wiki/Installing_lucene_search

When I get to the indexing stage, I get this error:

Unhandled Exception: java.io.IOException: no root element: U+58
  at org.mediawiki.importer.XmlDumpReader.readDump () [0x00000]
  at MediaWiki.Search.SearchTool.SearchTool.ImportDump (System.String dumpfile, System.String database) [0x00000]
  at MediaWiki.Search.SearchTool.SearchTool.Main (System.String[] args) [0x00000]

My friend has also set up Lucene on his machine, and he gets the same error. We're both at a loss about what the problem is.
Emufarmers Sangly wrote:
> I've set up Lucene following the instructions on http://meta.wikimedia.org/wiki/Installing_lucene_search When I get to the indexing stage, I get this error: Unhandled Exception: java.io.IOException: no root element: U+58
You might want to confirm that your XML dump file is ok.
-- brion vibber (brion @ wikimedia.org)
On 6/29/07, Brion Vibber brion@wikimedia.org wrote:
> You might want to confirm that your XML dump file is ok.
How can I do this? We took the dumps two weeks apart, so I don't see how there could be a problem unless there's a problem in the database itself, or with the dumping maintenance tool.
Emufarmers Sangly wrote:
> How can I do this? We took the dumps two weeks apart, so I don't see how there could be a problem unless there's a problem in the database itself, or with the dumping maintenance tool.
Try looking at the file.
Is it properly-formatted XML?
Or does it have, say, CGI headers at the start of the file?
Or is it compressed?
Or does it have a big error message?
Or something else?
-- brion vibber (brion @ wikimedia.org)
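The checks Brion lists can be made by looking at the first few bytes of the dump. Note that U+58 is the letter "X", so the importer saw an "X" where it expected the "<" that opens the root element. What follows is only a minimal sketch, not part of the Lucene search package: the function names and the category labels are made up for illustration, and the header test is a rough heuristic (a PHP message such as "Warning: ..." would also be reported as headers).

```python
import re

# Magic numbers of common compressed formats.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"7z\xbc\xaf\x27\x1c": "7z",
}

UTF8_BOM = b"\xef\xbb\xbf"

# A line shaped like "Name: value", as in CGI/HTTP headers.
HEADER_LINE = re.compile(rb"^[A-Za-z][A-Za-z0-9-]*:[ \t]")


def classify_head(head: bytes) -> str:
    """Guess what a dump file contains from its first bytes."""
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    # Skip an optional UTF-8 byte order mark and leading whitespace.
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    text = head.lstrip()
    if text.startswith(b"<"):
        return "xml"
    if text.startswith(b"HTTP/") or HEADER_LINE.match(text):
        return "headers"
    if not text:
        return "empty"
    return "other"  # e.g. an error message; open the file and read it


def classify_file(path: str, size: int = 512) -> str:
    """Classify a file by reading only its first `size` bytes."""
    with open(path, "rb") as f:
        return classify_head(f.read(size))
```

Anything other than "xml" means the importer is unlikely to find a root element. A result of "xml" only says the file starts like XML, not that it is well-formed all the way through.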
On 7/2/07, Brion Vibber brion@wikimedia.org wrote:
> Try looking at the file.
> Is it properly-formatted XML?
> Or does it have, say, CGI headers at the start of the file?
> Or is it compressed?
> Or does it have a big error message?
> Or something else?
Oh, yes, I got it sorted a couple days ago: The server was adding headers to the dumps, so I used -q for getting them (sorry for not posting back sooner).
The issue I'm presently grappling with is how to get a fresh index every 24 hours and have the daemon recognize it. A cronjob and a script get the index and import it okay, but it seems as though I need to restart the daemon for it to use the new index. Right now I just have the script do:

  killall mono
  MWDaemon

(Just doing killall MWDaemon didn't kill all the necessary processes.)
Of course, this probably isn't the best solution, since it would cause problems in the unlikely event I ever run something else with mono. I see that there's some sort of update daemon included with the package, but I'm not sure how to run it properly. I'm also trying to find out whether there's some sort of shutdown signal that the daemon will accept over GET.
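One way to avoid a blanket killall mono, where no init script is installed, is to have the nightly script start the daemon in its own process group, remember the group ID in a pidfile, and later signal only that group. The following is a sketch under assumptions, not something shipped with the search package: the function names and the pidfile are hypothetical, it assumes a POSIX system, and it assumes the daemon stays in the foreground and does not move itself into a new session.

```python
import os
import signal
import subprocess
import time


def start_daemon(cmd, pidfile):
    """Start cmd in its own session and record its PID.

    With start_new_session=True the child leads a new process group,
    so its PID is also the process group ID.
    """
    proc = subprocess.Popen(cmd, start_new_session=True)
    with open(pidfile, "w") as f:
        f.write(str(proc.pid))
    return proc


def stop_daemon(pidfile, timeout=10.0):
    """Send SIGTERM to the process group recorded in pidfile.

    Returns True if a group was signalled, False if nothing was running.
    """
    try:
        with open(pidfile) as f:
            pgid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return False
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        return False
    # Give the group time to exit before a new daemon is started.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            os.killpg(pgid, 0)  # signal 0 only checks for existence
        except ProcessLookupError:
            break
        time.sleep(0.1)
    os.remove(pidfile)
    return True


def restart_daemon(cmd, pidfile):
    """Stop the recorded daemon group, if any, then start a fresh one."""
    stop_daemon(pidfile)
    return start_daemon(cmd, pidfile)
```

The cron script would then call something like restart_daemon(["MWDaemon"], "/path/to/mwsearch.pid") after the import, with the pidfile path chosen to suit the machine. Because the signal goes to the whole group, any helper processes the wrapper spawned are terminated too, while unrelated mono programs are left alone.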
Emufarmers Sangly wrote:
> The issue I'm presently grappling with is how to get a fresh index every 24 hours and have the daemon recognize it. A cronjob and a script get the index and import it okay, but it seems as though I need to restart the daemon for it to use the new index.

That's right.

> Right now I just have the script do: killall mono (just doing killall MWDaemon didn't kill all the necessary processes)

/etc/init.d/mwsearch restart

Assuming you installed the init scripts.

-- brion vibber (brion @ wikimedia.org)
On 7/2/07, Brion Vibber brion@wikimedia.org wrote:
> > Right now I just have the script do: killall mono (just doing killall MWDaemon didn't kill all the necessary processes)
> /etc/init.d/mwsearch restart
> Assuming you installed the init scripts.
It doesn't appear that I did. Where can I get them/find instructions for setting them up?
I just noticed the Extension:LuceneSearch (looks like somebody added it to the Lucene page just yesterday); should I be using this instead of mwsearch?
Emufarmers Sangly wrote:
> It doesn't appear that I did. Where can I get them/find instructions for setting them up?

make install

> I just noticed the Extension:LuceneSearch (looks like somebody added it to the Lucene page just yesterday); should I be using this instead of mwsearch?

I'd recommend giving the new one a try.

-- brion vibber (brion @ wikimedia.org)
mediawiki-l@lists.wikimedia.org