On 7/2/07, Brion Vibber brion@wikimedia.org wrote:
Emufarmers Sangly wrote:
On 6/29/07, Brion Vibber brion@wikimedia.org wrote:
Emufarmers Sangly wrote:
I've set up Lucene following the instructions on http://meta.wikimedia.org/wiki/Installing_lucene_search. When I get to the indexing stage, I get this error:
Unhandled Exception: java.io.IOException: no root element: U+58
You might want to confirm that your XML dump file is ok.
How can I do this? We took the dumps two weeks apart, so I don't see how there could be a problem unless there's a problem in the database itself, or with the dumping maintenance tool.
Try looking at the file.
Is it properly-formatted XML?
Or does it have, say, CGI headers at the start of the file?
Or is it compressed?
Or does it have a big error message?
Or something else?
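One way to run through the checks above is to look at the first bytes of the file. The sketch below is only an illustration: `dump_kind` is a made-up helper name, and it assumes a POSIX-style shell with `head -c`, `od`, `sed` and `grep` available. It reports whether the file starts like XML, like a gzip stream, or like HTTP/CGI-style header lines. For what it's worth, U+58 is the code point of the capital letter X, so the parser may have met an `X` where it expected `<`; that is a guess, not something the error message states.

```shell
#!/bin/sh
# dump_kind FILE: print a one-word guess about what FILE starts with.
# Hypothetical helper for illustration; not part of MediaWiki or the Lucene search package.
dump_kind() {
    # gzip streams begin with the two bytes 0x1f 0x8b.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "1f8b" ]; then
        echo gzip
        return
    fi

    # First non-blank line within the first 512 bytes, CRs and leading blanks stripped.
    first=$(head -c 512 "$1" | tr -d '\r' | sed -n '/[^[:space:]]/{s/^[[:space:]]*//;p;q;}')

    # XML (declaration or root element) has to start with '<'.
    case "$first" in
        '<'*) echo xml; return ;;
    esac

    # Header lines look like "Name: value", e.g. "X-Powered-By: ..." or "Content-type: ...".
    if printf '%s\n' "$first" | grep -Eq '^[A-Za-z][A-Za-z0-9-]*:'; then
        echo headers
    else
        echo other
    fi
}
```

Calling `dump_kind` on the dump file prints `xml`, `gzip`, `headers` or `other`. A result of `other` would still need a look by hand (for instance with `head`), since it covers error messages and anything else the sketch does not recognize. A result of `xml` only says the file starts plausibly, not that the whole file is well-formed.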
-- brion vibber (brion @ wikimedia.org)
Oh, yes, I got it sorted a couple of days ago: the server was adding headers to the dumps, so I used -q when getting them (sorry for not posting back sooner).
The issue I'm presently grappling with is how to get a fresh index every 24 hours and have the daemon recognize it. A cron job and a script get the index and import it okay, but it seems as though I need to restart the daemon for it to use the new index. Right now I just have the script do:

    killall mono
    MWDaemon

(Just doing killall MWDaemon didn't kill all the necessary processes.)
Of course, this probably isn't the best solution, since it would cause problems in the unlikely event that I ever run something else with mono. I see that there's some sort of update daemon included with the package, but I'm not sure how to run it properly. I'm also trying to find out whether there's some sort of shutdown signal that the daemon will accept over GET.
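A narrower alternative to killing every mono process might be to pick out only the ones that belong to the daemon. The sketch below is a guess at how that could look, not something taken from the package: `mwdaemon_pids` is a made-up helper, and it assumes the daemon's processes show up in `ps` as `mono` with `MWDaemon` somewhere in their arguments, which I have not verified.

```shell
#!/bin/sh
# mwdaemon_pids: read "PID COMMAND ARGS..." lines on stdin and print the PIDs
# of mono processes whose command line mentions MWDaemon.
# Hypothetical helper; the match on "MWDaemon" is an assumption about how the
# daemon appears in the process list.
mwdaemon_pids() {
    awk '$2 ~ /(^|\/)mono$/ && /MWDaemon/ { print $1 }'
}

# Possible use in the nightly script (left commented out on purpose):
#   for pid in $(ps -e -o pid= -o args= | mwdaemon_pids); do
#       kill "$pid"
#   done
#   MWDaemon
```

Running `ps -e -o pid= -o args= | mwdaemon_pids` on its own first shows which processes would be affected, which seems worth checking before wiring it into cron. Any other mono program would be left alone as long as its arguments don't mention MWDaemon.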