On 7/2/07, Brion Vibber <brion(a)wikimedia.org> wrote:
Emufarmers Sangly wrote:
On 6/29/07, Brion Vibber <brion(a)wikimedia.org> wrote:
Emufarmers Sangly wrote:
I've set up Lucene following the instructions on
http://meta.wikimedia.org/wiki/Installing_lucene_search
When I get to the indexing stage, I get this error:
Unhandled Exception: java.io.IOException: no root element: U+58
You might want to confirm that your XML dump file is ok.
How can I do this? We took the dumps two weeks apart, so I don't see how
there could be a problem unless there's a problem in the database itself,
or with the dumping maintenance tool.
Try looking at the file.
Is it properly-formatted XML?
Or does it have, say, CGI headers at the start of the file?
Or is it compressed?
Or does it have a big error message?
Or something else?
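The questions above can be checked mechanically by looking at the first few bytes of the file. The following is only a rough sketch: the function name `classify_dump` and the category labels are made up for illustration, and the header test is a heuristic that will also match other text of the form "Word: ...". For what it's worth, U+58 is the letter "X", which would be consistent with the file starting with a header line such as "X-Powered-By: ..." rather than with XML, though that is a guess.

```python
import re

# Heuristic: a line shaped like "Some-Name: value" (e.g. "X-Powered-By: PHP/...").
HEADER_LINE = re.compile(rb"^[A-Za-z][A-Za-z0-9-]*:[ \t]")


def classify_dump(path, probe_bytes=512):
    """Guess what a supposed XML dump file actually starts with.

    Returns one of: "empty", "gzip", "bzip2", "xml", "headers", "unknown".
    """
    with open(path, "rb") as handle:
        head = handle.read(probe_bytes)

    if not head:
        return "empty"
    if head.startswith(b"\x1f\x8b"):   # gzip magic number
        return "gzip"
    if head.startswith(b"BZh"):        # bzip2 magic number
        return "bzip2"

    # Skip a UTF-8 byte order mark and leading whitespace, if present.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    text = head.lstrip()

    if text.startswith(b"<?xml") or text.startswith(b"<mediawiki"):
        return "xml"
    if text.startswith(b"HTTP/") or HEADER_LINE.match(text):
        return "headers"
    return "unknown"
```

Anything other than "xml" means the file should be opened and read by hand before it is fed to the indexer; "unknown" is where an error message or other stray output would end up.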
-- brion vibber (brion @ wikimedia.org)
Oh, yes, I got it sorted a couple of days ago: the server was adding headers
to the dumps, so I used -q when fetching them (sorry for not posting back
sooner).
The issue I'm presently grappling with is how to get a fresh index every 24
hours and have the daemon recognize it. A cron job and a script fetch the
index and import it okay, but it seems as though I need to restart the
daemon for it to use the new index. Right now I just have the script do:
killall mono
MWDaemon
(Just doing killall MWDaemon didn't kill all the necessary processes.)
Of course, this probably isn't the best solution, since it would cause
problems in the unlikely event that I ever run something else with mono. I
see that there's some sort of update daemon included with the package, but
I'm not sure how to run it properly. I'm also trying to find out whether
there's some sort of shutdown signal that the daemon will accept over GET.
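One possible way to avoid killall mono, sketched here under several assumptions, is to start the daemon in its own session, record its PID, and later signal only that process group. The PID file path and the function names are hypothetical. This also assumes a POSIX system, that MWDaemon can be launched as a plain command as above, and that it exits on SIGTERM, none of which I have verified.

```python
import os
import signal
import subprocess
import time

PID_FILE = "/tmp/mwdaemon.pid"  # hypothetical location, pick your own


def start_daemon(command, pid_file=PID_FILE):
    """Start `command` as the leader of a new session and record its PID.

    Because the child leads its own session, its PID is also the ID of the
    process group that any helpers it spawns (such as mono) will inherit.
    """
    proc = subprocess.Popen(command, start_new_session=True)
    with open(pid_file, "w") as handle:
        handle.write(str(proc.pid))
    return proc


def stop_daemon(pid_file=PID_FILE):
    """Send SIGTERM to the recorded process group.

    Returns True if a signal was sent, False if nothing was found running.
    """
    try:
        with open(pid_file) as handle:
            pgid = int(handle.read().strip())
    except (FileNotFoundError, ValueError):
        return False
    try:
        os.killpg(pgid, signal.SIGTERM)
        sent = True
    except ProcessLookupError:
        sent = False  # stale PID file: the group is already gone
    os.remove(pid_file)
    return sent


def restart_daemon(command, pid_file=PID_FILE, pause=2.0):
    """Stop the recorded daemon (if any), wait briefly, then start it again."""
    stop_daemon(pid_file)
    time.sleep(pause)  # crude grace period; an assumption, not a guarantee
    return start_daemon(command, pid_file)


# From the cron script, after the new index has been imported:
#     restart_daemon(["MWDaemon"])
```

Signalling the whole group is meant to cover the case where killall MWDaemon left some processes behind, while leaving any unrelated mono programs alone. It only works if the daemon was originally started through start_daemon, so the first switch-over still needs a manual stop.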