Our MWsearch stopped updating a while ago. I'm wondering if one of Apple's Java updates caused a problem.
The release notes say the backend is Lucene Search 2.1.3. Checking the Java version:
$ java -version
java version "1.6.0_51"
Java(TM) SE Runtime Environment (build 1.6.0_51-b11-457-10M4509)
Java HotSpot(TM) 64-Bit Server VM (build 20.51-b01-457, mixed mode)
When I try to do
sudo sh update
from the Lucene-search directory, I get a bunch of messages and then:
14420 [main] WARN org.wikimedia.lsearch.interoperability.RMIMessengerClient - Error invoking remote method enqueueFrontend() on host Hexamer : error marshalling arguments; nested exception is:
    java.net.SocketException: Broken pipe
java.rmi.MarshalException: error marshalling arguments; nested exception is:
    java.net.SocketException: Broken pipe
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:138)
    at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:178)
    at java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:132)
    at com.sun.proxy.$Proxy0.enqueueFrontend(Unknown Source)
    at org.wikimedia.lsearch.interoperability.RMIMessengerClient.enqueueFrontend(RMIMessengerClient.java:183)
    at org.wikimedia.lsearch.oai.IncrementalUpdater.main(IncrementalUpdater.java:214)
Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
    at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1864)
    at java.io.ObjectOutputStream$BlockDataOutputStream.writeByte(ObjectOutputStream.java:1902)
    at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1563)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:332)
    at sun.rmi.server.UnicastRef.marshalValue(UnicastRef.java:274)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:133)
    ... 5 more
14421 [main] WARN org.wikimedia.lsearch.oai.IncrementalUpdater - Error sending index update records of colipedia to indexer at Hexamer
java.rmi.MarshalException: error marshalling arguments; nested exception is:
    java.net.SocketException: Broken pipe
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:138)
    at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:178)
    at java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:132)
    at com.sun.proxy.$Proxy0.enqueueFrontend(Unknown Source)
    at org.wikimedia.lsearch.interoperability.RMIMessengerClient.enqueueFrontend(RMIMessengerClient.java:183)
    at org.wikimedia.lsearch.oai.IncrementalUpdater.main(IncrementalUpdater.java:214)
Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
    at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1864)
    at java.io.ObjectOutputStream$BlockDataOutputStream.writeByte(ObjectOutputStream.java:1902)
    at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1563)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:332)
    at sun.rmi.server.UnicastRef.marshalValue(UnicastRef.java:274)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:133)
    ... 5 more
It worked before, but the person who helped set it up has left my group. Java problem? conf problem? Any help would be appreciated.
Jim

=====================================
Jim Hu
Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054
On Wed, Nov 6, 2013 at 7:04 PM, Jim Hu jimhu@tamu.edu wrote:
Our MWsearch stopped updating a while ago. I'm wondering if one of Apple's Java updates caused a problem.
I work on the system that will replace MWSearch/LuceneSearch, so I have to try to sell it to you first: https://www.mediawiki.org/wiki/Extension:CirrusSearch
Now that that is out of the way I'll try to do what I can to help you fix what you have.
First: Apple's Java has grown famous over the past few years for not being quite right. I don't think that is the problem, but it might be.
The release notes say the backend is Lucene Search 2.1.3. Checking the Java version:
$ java -version
java version "1.6.0_51"
Java(TM) SE Runtime Environment (build 1.6.0_51-b11-457-10M4509)
Java HotSpot(TM) 64-Bit Server VM (build 20.51-b01-457, mixed mode)
We're running
$ java -version
java version "1.6.0_38"
Java(TM) SE Runtime Environment (build 1.6.0_38-b05)
Java HotSpot(TM) 64-Bit Server VM (build 20.13-b02, mixed mode)
with LuceneSearch, for reference. There is no reason why the newer one shouldn't work as well.
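As a rough illustration of the comparison above, the two version strings quoted in this thread can be split into their numeric parts to confirm that they belong to the same Java 6 line and differ only in the update number. This is a minimal sketch: the class and method names are made up for the example, and the parser only handles plain numeric strings of the form shown here (it would need extending for suffixes such as "-b11").

```java
public class JavaVersionCompare {
    /** Splits a plain version string such as "1.6.0_51" into {1, 6, 0, 51}. */
    static int[] parse(String version) {
        String[] parts = version.split("[._]");
        int[] numbers = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    /** True when both strings share their first two components (e.g. both are "1.6"). */
    static boolean sameFeatureRelease(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        return x.length >= 2 && y.length >= 2 && x[0] == y[0] && x[1] == y[1];
    }

    public static void main(String[] args) {
        String updaterSide = "1.6.0_51"; // Apple build reported in this thread
        String reference = "1.6.0_38";   // build reported as working with LuceneSearch
        System.out.println("Same feature release: "
                + sameFeatureRelease(updaterSide, reference));
        // Version of whichever JVM runs this class, for comparing two machines.
        System.out.println("This JVM: " + System.getProperty("java.version"));
    }
}
```

Running the same class on both the updater machine and the indexer machine prints each JVM's own version on the last line, which makes a mismatch easy to spot.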
When I try to do
sudo sh update
from the Lucene-search directory, I get a bunch of messages and then:
14420 [main] WARN org.wikimedia.lsearch.interoperability.RMIMessengerClient - Error invoking remote method enqueueFrontend() on host Hexamer : error marshalling arguments; nested exception is: java.net.SocketException: Broken pipe
<snip>
What does the other side say about the error? Is it logging any complaints?
It worked before, but the person who helped set it up has left my group. Java problem? conf problem? Any help would be appreciated
Are you running the same Java and LuceneSearch version on the client machine and the server machine? I'm not super familiar with that part of LuceneSearch but this might be important.
Nik
Hi Nik,
As I was reading the docs for MWSearch, I considered whether I should switch to CirrusSearch, so it may not be a difficult sell. I'd even volunteer to try to update the documentation if you're willing to help walk me through it.
But to show how clueless I am... I'm not sure how to check the other end, since I'm not clear on what it's trying to do. Here's my undoubtedly deeply flawed understanding of what happens (this reflects that I'm a biologist by training and badly self-taught on wikis and linux/unix/osx).
I'm assuming that the problem is in this first step of the update script
java -cp LuceneSearch.jar org.wikimedia.lsearch.oai.IncrementalUpdater -l $@ \
It's listing a bunch of update items (the ... in my first post). I am guessing that it pulls info on revisions from the mysql database and converts them to some format that gets sent to the indexer, which I assume is part of apache Lucene. From the error, it's failing to pass that through some socket to the indexer. But I don't know how to see a log for activity on that socket.
My similarly uninformed reading about CirrusSearch is that it uses elasticsearch, which in turn uses Lucene. So if the problem is between the incrementalUpdater and Lucene, I might have similar issues with CirrusSearch. But if CirrusSearch gives more informative errors, that would help!! And maybe I should switch anyway, as it sounds like support for MWsearch will go away at some point.
Jim
_______________________________________________
MediaWiki-l mailing list
MediaWiki-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
On Thu, Nov 7, 2013 at 8:37 AM, Jim Hu jimhu@tamu.edu wrote:
My similarly uninformed reading about CirrusSearch is that it uses elasticsearch, which in turn uses Lucene. So if the problem is between the incrementalUpdater and Lucene, I might have similar issues with CirrusSearch. But if CirrusSearch gives more informative errors, that would help!! And maybe I should switch anyway, as it sounds like support for MWsearch will go away at some point.
While they both use Lucene, it's ultimately apples and oranges with regard to your problem. Elasticsearch/CirrusSearch doesn't use the same incremental updater written for lsearchd/MWSearch so I don't think it'll be a problem anymore.
I do most of my development work on OSX (and I'm the other main developer on Cirrus with Nik), so you can rest assured that Cirrus and Elastic work just fine on OSX :)
-Chad
On Thu, Nov 7, 2013 at 11:37 AM, Jim Hu jimhu@tamu.edu wrote:
Hi Nik,
As I was reading the docs for MWSearch, I considered whether I should switch to CirrusSearch, so it may not be a difficult sell. I'd even volunteer to try to update the documentation if you're willing to help walk me through it.
But to show how clueless I am... I'm not sure how to check the other end, since I'm not clear on what it's trying to do. Here's my undoubtedly deeply flawed understanding of what happens (this reflects that I'm a biologist by training and badly self-taught on wikis and linux/unix/osx).
I'm assuming that the problem is in this first step of the update script
java -cp LuceneSearch.jar org.wikimedia.lsearch.oai.IncrementalUpdater -l $@ \
It's listing a bunch of update items (the ... in my first post). I am guessing that it pulls info on revisions from the mysql database and converts them to some format that gets sent to the indexer, which I assume is part of apache Lucene. From the error, it's failing to pass that through some socket to the indexer. But I don't know how to see a log for activity on that socket.
You have the right idea but by "the other side" I mean a log on the indexer. It is some other java process probably running on the Hexamer host that I saw in the indexer logs. It should have something in the logs. Hopefully.
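One hedged way to check on "the other side" from the updater machine is to ask whether an RMI registry answers on the indexer host at all. The sketch below uses only the standard java.rmi API. The class name RmiProbe is made up for illustration, and the port is an assumption: 1099 is Java's default registry port, but the port your indexer actually listens on may differ, so check its configuration. A "Broken pipe" on write generally means the peer closed the connection, so the indexer's own log remains the place to look for the reason.

```java
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.util.Arrays;

public class RmiProbe {
    /** Returns the names bound in the RMI registry at host:port; throws if it is unreachable. */
    static String[] listBoundNames(String host, int port) throws RemoteException {
        // getRegistry() only builds a local stub; list() forces a real network round trip.
        Registry registry = LocateRegistry.getRegistry(host, port);
        return registry.list();
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        // Assumption: default registry port (1099) unless one is given on the command line.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : Registry.REGISTRY_PORT;
        try {
            String[] names = listBoundNames(host, port);
            System.out.println("Registry reachable, bound names: " + Arrays.toString(names));
        } catch (RemoteException e) {
            System.out.println("Could not reach a registry at " + host + ":" + port + ": " + e);
        }
    }
}
```

After compiling with javac, something like `java RmiProbe Hexamer 1099` would run the probe against the host named in the error. If the registry is unreachable, the indexer process may not be running; if it answers, the failure is more likely in the call itself, and the indexer log should say more.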
My similarly uninformed reading about CirrusSearch is that it uses elasticsearch, which in turn uses Lucene. So if the problem is between the incrementalUpdater and Lucene, I might have similar issues with CirrusSearch. But if CirrusSearch gives more informative errors, that would help!! And maybe I should switch anyway, as it sounds like support for MWsearch will go away at some point.
Lucene is a library that can be embedded in Java applications to provide full-text search (plus geospatial search and a few other things). LuceneSearch is a MediaWiki-specific application that exposes Lucene's full-text search in a way that the MWSearch extension understands.
Elasticsearch serves the same purpose for CirrusSearch as LuceneSearch serves for MWSearch. We like Elasticsearch because it is general purpose and sees a ton more development than LuceneSearch.
As far as support goes, we haven't done much with LuceneSearch/MWSearch in a while. I work on CirrusSearch every day, as does Chad, who seems to have replied while I was writing this email. Elasticsearch itself has had 44 people submit code to it in the past month. It's a healthier ecosystem, but it might be a pain to switch. CirrusSearch requires a very recent version of MediaWiki, for example.
Nik
Thanks for the replies, Nik and Chad. Sounds like I should switch.
Is 1.21.2 recent enough? I'm going to try this on a development server.
On Thu, Nov 7, 2013 at 10:05 AM, Jim Hu jimhu@tamu.edu wrote:
Thanks for the replies, Nik and Chad. Sounds like I should switch.
Is 1.21.2 recent enough? I'm going to try this on a development server.
The upcoming 1.22 release *should* suffice.
-Chad
Doh... should have read the infobox where it says 1.22+...