I have a small MediaWiki (around 6000 pages) and am currently using MWSearch (recent) and Lucene (2.1). I have a problem with Lucene where I get "java.io.FileNotFoundException:.../segments_u (No such file or directory)". I have scripted a workaround for this, but it is inefficient to rebuild my indexes that often, as this happens 3-6 times a day. A few people have posted about this issue on the Discussion portion of the Lucene page, but it got no traction. Is this issue just rare and caused by some incorrect configuration on my part, or do other wikis have it as well? I am guessing the main Wikipedia is using the 2.1 branch of Lucene, as it has "Did you mean" functionality, which appears to be part of the 2.1 tree only; if this is true, how do they deal with it? Any advice is appreciated, thanks in advance!
Mediawiki (1.16.2) PHP MySQL
Hi Zach,
Yes, this is a known issue when using the ./build script with cron. WMF uses incremental updates, which don't have such problems. I couldn't reproduce the problem when I looked into it; I can only imagine it has something to do with previous build processes leaving the files in an inconsistent state. Some people have solved this problem by adding an rm -rf /path/to/your/index to cron before running the build script.
Cheers, Robert
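As a rough illustration of the workaround Robert describes, a cron job could call a small wrapper that clears the index directory and then runs the build. This is only a sketch: the function name rebuild_index is made up for this example, the index path stays the placeholder from the message, and the guard against empty or root paths is an extra precaution that nobody in the thread mentions.

```shell
# Sketch only: rebuild_index is a hypothetical helper, not part of lucene-search.
# $1 = index directory to wipe, $2 = build command to run afterwards.
rebuild_index() {
    index_dir="$1"
    build_cmd="$2"

    # Extra safety (an assumption of this sketch): never rm -rf "" or "/".
    case "$index_dir" in
        ""|"/")
            echo "refusing to remove '$index_dir'" >&2
            return 1
            ;;
    esac

    # Remove whatever a previous build left behind, then rebuild from scratch.
    rm -rf "$index_dir" || return 1
    $build_cmd
}
```

From cron this might be called as rebuild_index /path/to/your/index ./build after sourcing the file; both arguments need to be adapted to the local install.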
MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
By incremental, does this mean OAI?
Currently we are running OAI and using the "update" script supplied with Lucene. Is there another method outside of this?
Hi Zach,
No, WMF is using a version of that script as well. Not sure what is wrong in that case.
Cheers, r.
After a little RTFM'ing I found the solution: run this command from an init script, and it will not adjust the indexes in the same way the "update" script does.
java -Xmx1024m -cp LuceneSearch.jar org.wikimedia.lsearch.oai.IncrementalUpdater -n -d -s 240 wikidb
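For anyone wanting to wire this into an init script, the following is a minimal sketch of how the command above might be started in the background. The java command line and its flags are copied verbatim from the message; everything else (the function names updater_cmd and start_updater, the PIDFILE and LOGFILE locations, and the use of nohup) is an assumption made for illustration, not something the thread specifies.

```shell
# Sketch only: JAR, PIDFILE and LOGFILE are hypothetical defaults.
JAR="${JAR:-LuceneSearch.jar}"
PIDFILE="${PIDFILE:-/tmp/lsearch-updater.pid}"
LOGFILE="${LOGFILE:-/tmp/lsearch-updater.log}"

# Print the updater command line from the thread for the wiki database in $1.
updater_cmd() {
    echo "java -Xmx1024m -cp $JAR org.wikimedia.lsearch.oai.IncrementalUpdater -n -d -s 240 $1"
}

# Start the updater in the background unless a recorded PID is still alive.
start_updater() {
    if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
        echo "updater already running" >&2
        return 0
    fi
    # Unquoted on purpose so the printed command is split into words.
    nohup $(updater_cmd "$1") >>"$LOGFILE" 2>&1 &
    echo $! >"$PIDFILE"
}
```

An init script's start action would then call start_updater wikidb. Checking the PID file first is meant to avoid launching a second updater against the same index.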