This is a fork of this thread: http://www.gossamer-threads.com/lists/wiki/wikitech/228949?page=last
Is there a possibility that I could instead (easily) merge the handful of individual wikis' content into one consolidated wiki and implement an enhanced search against it? I've no experience performing a merge like this, so expert advice would be appreciated!
Thanks - Tod
On Thu, Mar 24, 2011 at 1:46 AM, Tod listacctc@gmail.com wrote:
This is a fork of this thread: http://www.gossamer-threads.com/lists/wiki/wikitech/228949?page=last
Is there a possibility that I could instead (easily) merge the handful of individual wikis' content into one consolidated wiki and implement an enhanced search against it? I've no experience performing a merge like this, so expert advice would be appreciated!
Have you tried dumping all of your wikis (see http://www.mediawiki.org/wiki/XML_dump#XML_dump), and importing all of the dumps into one wiki?
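On a stock single-database install, that sequence might look something like this (an untested sketch; file names are placeholders):

    # On each source wiki, from its MediaWiki root:
    php maintenance/dumpBackup.php --current > wikiA.xml
    # Then on the consolidated wiki:
    php maintenance/importDump.php wikiA.xml
    php maintenance/rebuildrecentchanges.php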
On 03/26/2011 12:03 AM, Andrew Garrett wrote:
Have you tried dumping all of your wikis (see http://www.mediawiki.org/wiki/XML_dump#XML_dump), and importing all of the dumps into one wiki?
I'm working with a group that has customized their MediaWiki installation so that they can create multiple wiki instances, each with its own database, that have no knowledge of each other's existence. This was done for security and privacy reasons.
The dumpBackup.php utility seems to be built for a single MediaWiki instance that uses a single database, so there is no option to pass it the name of a database to dump.
I can dump each individual database and try to merge them. Has anybody tried that?
Thanks - Tod
On 11-03-28 08:09 AM, Tod wrote:
I'm working with a group that has customized their MediaWiki installation so that they can create multiple wiki instances, each with its own database, that have no knowledge of each other's existence. This was done for security and privacy reasons.
The dumpBackup.php utility seems to be built for a single MediaWiki instance that uses a single database, so there is no option to pass it the name of a database to dump.
I can dump each individual database and try to merge them. Has anybody tried that?
Thanks - Tod
Our Maintenance class includes a --wiki param, which accepts something like --wiki=thedb or --wiki=thedb-theprefix and sets MW_DB and MW_PREFIX constants that you can use in your LocalSettings.
Though I could have sworn that, rather than those constants, we had a proper setting that exposed a wiki-id variable which LocalSettings could use on the CLI in place of the domain testing you do for web requests.
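For example, a farm's LocalSettings.php might branch on those constants. A minimal sketch, assuming one database per wiki; the hostname-to-database mapping here is purely hypothetical:

    // LocalSettings.php (sketch)
    if ( defined( 'MW_DB' ) ) {
        // CLI: a maintenance script was invoked with --wiki=<db> or --wiki=<db>-<prefix>
        $wgDBname = MW_DB;
        $wgDBprefix = defined( 'MW_PREFIX' ) ? MW_PREFIX : '';
    } else {
        // Web request: derive the database name from the host name
        $wgDBname = strtok( $_SERVER['SERVER_NAME'], '.' ) . 'wiki';
    }

Each wiki could then be dumped in turn with something like "php maintenance/dumpBackup.php --wiki=wikiA --current > wikiA.xml", which would sidestep the single-database assumption.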
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
I developed a corresponding sync tool and a search engine (already public), which works fine :-). I hope the sync tool will be public soon.
c u stevie www.webserver-management.de
On 03/28/2011 3:18 PM, Stefan Werthmann wrote:
I developed a corresponding sync tool and a search engine (already public), which works fine :-). I hope the sync tool will be public soon.
Could you point me to it?
Tod wrote:
I'm working with a group that has customized their MediaWiki installation so that they can create multiple wiki instances, each with its own database, that have no knowledge of each other's existence. This was done for security and privacy reasons.
The dumpBackup.php utility seems to be built for a single MediaWiki instance that uses a single database, so there is no option to pass it the name of a database to dump.
I can dump each individual database and try to merge them. Has anybody tried that?
Dumping the DBs and trying to merge the databases will give you duplicated keys everywhere: each wiki's tables start their primary keys (page_id, rev_id, and so on) at 1, so the rows collide. It could be done, but I don't suggest you try going that way.
PS: It is funny that they were split for security and privacy, and now search will be shared, which voids those properties.
On 03/28/2011 6:49 PM, Platonides wrote:
Dumping the DBs and trying to merge the databases will give you duplicated keys everywhere: each wiki's tables start their primary keys (page_id, rev_id, and so on) at 1, so the rows collide. It could be done, but I don't suggest you try going that way.
PS: It is funny that they were split for security and privacy, and now search will be shared, which voids those properties.
Yes, I laughed myself to sleep last night :) BTW, I tried dumping the DBs, and you are correct.
I do have another solution. I've already indexed the wikis with Solr. How hard would it be for me to integrate those search results into the wiki presentation, replacing the onboard MediaWiki search?
(This might belong in a new thread...)
Thanks again all!
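For what it's worth, replacing the built-in search generally means pointing $wgSearchType at a custom SearchEngine subclass. The querying half is straightforward; here is a minimal standalone sketch that asks a local Solr core for matches (the URL, the query field, and the stored 'title' field are assumptions about the local setup, and wrapping the hits in a SearchResultSet for MediaWiki is left out):

    <?php
    // query-solr.php: print page titles matching a search term (sketch)
    $term = isset( $argv[1] ) ? $argv[1] : 'example';
    $url = 'http://localhost:8983/solr/select?wt=json&q=text:' . urlencode( $term );
    $data = json_decode( file_get_contents( $url ), true );
    foreach ( $data['response']['docs'] as $doc ) {
        // 'title' is assumed to be a stored field in the Solr schema
        echo $doc['title'], "\n";
    }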
On 29 March 2011 14:27, Tod listacctc@gmail.com wrote:
I do have another solution. I've already indexed the wikis with Solr. How hard would it be for me to integrate those search results into the wiki presentation, replacing the onboard MediaWiki search?
There is already a Lucene-based search system for MediaWiki called lucene-search [1]. It is somewhat Wikimedia-specific, but general use does not seem impossible.
[1] http://www.mediawiki.org/wiki/Extension:Lucene-search
-Niklas
On 03/29/2011 7:43 AM, Niklas Laxström wrote:
There is already a Lucene-based search system for MediaWiki called lucene-search [1]. It is somewhat Wikimedia-specific, but general use does not seem impossible.
[1] http://www.mediawiki.org/wiki/Extension:Lucene-search
-Niklas
I've looked at that as well as the SphinxSearch extension. It looks like SphinxSearch is less work to get up and running, but I still have a concern as to whether it can index multiple databases at once yet still present the search results within a single wiki instance.
Thanks - Tod
On 11-03-29 05:19 AM, Tod wrote:
I've looked at that as well as the SphinxSearch extension. It looks like SphinxSearch is less work to get up and running, but I still have a concern as to whether it can index multiple databases at once yet still present the search results within a single wiki instance.
Thanks - Tod
SphinxSearch also has a number of other disadvantages:
- The 'did you mean' is only a spellcheck; it's not index-based and not sensitive to your wiki's actual content, so suggestions can be quite bad in some cases. They were so bad for me the feature was useless.
- The interface for SphinxSearch seems to be out of date.
- The index cannot be updated live on save, so there is a delay between changing content and it getting into the index.
- Sphinx queries the SQL directly, so it's incompatible with compressed revs, external storage, and essentially any attempt at making storage of page text efficient.
I actually looked at Solr earlier today (OK, actually yesterday; I couldn't get to sleep). It would be an interesting project: Solr can support most of the good search features we get from Lucene, but in a more widely used package. Of course it won't meet Wikimedia's performance needs, and Lucene-search might work better for some things, but otherwise it's probably a good alternative to trying to get Lucene-search working.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
On 29.03.2011 18:15, Daniel Friesen wrote:
SphinxSearch also has a number of other disadvantages:
- The 'did you mean' is only a spellcheck; it's not index-based and not sensitive to your wiki's actual content, so suggestions can be quite bad in some cases. They were so bad for me the feature was useless.
- The interface for SphinxSearch seems to be out of date.
- The index cannot be updated live on save, so there is a delay between changing content and it getting into the index.
- Sphinx queries the SQL directly, so it's incompatible with compressed revs, external storage, and essentially any attempt at making storage of page text efficient.
Newer versions of Sphinx can update the index on the fly; however, I am not sure that is compatible with the extension. The other points are probably true, though compressed revs and external storage are probably not needed for the not-so-huge wikis it is targeted at.
Dmitriy
On 23.03.2011 17:46, Tod wrote:
This is a fork of this thread: http://www.gossamer-threads.com/lists/wiki/wikitech/228949?page=last
Is there a possibility that I could instead (easily) merge the handful of individual wikis' content into one consolidated wiki and implement an enhanced search against it? I've no experience performing a merge like this, so expert advice would be appreciated!
Thanks - Tod
If you are still monitoring the list: I would probably try to merge the XML dumps (--current only should be enough for search), then import the merged dump into a "common" wiki. That should not be too hard, barring performance issues for very large wikis; if your wikis aren't very large, it is feasible. However, it is a slow offline process, and picking up new pages requires a slow re-merge. Alternatively, one may use remote API calls so that every article saved on an individual wiki is duplicate-written to the "common" wiki, though an extension has to be developed for that. This way is much faster and keeps things in sync.
Dmitriy
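A minimal sketch of that duplicate-write idea, using the ArticleSaveComplete hook in each individual wiki's LocalSettings.php; the target URL is a placeholder, and a real version would first log in as a bot and fetch a proper edit token (e.g. via action=query&prop=info&intoken=edit):

    $wgHooks['ArticleSaveComplete'][] = 'duplicateWriteToCommonWiki';

    function duplicateWriteToCommonWiki( $article, $user, $text, $summary ) {
        // Push the just-saved text to the consolidated wiki via its API.
        Http::post( 'http://common.example.org/w/api.php', array(
            'postData' => array(
                'action'  => 'edit',
                'title'   => $article->getTitle()->getPrefixedText(),
                'text'    => $text,
                'summary' => $summary,
                'format'  => 'json',
                'token'   => 'PLACEHOLDER', // fetch a real edit token first
            ),
        ) );
        return true; // let other hooks run
    }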
Dmitriy Sintsov wrote:
If you are still monitoring the list: I would probably try to merge the XML dumps (--current only should be enough for search), then import the merged dump into a "common" wiki. That should not be too hard, barring performance issues for very large wikis; if your wikis aren't very large, it is feasible. However, it is a slow offline process, and picking up new pages requires a slow re-merge. Alternatively, one may use remote API calls so that every article saved on an individual wiki is duplicate-written to the "common" wiki, though an extension has to be developed for that. This way is much faster and keeps things in sync.
Dmitriy
There's also the transwiki feature.