(1) seems like the right way to go to me too.
There may be other ways, but puppet/files/lucene/lucene.jobs.sh has a function called import-db() which creates a dump like this:
php $MWinstall/common/multiversion/MWScript.php dumpBackup.php $dbname --current > $dumpfile
Ram
On Thu, Mar 7, 2013 at 1:05 PM, Daniel Kinzler daniel@brightbyte.de wrote:
On 07.03.2013 20:58, Brion Vibber wrote:
- The indexer code (without plugins) should not know about Wikibase, but it may have hard-coded knowledge about JSON. It could have a special indexing mode for JSON, in which the structure is deserialized and traversed, and any values are added to the index (while the keys used in the structure would be ignored). We may still be indexing useless internals from the JSON, but at least there would be a lot fewer false negatives.
Indexing structured data could be awesome -- again I think of file metadata as well as wikidata-style stuff. But I'm not sure how easy that'll be. It should probably be in addition to the text indexing, rather than replacing it.
Indeed, but option 3 is about *blindly* indexing *JSON*. We definitely want indexed structured data, the question is just how to get that into the LSearch infrastructure.
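For concreteness, here is a rough sketch of what that blind JSON mode could look like (Java, using org.json for parsing; collecting values into a plain text blob stands in for however the LSearch indexer actually adds terms, so the names here are illustrative, not the real plugin API):

    import org.json.JSONArray;
    import org.json.JSONObject;

    public class BlindJsonIndexer {

        // Recursively walk a parsed JSON structure and collect every scalar
        // value as index text, ignoring the keys entirely.
        static void collectValues(Object node, StringBuilder indexText) {
            if (node instanceof JSONObject) {
                JSONObject obj = (JSONObject) node;
                String[] keys = JSONObject.getNames(obj);
                if (keys != null) {
                    for (String key : keys) {
                        collectValues(obj.get(key), indexText);
                    }
                }
            } else if (node instanceof JSONArray) {
                JSONArray arr = (JSONArray) node;
                for (int i = 0; i < arr.length(); i++) {
                    collectValues(arr.get(i), indexText);
                }
            } else if (node != null && !JSONObject.NULL.equals(node)) {
                // Scalars (strings, numbers, booleans) become index terms.
                indexText.append(node.toString()).append(' ');
            }
        }

        public static void main(String[] args) {
            String json = "{\"labels\":{\"en\":\"Berlin\"},"
                    + "\"descriptions\":{\"en\":\"capital of Germany\"}}";
            StringBuilder text = new StringBuilder();
            collectValues(new JSONObject(json), text);
            // Output contains "Berlin" and "capital of Germany" (order may vary);
            // keys like "labels" and "en" are dropped, only the values get indexed.
            System.out.println(text.toString());
        }
    }

Everything that happens to be a value in the JSON, including internal bookkeeping, would end up in the index; that is the "useless internals" caveat above.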
-- daniel