Indexing structures for Wikidata - Wikitech-l

8 Mar 2013

As you probably know, the search in Wikidata sucks big time.
Until we have created a proper Solr-based search and deployed it on that
infrastructure, we would like to implement and set up a reasonable stopgap
solution.
The simplest and most obvious way to sort the items would be to
1) make a prefix search
2) weight all results by the number of Wikipedias each item links to
This should usually provide the item you are looking for. Currently, the
search order is random. Good luck with finding items like California,
Wellington, or Berlin.
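
To make the two steps concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the ranking idea, not the proposed implementation: the Term record, the item ids, and the sitelink counts are all made up for the example.

```python
from typing import List, NamedTuple


class Term(NamedTuple):
    """Hypothetical record: one label of one item, plus its sitelink count."""
    label: str
    item_id: str      # placeholder ids, not real Wikidata ids
    sitelinks: int    # number of Wikipedias the item links to (made-up values)


def prefix_search(terms: List[Term], prefix: str, limit: int = 10) -> List[Term]:
    """Step 1: keep labels starting with the prefix (case-insensitive).
    Step 2: order the matches by descending sitelink count."""
    needle = prefix.casefold()
    matches = [t for t in terms if t.label.casefold().startswith(needle)]
    # Ties on sitelink count are broken alphabetically so the order is stable.
    matches.sort(key=lambda t: (-t.sitelinks, t.label))
    return matches[:limit]


# Made-up sample data, for illustration only.
SAMPLE = [
    Term("Berlin", "item-a", 200),
    Term("Berlin", "item-b", 12),
    Term("Berlin Wall", "item-c", 90),
    Term("Bern", "item-d", 150),
]

if __name__ == "__main__":
    for term in prefix_search(SAMPLE, "berl"):
        print(term.item_id, term.label, term.sitelinks)
```

With this ordering, the heavily linked "Berlin" comes first and the obscure one last, instead of in random order.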
Now, what I want to ask is: what would be the appropriate index structure
for the table? The data is saved in the wb_terms table, which would need
to be extended by a "weight" field. There is already a suggestion (based on
discussions between Tim and Daniel K, if I understood correctly) to change
the wb_terms table index structure (see here
<https://bugzilla.wikimedia.org/show_bug.cgi?id=45529>), but since we are
changing the index structure anyway, it would be great to get it right this
time.
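
As a starting point for discussion, here is a rough sketch of one candidate layout, using Python's sqlite3 module as a stand-in for the real database. Everything in it is an assumption: the table terms_sketch, its column names, the index name, and the sample rows are invented for the example and are not the actual wb_terms schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with a hypothetical weight column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE terms_sketch ("
        " entity_id TEXT NOT NULL,"
        " search_key TEXT NOT NULL,"   # lowercased label used for matching
        " weight INTEGER NOT NULL)"    # e.g. number of linked Wikipedias
    )
    # Candidate composite index: prefix lookups on search_key, weight alongside.
    conn.execute(
        "CREATE INDEX terms_key_weight ON terms_sketch (search_key, weight)"
    )
    conn.executemany(
        "INSERT INTO terms_sketch VALUES (?, ?, ?)",
        [  # made-up rows, for illustration only
            ("item-a", "wellington", 150),
            ("item-b", "wellington", 8),
            ("item-c", "wellesley", 40),
            ("item-d", "california", 180),
        ],
    )
    return conn


def search(conn: sqlite3.Connection, prefix: str, limit: int = 10):
    """Prefix search expressed as a range scan, ordered by weight."""
    if not prefix:
        return []
    low = prefix.lower()
    # Smallest string greater than every string starting with `low`
    # (simplified: assumes the last character is not the maximum code point).
    high = low[:-1] + chr(ord(low[-1]) + 1)
    rows = conn.execute(
        "SELECT entity_id, search_key, weight FROM terms_sketch"
        " WHERE search_key >= ? AND search_key < ?"
        " ORDER BY weight DESC, entity_id LIMIT ?",
        (low, high, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    for row in search(build_demo_db(), "Well"):
        print(row)
```

One caveat with this layout: the index is ordered by search_key first, so a prefix range spanning several distinct keys still has to be sorted by weight after the scan. Whether that is acceptable for short prefixes is exactly the kind of thing I would like input on.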
Anyone who can jump in? (Looking especially at Asher and Tim)
Any help would be appreciated.
Cheers,
Denny
-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/681/51985.