Hi!
While working on fulltext search for Lexemes, I have encountered a
question which I think needs to be discussed and resolved. The question
is how fulltext search should work when dealing with different content
models, and what search should do by default and in specialized cases.
The main challenge in Wikidata is that we are dealing with substantially
different content models: articles, Items (including Properties, which
are formally a different type but similar enough to Items for search to
ignore the difference) and Lexemes each organize their data in a
different way, and should be searched using different specialized
queries. This is currently unique to Wikidata, but SDC might eventually
have to deal with the same challenge. I've described the challenges and
open questions in more detail here:
https://www.wikidata.org/wiki/User:Smalyshev_(WMF)/Wikidata_search#Fulltext_search
I'd like to first hear some feedback about what the expectations for
combined search are: what is expected to work, how it is expected to
work, what the defaults are, and what the use cases for these are. I
have outlined some solutions that were proposed on wiki. If you have any
comments, please feel welcome to respond either here or on wiki.
The TLDR version is that searching across different data models is
hard, and we would need to sacrifice something to make it work. We need
to figure out and decide which of these sacrifices are acceptable and
what is enabled or disabled by default.
Thanks,
--
Stas Malyshev
smalyshev@wikimedia.org
_______________________________________________
discovery-private mailing list
discovery-private@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/discovery-private