Hi!
That presents a problem. While what you see is "instance of": "human", the underlying data is P31:Q5. We can, of course, put "instance of": "human" in the index. But what if the label for Q5 changes? Now we have to re-index 10 million records.
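To make the re-indexing cost concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the three sample items, the `items` and `labels` dictionaries, and both function names are hypothetical and say nothing about how the real index is laid out.

```python
# Hypothetical toy data: item ID -> list of (property, value) claims.
items = {
    "Q42": [("P31", "Q5")],
    "Q80": [("P31", "Q5")],
    "Q64": [("P31", "Q515")],
}
# Hypothetical label table: entity ID -> label.
labels = {"P31": "instance of", "Q5": "human"}


def docs_to_reindex_if_labels_are_indexed(changed_id):
    """If each document stores label text, every item that mentions
    the changed entity has to be rewritten."""
    return sorted(
        item_id
        for item_id, claims in items.items()
        if any(changed_id in claim for claim in claims)
    )


def docs_to_reindex_if_ids_are_indexed(changed_id):
    """If each document stores only IDs, a label change touches
    just the changed entity's own document."""
    return [changed_id]


by_label = docs_to_reindex_if_labels_are_indexed("Q5")
by_id = docs_to_reindex_if_ids_are_indexed("Q5")
print(len(by_label), "documents vs", len(by_id))
```

With three items the difference is trivial, but the first number grows with the number of items that reference Q5, which is where the 10 million comes from.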
I haven't thought this through, but would it be possible to index just Q5, and then, when someone searches for "human", look up all the items with the label "human", so that the search becomes "human OR Q5"?
That has the potential to explode pretty quickly. Consider a query like "movie Bruce Willis" - where obviously you want all movies in which Bruce Willis starred. Now, if we search for "movie", we get tons of potential matches. If we search for "Bruce" and "Willis" - even more. Now if we stuff all the IDs we've received into our query, we'll get something very far from what you intended, and the relevance would be pretty bad. Not to mention that you have to actually run four queries instead of one (4x load), and the last one is pretty fat, stuffed with all the IDs we've gathered.
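A rough Python sketch of that expansion, just to show the shape of the problem. The label index, the made-up IDs, and the function names are all assumptions for illustration; a real label lookup would return far more matches per term than the handful used here.

```python
def fake_ids(start, count):
    """Generate placeholder entity IDs; these are not real items."""
    return [f"Q{start + n}" for n in range(count)]


# Hypothetical label index: lowercased term -> IDs whose label matches it.
label_index = {
    "movie": fake_ids(900000, 5),
    "bruce": fake_ids(910000, 4),
    "willis": fake_ids(920000, 3),
}


def lookup_ids(term):
    """Stand-in for one label search against the index."""
    return label_index.get(term.lower(), [])


def expand_query(query):
    """Rewrite each term as (term OR id1 OR id2 ...).

    Returns the expanded query string and the total number of searches:
    one label lookup per term plus the final expanded search.
    """
    clauses = []
    lookups = 0
    for term in query.split():
        ids = lookup_ids(term)
        lookups += 1
        clauses.append("(" + " OR ".join([term] + ids) + ")")
    return " ".join(clauses), lookups + 1


expanded, total_queries = expand_query("movie Bruce Willis")
print(total_queries)
print(expanded)
```

Even with only 5, 4, and 3 matches per term, the final query already carries 12 extra ID clauses, and nothing in it says which of those IDs the user actually meant.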
But that's not the end of it - you don't just want any item that is somehow related to movies - you want items that *are* movies. And you don't want any item that is somehow related to somebody named "Bruce" or "Willis". You want the ones in which the famous actor Bruce Willis acted (or which he maybe directed). But there's no such information in the query.