Hi!
I think we already index way more than P31 and P279.
Oh yes, all the string properties.
So I think that the increase is smaller than what you anticipate. What I'd try to avoid in general is indexing terms that match only one doc, since they are pretty useless.
For unique string properties, that would be a frequent occurrence. But I am not sure why it's useless - won't it be a legit use case to look up something by external ID?
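To put a number on how frequent that would be, something like the rough sketch below could be run over a sample of documents. It is only an illustration: `singleton_term_ratio` and the sample data are made up here, and it assumes each statement is indexed as a single `property=value` keyword.

```python
from collections import Counter


def singleton_term_ratio(docs):
    """Return the share of distinct terms that occur in exactly one document.

    docs: iterable of iterables of keyword strings, one inner iterable per doc.
    """
    doc_freq = Counter()
    for keywords in docs:
        # Count each term at most once per document (document frequency).
        doc_freq.update(set(keywords))
    if not doc_freq:
        return 0.0
    singletons = sum(1 for count in doc_freq.values() if count == 1)
    return singletons / len(doc_freq)


# Hypothetical sample: P31 values repeat, external ID values do not.
sample_docs = [
    ["P31=Q5", "P214=113230702"],
    ["P31=Q5", "P214=96994048"],
    ["P31=Q5", "P279=Q215627"],
]
ratio = singleton_term_ratio(sample_docs)
```

A high ratio for a given property would only tell us its terms are mostly unique, not whether exact lookup by that value is worth supporting.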
I think we should investigate what kind of data we may have here, and at least for statement_keywords I would not index data that contains random text (esp. natural language), since such values tend to be unique and are impossible to search for.
Yes, we definitely should not do that. I tried to exclude such properties, but if you notice more of them, let's add them to the exclusion config.
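For reference, the exclusion amounts to something like the sketch below. This is not the actual implementation: the function name, the `EXCLUDED_PROPERTIES` set and the `P9999` ID are hypothetical, and the `property=value` keyword format is an assumption.

```python
def build_statement_keywords(statements, excluded_properties):
    """Build keyword strings for indexing, skipping excluded properties.

    statements: iterable of (property_id, value) pairs.
    excluded_properties: set of property IDs that must not be indexed.
    """
    keywords = []
    for property_id, value in statements:
        if property_id in excluded_properties:
            # Free-text properties are skipped: their values are mostly unique.
            continue
        keywords.append(f"{property_id}={value}")
    return keywords


# Hypothetical exclusion config; P9999 stands in for a free-text property.
EXCLUDED_PROPERTIES = frozenset({"P9999"})

statements = [
    ("P31", "Q5"),
    ("P214", "113230702"),
    ("P9999", "some free-form natural language text"),
]
keywords = build_statement_keywords(statements, EXCLUDED_PROPERTIES)
```

Keeping the list in config means new free-text properties can be excluded without a code change.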