On 19.08.2015 08:38, Stas Malyshev wrote: ...
Also, there's another thing. Suppose we have Q345 -> spouse -> Q123, but not Q123 -> spouse -> Q345, and we process entities, without loss of generality, in order of ascending IDs. When we generate data for Q123, we don't yet know that Q345 links to it. So to infer Q123 -> spouse -> Q345 we can't just load Q345 (we'd need to load it later anyway to get the qualifiers, etc.), because we don't know we'd need it. We'd probably have to query the database somehow (if we have a suitable links table?) for every entry that has Q123 on the other end of "spouse". I'm not even sure that's currently possible on Wikidata (the query service can easily do it, but not within 1ms). Even if it is, I don't see how it is cacheable, and doing this for every entity for multiple relationships may be quite expensive.
That's an important concern for generating the live exports, but it does not actually matter for the dumps. RDF does not care about the order of triples, so you can generate triples about Q123 when processing Q345. There are also other methods of taking advantage of inferences during query answering without precomputing them (based on query rewriting, which could be done by a service on top of the main SPARQL endpoint). Anyway, this really needs a bit more thought before it becomes part of the main SPARQL endpoint. I will write another email on this ...
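To make the dump case concrete, here is a minimal sketch in Python. It is only an illustration under assumptions, not the actual dump code: the function name dump_triples, the SYMMETRIC set, and the use of the label "spouse" in place of a real property ID are all hypothetical. The point it shows is that the inverse triple can be written out while Q345 is being processed, so nothing has to be looked up when Q123 is processed.

```python
from typing import Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]
Entity = Tuple[str, List[Tuple[str, str]]]  # (entity ID, [(property, target ID), ...])

# Assumption: the set of properties we choose to treat as symmetric.
# "spouse" is a placeholder label, not a real property ID.
SYMMETRIC = {"spouse"}


def dump_triples(entities: Iterable[Entity]) -> Iterator[Triple]:
    """Yield stated triples plus inferred inverses in a single pass.

    Entities may arrive in any order (e.g. ascending IDs). The inverse
    triple is emitted while processing the entity that holds the
    statement, so the target entity never needs to be loaded.
    """
    for subject, statements in entities:
        for prop, target in statements:
            yield (subject, prop, target)
            if prop in SYMMETRIC:
                # Triple about `target`, generated while processing `subject`.
                yield (target, prop, subject)


if __name__ == "__main__":
    # Q123 comes first and has no spouse statement of its own.
    entities = [("Q123", []), ("Q345", [("spouse", "Q123")])]
    for s, p, o in dump_triples(entities):
        print(s, p, o)
```

If both directions are already stated, this sketch emits each triple twice. That is harmless for the result, since an RDF graph is a set of triples, though a real exporter might want to deduplicate to keep the dump small.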
Markus