On 19.08.2015 08:38, Stas Malyshev wrote:
...
> Also, there's another thing. Suppose we have Q345 -> spouse -> Q123, but
> not Q123 -> spouse -> Q345, and we process entities, without loss of
> generality, in order of ascending IDs. When we generate data for Q123,
> we don't know yet that Q345 is linked to it. So to infer Q123 -> spouse
> -> Q345 we can't just load Q345 (which we'd need to load later anyway to
> get the qualifiers, etc.), since we don't know we'd need it. We'd
> probably have to query the database (if we have a suitable links
> table?) for every entry that has Q123 on the other end of "spouse". I'm
> not even sure that's currently possible on Wikidata (the query service
> can easily do it, but not within 1ms). Even if it is, I don't see how
> it is cacheable, and doing this for every entity for multiple
> relationships may be quite expensive.

That's an important concern for generating the live exports, but it does
not actually matter for the dumps. RDF does not care about the order of
triples, so you can generate the inferred triple about Q123 (Q123 ->
spouse -> Q345) while processing Q345. There are also
other methods of taking advantage of inferences during query answering
without having to precompute them first (based on query rewriting, which
could be done by a service on top of the main SPARQL endpoint). Anyway,
this really needs a bit more thought before it becomes part of the
main SPARQL endpoint. I will write another email on this ...
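
The dump case can be sketched roughly as follows. This is only an illustration under stated assumptions, not how the actual dump generator works: the in-memory `dump` structure, the `SYMMETRIC_PROPERTIES` set and the function name `triples_with_inverses` are all hypothetical, and "spouse" is simply assumed to be symmetric. The point is that the inverse triple is emitted while processing Q345, so nothing has to be looked up when Q123 is processed.

```python
from typing import Iterable, Iterator, List, Set, Tuple

Triple = Tuple[str, str, str]
Entity = Tuple[str, List[Tuple[str, str]]]  # (entity ID, [(property, target ID), ...])

# Assumption for illustration: "spouse" is treated as a symmetric property.
SYMMETRIC_PROPERTIES: Set[str] = {"spouse"}


def triples_with_inverses(entities: Iterable[Entity]) -> Iterator[Triple]:
    """Yield every stated triple, plus the inverse of each symmetric one.

    The inverse is produced while processing the entity that carries the
    statement, so the order in which entities arrive does not matter.
    """
    for subject, statements in entities:
        for prop, target in statements:
            yield (subject, prop, target)
            if prop in SYMMETRIC_PROPERTIES:
                yield (target, prop, subject)


# Hypothetical mini-dump, processed in ascending ID order as in the example:
# Q123 has no spouse statement, Q345 points to Q123.
dump: List[Entity] = [
    ("Q123", []),
    ("Q345", [("spouse", "Q123")]),
]

# An RDF graph is a set of triples, so collecting into a set also removes
# duplicates when both directions happen to be stated explicitly.
triples: Set[Triple] = set(triples_with_inverses(dump))
```

Here `triples` contains both Q345 -> spouse -> Q123 and Q123 -> spouse -> Q345, even though Q123 was processed first and Q345 was never loaded for it. Qualifiers are left out of this sketch.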
Markus