On 01.07.2016 at 01:42, Nuria Ruiz wrote:
Is this data always requested via HTTP from an API endpoint that will hit a Varnish cache? (Daniel can probably answer this.)
Yes. Special:EntityData is a regular special page, and action=wbgetentities is a regular MW web API request, as your example shows.
If the data you are interested in can be inferred from these requests, no additional data gathering is needed.
Yay!
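To illustrate how entity IDs might be inferred from such requests, here is a rough Python sketch. The helper name entity_ids_from_request and the example URLs are made up for illustration, and it only covers IDs of the form Q123/P123 in the two request styles mentioned above (Special:EntityData/<id> in the path, and action=wbgetentities with a pipe-separated ids parameter). Real logs would likely need more cases.

```python
import re
from urllib.parse import urlsplit, parse_qs, unquote

# Assumption: only item (Q...) and property (P...) IDs are of interest.
ENTITY_ID = re.compile(r"^[QP][1-9]\d*$", re.IGNORECASE)
# Matches paths such as /wiki/Special:EntityData/Q42.json
ENTITY_DATA_PATH = re.compile(r"Special:EntityData/([QP][1-9]\d*)", re.IGNORECASE)


def entity_ids_from_request(url):
    """Return the entity IDs one request URL asks for (hypothetical helper)."""
    parts = urlsplit(url)

    # Case 1: Special:EntityData with the ID in the path.
    match = ENTITY_DATA_PATH.search(unquote(parts.path))
    if match:
        return [match.group(1).upper()]

    # Case 2: web API request with action=wbgetentities&ids=Q1|Q2|...
    query = parse_qs(parts.query)  # also decodes %7C to "|"
    if query.get("action") == ["wbgetentities"]:
        ids = query.get("ids", [""])[0].split("|")
        return [i.upper() for i in ids if ENTITY_ID.match(i)]

    # Anything else is not an entity data request we recognise.
    return []
```

For example, entity_ids_from_request("https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q42%7CP31&format=json") yields the two IDs Q42 and P31.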
Nor does it tell us how often statements/RDF triples show up in the Wikidata Query Service.
I'm no expert on the query service; adding Stas for that. As far as I know, SPARQL queries go through Varnish directly to BlazeGraph. In any case, they are not processed by MediaWiki at all. Tracking how often an entity is mentioned in a GET request to the SPARQL service should be possible based on the Varnish request logs, with a bit of regex magic. POST requests are more tricky, I suppose.
However, I don't think we are logging the contents of responses at all. I suppose that would have to be built into BlazeGraph somehow. And even if we did that, it would only tell us which entities were present in a result, not which entities were used to answer a query. E.g. if you list all instances of a class (including subclasses), the entities representing the classes are essential to answering the query, but they are not present in the result (and only the top-most class is present in the query).
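As a rough idea of what the "regex magic" for GET requests could look like, here is a Python sketch. It assumes the request URLs have already been extracted from the logs into a list (the log format itself is not covered), that the SPARQL text is in the standard query parameter, and that entities are written either with the usual prefixes (wd:, wdt:, p:, ps:, pq:) or as full wikidata.org URIs. The function name count_entity_mentions is made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumption: entities are mentioned as prefixed names (wd:Q5, wdt:P31, ...)
# or as full URIs (http://www.wikidata.org/entity/Q5).
ENTITY_MENTION = re.compile(
    r"(?:\b(?:wd|wdt|p|ps|pq):|wikidata\.org/(?:entity|prop/direct)/)"
    r"([QP][1-9]\d*)\b"
)


def count_entity_mentions(request_urls):
    """Count in how many GET requests each entity ID is mentioned."""
    counts = Counter()
    for url in request_urls:
        # parse_qs URL-decodes the SPARQL text for us.
        sparql = parse_qs(urlsplit(url).query).get("query", [""])[0]
        # Use a set so an entity counts once per request.
        counts.update(set(ENTITY_MENTION.findall(sparql)))
    return counts
```

This only sees entities named in the query text itself. Queries sent via POST carry the SPARQL in the request body, so they would not show up in URL-based counting like this.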