I see Kevin scooped me... but that's okay.
Even though you would expect SPARQL query logs to contain SPARQL queries, I wouldn't be shocked if there was other stuff in there.
Unlike generic search queries, SPARQL queries can be validated as well formed, so we could share only the well-formed ones. I quickly found an online SPARQL validator; there's probably a repo somewhere on GitHub with one we could use. Just a thought.
How easy is it to encode/include PII in a valid SPARQL query? Hmmm.
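A minimal sketch of the validate-then-share idea, assuming the log holds one query per line (the real log format isn't specified here). The function name `well_formed_only` and the helper `load_rdflib_parser` are made up for illustration; the parser is passed in as a callable so any validator can be plugged in, and rdflib's `parseQuery` is just one option that happens to exist in Python.

```python
from typing import Callable, Iterable, Iterator


def well_formed_only(
    lines: Iterable[str],
    parse: Callable[[str], object],
) -> Iterator[str]:
    """Yield only the log entries that `parse` accepts without raising.

    Assumption: one query per log line. `parse` is any callable that
    raises an exception on malformed input.
    """
    for line in lines:
        query = line.strip()
        if not query:
            continue  # skip blank log lines
        try:
            parse(query)
        except Exception:
            continue  # not parseable as a query: keep it out of the export
        yield query


def load_rdflib_parser() -> Callable[[str], object]:
    """Return rdflib's SPARQL query parser (needs `pip install rdflib`)."""
    from rdflib.plugins.sparql.parser import parseQuery
    return parseQuery
```

Usage would look something like `list(well_formed_only(open("queries.log"), load_rdflib_parser()))`, where `queries.log` is a hypothetical file name. Note this only filters out the "other stuff"; a syntactically valid query can still carry PII in a string literal, so it doesn't answer the question above.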
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
On Thu, Jan 14, 2016 at 12:37 PM, Stas Malyshev <smalyshev@wikimedia.org> wrote:
Hi!
I was asked about getting access to query logs for Wikidata Query Service, for research purposes. So I'd like to start the discussion on it, specifically:
- Can we do it at all - technically, legally, privacy-wise? (note we're talking about SPARQL query text only, no other information to be provided)
- Are there any considerations why we may want *not* to do it even if we could?
- How hard would it be to make such an export, and do we have any existing infrastructure that should be used for this?
All ideas/comments about providing (or not providing :) access to this data are welcome.
--
Stas Malyshev
smalyshev@wikimedia.org
discovery mailing list
discovery@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/discovery