I see Kevin scooped me...  but that's okay.

Even though you would expect SPARQL query logs to contain only SPARQL queries, I wouldn't be shocked if there were other stuff in there.

Unlike generic search queries, SPARQL queries can be validated: we could check that each logged query is well formed and share only the well-formed ones. I quickly found an online SPARQL validator, and there's probably a repo somewhere on GitHub with one we could use. Just a thought.
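
To make the validation idea concrete, here is a rough sketch of what such a filter might look like. It is only an illustration under assumptions: I'm assuming one logged query per entry, and I'm using the third-party rdflib package's parser as the stand-in validator (I haven't checked what the query service itself accepts, which may differ). The function names are made up for this example.

```python
from typing import Callable, Iterable, Iterator


def rdflib_is_well_formed(query: str) -> bool:
    """Return True if rdflib's SPARQL parser accepts the query text.

    Assumption: the third-party rdflib package is installed. It is
    imported lazily so the filter below can be used with any validator.
    """
    from rdflib.plugins.sparql.parser import parseQuery

    try:
        parseQuery(query)
    except Exception:
        # Any parse failure means the entry is not shareable as-is.
        return False
    return True


def well_formed_only(
    entries: Iterable[str],
    is_well_formed: Callable[[str], bool] = rdflib_is_well_formed,
) -> Iterator[str]:
    """Yield only the log entries that the validator accepts.

    Assumption: each entry holds exactly one query. Empty entries are
    dropped without calling the validator.
    """
    for entry in entries:
        query = entry.strip()
        if query and is_well_formed(query):
            yield query
```

The validator is passed in as a parameter so that a different parser could be swapped in without touching the filtering loop.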

How easy is it to encode/include PII in a valid SPARQL query? Hmmm.


Trey Jones
Software Engineer, Discovery
Wikimedia Foundation


On Thu, Jan 14, 2016 at 12:37 PM, Stas Malyshev <smalyshev@wikimedia.org> wrote:
Hi!

I was asked about getting access to query logs for Wikidata Query
Service, for research purposes. So I'd like to start the discussion on
it, specifically:

1. Can we do it at all - technically, legally, privacy-wise? (note we're
talking about SPARQL query text only, no other information to be provided)

2. Are there any considerations why we may want *not* to do it even if
we could?

3. How hard would it be to make such an export, and do we have any existing
infrastructure that should be used for this?

All ideas/comments about providing (or not providing :) access to this
data are welcome.
--
Stas Malyshev
smalyshev@wikimedia.org

_______________________________________________
discovery mailing list
discovery@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/discovery