Perhaps you could take the query log (just the list of SPARQL queries) and run it against an offline installation of the query service to generate aggregate statistics? For example: Q76 appeared in the results of X queries submitted in January 2016. That seems like a doable summer student project, and it seems like it ought to be okay to share such aggregate results. It is pretty much the same thing as page view statistics, but for a database.
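
To make the idea concrete, here is a rough Python sketch of the aggregation step. It is only an illustration under assumptions, not something that exists today: the log is assumed to be available as (month, SPARQL) pairs, and `run_query` is a hypothetical callable that sends a query to the offline installation and returns the parsed SPARQL JSON results. The names `items_in_results` and `aggregate_usage` are made up for this example. Each item is counted once per query, which matches the "appeared in the results of X queries" statistic above.

```python
import re
from collections import Counter, defaultdict

# Item URIs as they appear in SPARQL JSON results from the query service.
ENTITY_URI = re.compile(r"^http://www\.wikidata\.org/entity/(Q\d+)$")


def items_in_results(sparql_json):
    """Return the set of item IDs (e.g. 'Q76') found in one SPARQL JSON result."""
    found = set()
    for binding in sparql_json.get("results", {}).get("bindings", []):
        for cell in binding.values():
            if cell.get("type") == "uri":
                match = ENTITY_URI.match(cell.get("value", ""))
                if match:
                    found.add(match.group(1))
    return found


def aggregate_usage(query_log, run_query):
    """Count, per month, how many queries returned each item.

    query_log: iterable of (month, sparql) pairs, month like '2016-01'
               (assumed log format, for illustration only).
    run_query: hypothetical callable that executes a SPARQL string against
               the offline installation and returns parsed JSON results.
    """
    usage = defaultdict(Counter)
    for month, sparql in query_log:
        try:
            result = run_query(sparql)
        except Exception:
            continue  # skip queries that fail or time out offline
        for item in items_in_results(result):
            usage[month][item] += 1  # one count per query, not per result row
    return usage
```

Only the resulting counts would need to be shared, not the queries themselves. One caveat with this approach: replaying old queries against an offline copy returns results for the data as it is in that copy, which may differ from what the original user saw.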


On Tue, Apr 26, 2016 at 2:40 AM, Lydia Pintscher <Lydia.Pintscher@wikimedia.de> wrote:
On Tue, Apr 26, 2016 at 1:30 AM Benjamin Good <ben.mcgee.good@gmail.com> wrote:
I'll start with the simple question, then give the longer context. Is there any way to know how many times an item or a claim appears in the results of a query to query.wikidata.org? Are there any other ways to quantify query/application usage of specific wikidata content?

Background.  The gene wiki people recently attended a conference on 'biocuration' (the construction and maintenance of biological databases) where we gave multiple wikidata-related presentations.  The community there generally had a very positive reaction to what we have been doing but many were concerned about attribution.  They wanted to know that when data was imported into wikidata from their resources (e.g. the Gene Ontology), that there was some way to ensure that the world knew where it came from so that the authors could get appropriate credit (which translates into grant money which translates into their jobs).  We explained the reference model to them, which helped, but still they are concerned.  

The most important consequence of moving data into wikidata is that it can get used - sometimes a lot! (e.g. when displayed on Wikipedia articles).  If we could quantify usage for data providers, it would really help them make the argument to their funding sources that contributing to wikidata increases their impact.  If we can get that across, it would help bring more people, more high quality data, and more funding into the wikidata fold.  

thoughts?

Currently there is no way to tell how much a particular data point is used in query results to query.wikidata.org. I am not even sure if there is a meaningful way to do this. We can't give access to query logs without an NDA with the Wikimedia Foundation for privacy reasons.
As for usage on Wikipedia, we do have statistics on that, and they are available in the database on labs. But those are at the level of the whole item or sitelinks, not a particular statement. We'll look into making this more accessible on-wiki.

Cheers
Lydia
--
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata