Copying Lydia Pintscher and Daniel Kinzler (with whom I’ve discussed this very topic).

I am interested in metrics that describe how Wikidata is used. While we do have views on individual pages, that doesn’t tell us how often entities are accessed through Special:EntityData or wbgetclaims. Nor does it tell us how often statements/RDF triples show up in the Wikidata Query Service. Does this data already exist, even in the form of raw access logs? If not, what effort would be required to gather it? For the purposes of my proposal to the U.S. Census Bureau, I am estimating around six weeks of effort for one person working full-time. If it will take more time, I will need to know.

Thank you,
James Hare

On Thursday, June 2, 2016 at 2:18 PM, Nuria Ruiz wrote:
James:

> My current operating assumption is that it would take one person, working on a full-time basis, around six weeks to go from raw access logs to a functioning API that would provide information on how many times a Wikidata entity was accessed through the various APIs and the query service. Do you believe this to be an accurate level of effort estimation based on your experience with past projects of this nature?

You are starting from the assumption that we have the data you are interested in in the logs, which I am not sure is the case. Have you checked on this with the Wikidata developers? Analytics 'automagically' collects data from logs about page requests; any other request collection (and it seems that yours fits this scenario) needs to be instrumented. I would send an e-mail to the analytics@ public list and the Wikidata folks to ask how to harvest the data you are interested in. It doesn't sound like it is being collected at this time, so your project scope might be quite a bit bigger than you think.

Thanks,
Nuria

On Thu, Jun 2, 2016 at 5:06 AM, James Hare <james@hxstrategy.com> wrote:

Hello Nuria,

I am currently developing a proposal for the U.S. Census Bureau to integrate their datasets with Wikidata. As part of this, I am interested in getting Wikidata usage metrics beyond the page view data currently available. My concern is that the page views API gives you information only on how many times a page is accessed – but Wikidata is not really used this way. More often, Wikidata’s information is accessed through the API endpoints (wbgetclaims etc.), through Special:EntityData, and through the Wikidata Query Service.
If we have information on usage through those mechanisms, that would give me much better information on Wikidata’s usage.

To the extent these metrics are important to my prospective client, I am willing to provide in-kind support to the analytics team to make this information available, including expenses associated with the NDA process. (I understand that such a person may need to deal with raw access logs that include PII.) My current operating assumption is that it would take one person, working on a full-time basis, around six weeks to go from raw access logs to a functioning API that would provide information on how many times a Wikidata entity was accessed through the various APIs and the query service. Do you believe this to be an accurate level of effort estimation based on your experience with past projects of this nature?

Please let me know if you have any questions. I am happy to discuss my idea with you further.

Regards,
James Hare
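[Editor's note: for concreteness, the three access paths discussed in this thread can be sketched as URL builders. This is a minimal illustration only, not part of the original exchange; the endpoints are the public Wikidata ones, and Q42 is an arbitrary example item.]

```python
from urllib.parse import urlencode

# Public Wikidata endpoints for the three access mechanisms under discussion.
WIKIDATA_API = "https://www.wikidata.org/w/api.php"
ENTITY_DATA = "https://www.wikidata.org/wiki/Special:EntityData"
WDQS_SPARQL = "https://query.wikidata.org/sparql"

def wbgetclaims_url(entity_id: str) -> str:
    """URL for fetching an entity's statements via the wbgetclaims API module."""
    return WIKIDATA_API + "?" + urlencode({
        "action": "wbgetclaims",
        "entity": entity_id,
        "format": "json",
    })

def entity_data_url(entity_id: str, fmt: str = "json") -> str:
    """URL for the Special:EntityData representation of a single entity."""
    return f"{ENTITY_DATA}/{entity_id}.{fmt}"

def sparql_query_url(query: str) -> str:
    """URL for a GET query against the Wikidata Query Service SPARQL endpoint."""
    return WDQS_SPARQL + "?" + urlencode({"query": query, "format": "json"})

print(wbgetclaims_url("Q42"))
print(entity_data_url("Q42"))
print(sparql_query_url("SELECT * WHERE { ?s ?p ?o } LIMIT 1"))
```

Requests to each of these URL families would hit the servers in different ways, which is why page-view counts alone miss most of Wikidata's actual usage.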