The best I can offer you, Felix, is a very small subset of queries that have been manually reviewed for release. These queries are in our result grading platform, Discernatron. You will need to first log in at https://discernatron.wmflabs.org/ and then visit https://discernatron.wmflabs.org/scores/all?json=1. This will output a list of query results that have been graded; from that you can extract the individual queries that were used. You may be able to use these scores for an nDCG calculation. Unfortunately, the list of graded queries is very small: there are 95 unique queries and 4219 scored result pages.
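For the nDCG calculation mentioned above, a minimal sketch could look like the following. It assumes you have already extracted, for each query, the relevance grades of the returned results in ranked order. The layout of the Discernatron JSON is not described here, so the extraction step is left out, and the function names and sample grades are purely illustrative. The gain function 2^grade - 1 is one common convention, and using the grade itself as the gain is another.

```python
import math

def dcg(grades):
    """Discounted cumulative gain for relevance grades given in ranked order.

    Uses the exponential gain (2**grade - 1) and a log2 position discount;
    this is one common variant, chosen here as an assumption.
    """
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades))

def ndcg(grades, k=None):
    """DCG of the ranking divided by the DCG of the ideal (descending) order.

    k optionally truncates the ranking to the top k results.
    Returns 0.0 when no result has a positive grade.
    """
    ideal_dcg = dcg(sorted(grades, reverse=True)[:k])
    return dcg(grades[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical grades for one query's results, best-ranked result first.
example_grades = [3, 2, 0, 1]
print(ndcg(example_grades))
```

Averaging `ndcg` over all graded queries would give one number per ranking algorithm to compare, although with only 95 queries the differences may not be statistically meaningful.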
On Wed, Aug 17, 2016 at 9:51 AM, Pine W <wiki.pine@gmail.com> wrote:
Hi Felix,
There was recently a discussion about releasing raw queries, and the decision was made by WMF not to release raw queries for privacy reasons. Personally, I support that decision because the risks seem to far outweigh the benefits. The staff from Discovery may be able to provide you with more detail or alternatives, but I would say that the odds of raw data being released are low.
Sometimes WMF allows access to sensitive data if an NDA is signed. In this case, I feel that the risks are too high even for that to be allowed. That's a personal opinion only; the official answer will come from WMF.
Pine
On Aug 17, 2016 08:39, "Tilman Bayer" <tbayer@wikimedia.org> wrote:
CCing the WMF Search and Discovery mailing list (https://lists.wikimedia.org/mailman/listinfo/discovery)
On Wed, Aug 17, 2016 at 6:00 AM, Felix Engelmann <fengelmann@uni-koblenz.de> wrote:
Hi everybody,
I’m currently writing my bachelor thesis at University Koblenz, Germany. The goal is to improve Wikipedia search by exploiting the text structure of Wikipedia articles. To conduct unbiased user studies I need real-world queries so I can compare the novel algorithms against the currently used ones. Are there any existing query logs that I can use for this purpose?
Thanks for your help!
Felix Engelmann
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB
discovery mailing list discovery@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/discovery