Those logs could have been cleaned up further and re-released, especially
since the privacy issues had an impact only on a "small percentage of
queries".
Frankly, it's a pity that after the initial announcement they had to
quickly retract it.
Nonetheless, Wikimedia could release them for research purposes, asking
interested users to sign an NDA or similar.
I would be very surprised to discover that in 2015 there are no means to
properly anonymize datasets and release them to the public.
best,
Valerio
On Fri, Apr 3, 2015 at 4:38 PM, Aaron Halfaker <aaron.halfaker(a)gmail.com>
wrote:
> Maybe someone managed to grab those logs before they took them offline.
If they did, I hope they won't share. They were taken offline due to
privacy issues.
From the blog post: "We’ve temporarily taken down this data to make
additional improvements to the anonymization protocol related to the search
queries."
-Aaron
On Fri, Apr 3, 2015 at 9:32 AM, Valerio Schiavoni <
valerio.schiavoni(a)gmail.com> wrote:
> There has been at least one attempt to release such data:
>
> http://blog.wikimedia.org/2012/09/19/what-are-readers-looking-for-wikipedia…
>
> Maybe someone managed to grab those logs before they took them offline.
>
> Similar but older logs are available here:
>
> http://www.wikibench.eu/
>
> best,
> valerio
>
>
> On Fri, Apr 3, 2015 at 4:09 PM, Pine W <wiki.pine(a)gmail.com> wrote:
>
>> Hi Oliver,
>>
>> Do we even record search logs? It might be a good idea if we didn't.
>>
>> Pine
>> On Apr 3, 2015 6:16 AM, "Simon Givoli" <givolis(a)gmail.com> wrote:
>>
>>> Thanks Oliver,
>>>
>>>
>>> Sorry if I wasn't clear enough.
>>> My dissertation will involve consented participants. Their search logs
>>> will be recorded while searching Wikipedia. The search logs will then be
>>> analyzed in order to find recurrent search patterns across participants.
>>> Before beginning the experiment, I want to check that I can indeed find
>>> patterns in search logs, using several different algorithms. The idea is to
>>> check these algorithms on Wikipedia search logs already available. Hence my
>>> request.
>>>
>>> Simon
>>> Message: 5
>>> Date: Fri, 3 Apr 2015 09:37:33 +0300
>>> From: Simon Givoli <givolis(a)gmail.com>
>>> To: wiki-research-l(a)lists.wikimedia.org
>>> Subject: [Wiki-research-l] Wikipedia search logs needed
>>>
>>> Hi,
>>>
>>> I'm looking for a dump or db of Wikipedia users' search logs. I would
>>> like it to contain recent data, but it doesn't have to be extensive; even
>>> a small sample would be sufficient. I aim to use this db to test a new
>>> research tool I'm developing for my dissertation.
>>>
>>> Can anyone point me to a relevant source?
>>>
>>> Thanks,
>>> Simon
>>>