The first thing I proposed is innocuous (gathering stats on (revision_id, clicked_link)), and in fact can be done easily with a minimum of instrumentation.
The second is very different from the AOL search data. The AOL search data was problematic because it associated data on a per-user basis, so you could use some queries to figure out who the user was, and then see the other queries of the user. I am suggesting here to instead gather anonymous statistics on: (<was on page A>, did a search, <landed on page B>), keeping track only of the (A, B) pairs, without user information.
But the problem is that gathering such anonymous logs takes effort and is difficult to do securely: it is hard to prevent someone from tampering with the logging and adding back information that should not be there. It is also difficult to then present the information to Wikipedia editors in a way that helps them meaningfully improve pages.
So perhaps the first statistic is the only useful one.
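As a rough sketch of how little the first statistic needs, assuming click events arrive as bare (revision_id, clicked_link) tuples: the function name count_link_clicks, the sample data, and the min_count cutoff for dropping rare pairs are all hypothetical and are not part of any existing Wikimedia instrumentation.

```python
from collections import Counter


def count_link_clicks(events, min_count=1):
    """Aggregate click events into (revision_id, clicked_link) counts.

    Each event is a bare (revision_id, clicked_link) tuple. No user, IP,
    or timestamp field is accepted, so none can end up in the output.
    """
    counts = Counter()
    for revision_id, clicked_link in events:
        counts[(revision_id, clicked_link)] += 1
    # Hypothetical safeguard: keep only pairs seen at least min_count times.
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Made-up sample events, for illustration only.
events = [
    (1001, "Category:Physics"),
    (1001, "Quantum mechanics"),
    (1001, "Quantum mechanics"),
    (1002, "Category:Physics"),
]
click_counts = count_link_clicks(events)
frequent_only = count_link_clicks(events, min_count=2)
```

Only the aggregated counts are kept; the per-event stream would be discarded after counting.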
I would also be curious to know, once a visitor arrives, what percentage of subsequent page views come from clicking on links vs. doing a search.
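A minimal sketch of that percentage, assuming each follow-up page view has already been labeled with how the visitor got there: the labels "link" and "search" and the function name navigation_shares are hypothetical, and the sample input is made up.

```python
from collections import Counter


def navigation_shares(next_visit_kinds):
    """Return the percentage of follow-up page views per navigation kind.

    next_visit_kinds is an iterable of labels such as "link" or "search",
    one per follow-up page view, with no user information attached.
    """
    counts = Counter(next_visit_kinds)
    total = sum(counts.values())
    if total == 0:
        return {}  # no follow-up views recorded
    return {kind: 100.0 * n / total for kind, n in counts.items()}


# Made-up sample: three link clicks and one search.
shares = navigation_shares(["link", "link", "search", "link"])
```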
Luca
On Sun, Apr 11, 2010 at 6:28 PM, Anthony wikimail@inbox.org wrote:
On Sun, Apr 11, 2010 at 6:27 PM, Luca de Alfaro luca@dealfaro.org wrote:
I guess that Wiki(pedia|media) could very well gather statistics on
(revision_id, clicked_link)
pairs without compromising the anonymity of the visitors. It would be very useful to have indications on which hyperlinks are most useful. For example, I am always curious whether the large editorial effort to curate categories is worth it. And also, if one had data on:
(revision_id, "search terms used in next search"),
one could infer which links are actually missing.
Seems to me that this (especially the latter) would need to be done extremely carefully to avoid compromising the anonymity of the visitors. Although it's not quite as bad, it seems reminiscent of the AOL search data scandal (http://en.wikipedia.org/wiki/AOL_search_data_scandal), especially with regard to "search terms used in next search".
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l