I am not sure the analysis has to expose any private data at all. You would only show the result of the analysis, which would be integrated over weeks or months, perhaps after filtering out random noise. Would that be a privacy problem?
One of the tricky things is that a visit to a disambiguation or search page is a signal that the referrer, or some earlier page in the user's history, is difficult to connect to some later page. As the number of steps between the pages increases, the problem of detecting the relation grows exponentially. It is also worth noting that by using only click events on the disambiguation page, you will discover only connections that are already present as links on that page.
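To put a rough number on the "exponentially" claim, here is a back-of-the-envelope sketch. The figure of 50 outgoing links per page is purely an assumption for illustration, and `candidate_paths` is a hypothetical helper, not part of any existing tool.

```python
def candidate_paths(links_per_page: int, steps: int) -> int:
    """Upper bound on distinct click paths of the given length,
    assuming every page offers the same number of outgoing links."""
    return links_per_page ** steps


# Assumed branching factor of 50 links per page (illustrative only).
for steps in range(1, 5):
    print(steps, candidate_paths(50, steps))
```

Each extra step multiplies the number of candidate paths by the branching factor, which is why relating a referrer to a page several clicks later gets hard so quickly.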
On Wed, Jul 17, 2013 at 6:49 PM, Jon Robson jdlrobson@gmail.com wrote:
Agreed. As a first step, if someone is interested in this and it doesn't go against our privacy policy, it would be good to collect some link-click data for various disambiguation pages to get an idea of whether the data is meaningful and useful. Tyler's concerns are valid, but we should clarify with some data rather than speculate as to whether these are indeed concerns we need to worry about. In my opinion EventLogging [1] could be used for this, with some simple JavaScript that hijacks links on the disambiguation page and records the referrer and the next page.
In terms of analyzing the data, you could then simply look at a sample of disambiguation pages and manually determine how accurately users pick the correct link.
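A minimal sketch of that manual check, assuming the click data has already been reduced to (referrer, clicked target) pairs per disambiguation page. The function names, the data shapes, and the idea of a hand-built `correct_target` mapping are all assumptions made for illustration.

```python
import random


def sample_pages(pages, k, seed=0):
    """Pick up to k disambiguation pages for manual review (reproducible)."""
    rng = random.Random(seed)
    pool = sorted(pages)
    return rng.sample(pool, min(k, len(pool)))


def click_accuracy(clicks, correct_target):
    """Share of clicks that went to the manually judged correct target.

    clicks: list of (referrer, clicked_target) pairs for one page.
    correct_target: {referrer: target} filled in by a human reviewer.
    Clicks from referrers nobody has judged are ignored.
    """
    judged = [(ref, tgt) for ref, tgt in clicks if ref in correct_target]
    if not judged:
        return None  # nothing to measure against
    right = sum(1 for ref, tgt in judged if correct_target[ref] == tgt)
    return right / len(judged)
```

A high accuracy on the sampled pages would suggest the click data is worth using; a low one would support the concern that users often pick the wrong link.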
If the data does show promise, it would then be an easy enough job to create a UI that uses it and lets editors correct the links.
I don't currently have time to explore this but would like to in the future. If anyone is interested, please dive in...
[1] https://mediawiki.org/wiki/Extension:EventLogging
On Wed, Jul 17, 2013 at 5:14 AM, C. Scott Ananian cananian@wikimedia.org wrote:
Sounds like a disagreement that can be settled quantitatively. ;)
--scott

On Jul 17, 2013 5:03 AM, "Tyler Romeo" tylerromeo@gmail.com wrote:
On Wed, Jul 17, 2013 at 4:42 AM, John Erling Blad jeblad@gmail.com wrote:
It doesn't matter, because the correct behavior will accumulate over time. You don't try to "fix" linkage just because you have one single observed behavior; you collect and correlate behavior over time and use several, perhaps hundreds, of observations.
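As a rough illustration of accumulating observations before acting, here is a sketch that only suggests a link target once enough clicks agree. The event shape (referrer, disambiguation page, clicked target) and both thresholds are assumptions for the example, not something EventLogging defines.

```python
from collections import Counter, defaultdict


def suggest_targets(events, min_observations=50, min_share=0.8):
    """Return {(referrer, disambig_page): target} for well-supported pairs.

    events: iterable of (referrer, disambig_page, clicked_target) tuples.
    A suggestion is made only when the pair has at least min_observations
    clicks and one target receives at least min_share of them.
    """
    counts = defaultdict(Counter)
    for referrer, disambig_page, target in events:
        counts[(referrer, disambig_page)][target] += 1

    suggestions = {}
    for key, targets in counts.items():
        total = sum(targets.values())
        if total < min_observations:
            continue  # too few observations; treat as noise
        target, hits = targets.most_common(1)[0]
        if hits / total >= min_share:
            suggestions[key] = target
    return suggestions
```

Only aggregated counts are kept here, so a single stray click never changes a suggestion on its own.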
I strongly doubt that the correct behavior will be prevalent enough to warrant using such an automatic system over just manually fixing disambiguation links, which can be done quite easily using automatic wiki browsers and the like.
--
Tyler Romeo
Stevens Institute of Technology, Class of 2016
Major in Computer Science
www.whizkidztech.com | tylerromeo@gmail.com
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
-- Jon Robson http://jonrobson.me.uk @rakugojon