How would folks feel about a public log of WikiMetrics uses in this form:
2013-09-18 20:01:47 [some username] Edits - Oregon newbies
2013-09-18 19:23:18 [some username] Bytes Added - Oregon2013
We're tracking this info right now without sharing it. I don't feel it's particularly sensitive, and it would give us a good shared understanding of who's using the tool & how. It does expose cohort names, but I'm not sure why that would be an issue. In a way, it seems only fair that if we're putting our users under the microscope, we should also be comfortable with publicly logging what we're doing.
Thanks, Erik
On Tue, Oct 29, 2013 at 7:09 PM, Erik Moeller erik@wikimedia.org wrote:
How would folks feel about a public log of WikiMetrics uses in this form
Yep, good idea. A RecentChanges for Wikimetrics. :-)
On Tue, Oct 29, 2013 at 7:09 PM, Erik Moeller erik@wikimedia.org wrote:
It does expose cohort names, but I'm not sure why that would be an issue.
Exposing cohort names is not a good idea, as that can leak sensitive information about an editor who is part of that cohort. Imagine a cohort name like "Berlin Editathon". This is one of the reasons why we want to introduce user roles in Wikimetrics, so that WMF employees can share cohorts with each other but community members cannot.
Besides that detail, sounds like a fun idea :)
-- Erik Möller VP of Engineering and Product Development, Wikimedia Foundation
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Erik,
I am not very excited about this proposal. I understand the intention, but I feel it conveys the wrong message.
1) Logging requests made via Wikimetrics puts it on par with CheckUser and other tools designed for detecting abuse based on *private data*. CheckUser requests are logged for a reason: to keep a record of who is performing policy-enforcing actions and to detect potential abuse of the tool itself. By publishing a log of Wikimetrics requests we imply that individual Wikimetrics users might be liable/accountable for their use of the tool, which, it's worth remembering, is strictly equivalent to running queries on DB replicas or against the MediaWiki API. Should we then also start to log and publish whoever is querying the Labs DB? Or every registered user visiting Special:Contributions?
2) The flipside of this proposal is that it's trivial to craft controversial log entries. Wikimetrics is accessible to anyone with a valid OAuth/Google account, and by design it allows anyone to create cohorts from arbitrary sets of user IDs. I can impersonate another person by creating a fake email address and uploading a cohort called "suspicious_terrorists_in_itwiki", which will result in a nice log entry and story.
I'm not 100% against the proposal, but I don't see its value, and I want to challenge the principle that Wikimetrics requests should be considered *transactions*. If anything, I'd like to have LCA's take on this.
Dario
Heh. I sense that a proposal that doesn't have strong backing from any of the Big Ds is not going to progress very far. I'm holding out hope that Dan will weigh in. ;-) In seriousness, here's why I think it's worth doing:
Making use of a tool that's designed to inform decisions across the movement with data is awesome, and we want to see positive feedback loops where more uses of the tool encourage ... more uses of the tool. When folks see WLM using WikiMetrics, they'll want to use it for their event, or their next online campaign. It could help us create a greater sense of seriousness about data. It's not so much about accountability (CheckUser-style) as about visibility (RecentChanges-style), which can increase adoption.
With regard to legal issues, I'm not convinced that exposing a simple cohort name like "Berlin Editathon" triggers the kinds of issues we've discussed with Luis, but we should certainly get signoff if we do this. The privacy issues potentially come into play when we disclose cohort _membership_, but cohort names should be pretty low-risk. What's the kind of exposure you're worried about?
With regard to abuse, wouldn't a simple "block this user and flag these log entries as hidden" feature take care of that? And it seems to me we'd like to have the blocking capability anyway to deal with abuse.
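To make the "block this user and flag these log entries as hidden" idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: `PublicLog`, `LogEntry`, `record`, and `block_and_hide` are illustrative names rather than anything in Wikimetrics, and it assumes that a blocked user's later requests simply never reach the log.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class LogEntry:
    """One public log line (hypothetical structure)."""
    user: str
    text: str
    hidden: bool = False


@dataclass
class PublicLog:
    """In-memory stand-in for the proposed public log."""
    entries: List[LogEntry] = field(default_factory=list)
    blocked_users: Set[str] = field(default_factory=set)

    def record(self, user: str, text: str) -> bool:
        """Append an entry unless the user is blocked; return whether it was logged."""
        if user in self.blocked_users:
            return False
        self.entries.append(LogEntry(user, text))
        return True

    def block_and_hide(self, user: str) -> None:
        """Block the user and flag all of their existing entries as hidden."""
        self.blocked_users.add(user)
        for entry in self.entries:
            if entry.user == user:
                entry.hidden = True

    def visible(self) -> List[str]:
        """Return the text of entries that are not flagged as hidden."""
        return [entry.text for entry in self.entries if not entry.hidden]
```

Hidden entries stay in storage and are only filtered out of the public view, so a flag could be reversed if it was applied by mistake.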
Anyway, I don't mean to make a huge deal of it - I think it'd be a nice touch, and help encourage use of the tool by creating greater visibility for the community.
I'm holding out hope that Dan will weigh in. ;-)
Happy to :) I agree it's a good idea - because of a potential positive feedback loop. I think there are fairly simple technical solutions to most concerns.
1. Dario's concern that this log will lead people to believe they're being tracked, watched, check-usered, etc.
Solution: we make it opt-out with a simple flag set on a user preferences page. Or we allow people three choices:
* don't log my research
* log my research but don't show my email address
* log my research and show my email address
2. Diederik's concern that this might disclose sensitive information. I agree, btw, that this should get sign-off from legal.
Solution: we already have a "public" flag on cohorts, but it's hidden. We surface the flag in the UI. If a cohort is not public, we redact its name in the log.
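As a rough illustration of how the two solutions above could combine, here is a minimal Python sketch. All names are assumptions made for the example: `LogPreference`, `Report`, and `render_log_entry` do not exist in Wikimetrics, the `cohort_public` field merely stands in for the existing "public" cohort flag, and the output format simply copies the sample log lines from the start of the thread.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class LogPreference(Enum):
    """Hypothetical per-user setting mirroring the three choices."""
    DONT_LOG = "dont_log"
    LOG_WITHOUT_EMAIL = "log_without_email"
    LOG_WITH_EMAIL = "log_with_email"


@dataclass
class Report:
    """One metric run; field names are assumptions, not the real schema."""
    ran_at: datetime
    user_email: str
    metric: str
    cohort_name: str
    cohort_public: bool  # stands in for the existing "public" cohort flag
    preference: LogPreference


def render_log_entry(report: Report) -> Optional[str]:
    """Return a public log line, or None if the user opted out of logging."""
    if report.preference is LogPreference.DONT_LOG:
        return None
    # Show the email address only if the user explicitly chose that.
    if report.preference is LogPreference.LOG_WITH_EMAIL:
        user = report.user_email
    else:
        user = "[hidden]"
    # Redact the cohort name unless the cohort is flagged public.
    cohort = report.cohort_name if report.cohort_public else "[private cohort]"
    return f"{report.ran_at:%Y-%m-%d %H:%M:%S} {user} {report.metric} - {cohort}"
```

The two checks are independent, so a user can show their address while running a metric on a private cohort, and the cohort name is still redacted.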
I don't mean to make this sound simple, but it's not hard. I would also like to say that the Real Big Ds probably have other things in mind like the huge and growing backlog of very important other things we have to do.
little d
Dario Taraborelli, 30/10/2013 03:55:
I can impersonate another person by creating a fake email address and uploading a cohort called "suspicious_terrorists_in_itwiki", which will result in a nice log entry and story.
For what it's worth, you already can (as in, are able to) do this on wiki: just create a subpage of your userpage with a so-called "dossier" (it happens now and then). The log would be like one such page of which you can only see the title and not the content.
Nemo