Diederik,

 

Ah, I see where the confusion comes from.

My story is, as I said, about squid logs where views and edits both coexist.

Your focus is on recent changes list.

 

As for publishing: that is why it is important that geo data does not pinpoint an exact location; see my earlier mail.

 

Also, I don't think James wants raw data; he wants aggregates based on these data. Right, James?

 

Erik

 

From: analytics-bounces@lists.wikimedia.org [mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Diederik van Liere
Sent: Tuesday, August 13, 2013 5:34 PM
To: A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics.
Subject: Re: [Analytics] U.S. state-level editor retention data

 

It was my understanding that the original request pertained to geocoding of editors (if that's not the case then my advance apologies).

 

@James: can you confirm that we are talking about geocoding of editors?

D

 

That is correct. Also, if it helps, I don't necessarily need *city*-level information, just state. (For the purposes of this discussion, DC is a state since its stats would not be aggregated with any other state's.)

 

James

 

Hi James,

 

In general, we are very cautious about geocoding editors, particularly at a level more granular than the country, and even more cautious when the data will be published. From a technical point of view, you could already do it for anonymous editors, as their IP addresses are published on the wiki itself and in the XML dump files. For logged-in editors we would have to rely on the RecentChanges table (see http://www.mediawiki.org/wiki/Manual:Recentchanges_table). However, data in this table is only accessible to users with the checkuser permission (http://meta.wikimedia.org/wiki/CheckUser_policy#CheckUser_status). Hence, we cannot use this source to geocode editors. Even if the data were available from a source without such restrictions, we would still be bound by the WMF Privacy Policy and by community expectations regarding the geocoding of IP addresses.

 

I am afraid that we have to reject this request, because we do not collect this data in a publicly available table and because publishing geocoded editor information would violate the WMF Privacy Policy and would not match community expectations regarding the geocoding of IP addresses.

 

Maybe we can continue this discussion to see if we can come up with alternative solutions to your problem?

 

Best,

Diederik