Diederik,
Ah, I see where the confusion comes from.
My story is, as I said, about squid logs where views and edits both coexist.
Your focus is on the recent changes list.
And about publishing: that is why it is important that geo data not be
precise enough to pinpoint an exact location; see my earlier mail.
Also, I don't think James wants raw data; he wants aggregates based on these
data. Right, James?
Erik
From: analytics-bounces(a)lists.wikimedia.org
[mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Diederik van
Liere
Sent: Tuesday, August 13, 2013 5:34 PM
To: A mailing list for the Analytics Team at WMF and everybody who has an
interest in Wikipedia and analytics.
Subject: Re: [Analytics] U.S. state-level editor retention data
It was my understanding that the original request pertained to geocoding of
editors (if that's not the case then my advance apologies).
@James: can you confirm that we are talking about geocoding of editors?
D
That is correct. Also, if it helps, I don't necessarily need *city*-level
information, just state. (For the purposes of this discussion, DC is a state
since its stats would not be aggregated with any other state's.)
James
Hi James,
In general, we are very cautious about geocoding editors, particularly at
a level more granular than the country level, and even more cautious when
the data will be published. From a technical point of view, you could
already do it for anonymous editors, as their IP addresses are published on
the wiki itself and in the XML dump files. For logged-in editors we would
have to rely on the RecentChanges table (see
http://www.mediawiki.org/wiki/Manual:Recentchanges_table). However, data in
this table is only accessible for users with the checkuser permission
(http://meta.wikimedia.org/wiki/CheckUser_policy#CheckUser_status). Hence,
we cannot use this source to geocode editors. Even if the data were available
from a source without such restrictions, we would still face
restrictions from the WMF Privacy Policy and community expectations
regarding the geocoding of IP addresses.
I am afraid that we have to reject this request, because we do not collect
this data in a publicly available table and because publishing geocoded
editor information would violate the WMF Privacy Policy and would not match
community expectations regarding the geocoding of IP addresses.
Maybe we can continue this discussion to see if we can come up with
alternative solutions to your problem?
Best,
Diederik