On Mon, Aug 12, 2013 at 6:46 PM, Erik Zachte <ezachte(a)wikimedia.org> wrote:
Some thoughts on this:

We have been discussing adding new geo data for a long time.
I lost track of the current status and latest decisions, but FWIW a year ago
this was the idea for the squid log:

We thought of replacing the IP address with a composite field (using a different
delimiter than the field delimiter).
The field could look like this:

4|hash code|CL||Santiago|-33.5,-70.5
6|hash code|US|CA|San Francisco|37.5,-122.5

Where 4 or 6 indicates the IP version (IPv4 or IPv6).
Hash code is the anonymized IP address.
Country code as used by MaxMind (
http://dev.maxmind.com/geoip/legacy/codes/iso3166/ )
Region/state when available, or else empty string (*)
City name when available, or else empty string (
http://www.maxmind.com/GeoIPCity-534-Location.csv )
Lastly, latitude/longitude follow, rounded on purpose. This gives a
resolution of at best 55 km (about 34 mi), depending on latitude, to
ensure anonymization, particularly for edits. Otherwise a very active editor
in a sparsely populated region of, say, China could easily be matched with
edit timestamps from dumps.

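For illustration only, here is a minimal Python sketch of how a composite field like the one described above might be assembled. Nothing in it comes from the original proposal beyond the field layout: the function names, the salted SHA-256 hash, the 16-character truncation, and the rounding to the nearest 0.5 degree are all assumptions, and the geo values are passed in by hand rather than looked up in a MaxMind database.

```python
import hashlib
import ipaddress

SUBFIELD_SEP = "|"  # assumed sub-delimiter, distinct from the log's field delimiter


def round_half_degree(value):
    """Round a coordinate to the nearest 0.5 degree (assumed rounding rule)."""
    return round(value * 2) / 2


def hash_ip(ip, salt):
    """Hypothetical anonymizer: salted SHA-256, truncated to 16 hex characters."""
    return hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()[:16]


def composite_field(ip, salt, country, region, city, lat, lon):
    """Build 'version|hash|country|region|city|lat,lon' from already-resolved geo data."""
    version = ipaddress.ip_address(ip).version  # 4 or 6
    coords = "{},{}".format(round_half_degree(lat), round_half_degree(lon))
    parts = [str(version), hash_ip(ip, salt), country, region or "", city or "", coords]
    return SUBFIELD_SEP.join(parts)


# Example with a documentation-range IP address and approximate Santiago coordinates.
print(composite_field("192.0.2.10", "secret", "CL", "", "Santiago", -33.45, -70.67))
```

The printed line has the shape `4|<hash>|CL||Santiago|-33.5,-70.5`. Note that a plain salted hash over the IPv4 address space can be brute-forced if the salt leaks, so a real implementation would likely need a keyed or periodically rotated scheme.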
I don't think we should get too hung up on the specific format right now. I
am really not sure whether a composite field is the best implementation, or at
what level we want to geocode. More importantly, I think that two
issues are getting mixed up here: geocoding of readers and geocoding of editors.
It was my understanding that the original request pertained to geocoding of
editors (if that's not the case, my apologies in advance).
@James: can you confirm that we are talking about geocoding of editors?
D
>
> * Caveat:
>
> Supplying region code requires 'external lookup', as MaxMind puts it
> ( http://www.maxmind.com/en/city ).
>
> This is probably a costly operation.
>
> Erik
>
> *From:* analytics-bounces(a)lists.wikimedia.org [mailto:
> analytics-bounces(a)lists.wikimedia.org] *On Behalf Of *James Hare
> *Sent:* Sunday, August 11, 2013 1:55 PM
> *To:* A mailing list for the Analytics Team at WMF and everybody who has
> an interest in Wikipedia and analytics.
> *Subject:* Re: [Analytics] U.S. state-level editor retention data
>
> That will work. Cheers!
>
> On Aug 10, 2013, at 9:21 AM, Toby Negrin wrote:
>
> Hi James,
>
>
> We can take a look at this -- the next step for WikiMetrics is to expand
> the reporting capabilities. The developer with the most context is out
> until Wednesday; we should be able to get back to you by the end of the
> week with an estimate of how difficult it would be to implement this
> change.
>
> Will that work?
>
> -Toby
>
> On Sat, Aug 10, 2013 at 4:07 AM, Wikimedia DC <james.hare(a)wikidc.org>
> wrote:
>
> Greetings,
>
> I am James Hare, president of the Washington, DC chapter. At Wikimania I
> have been learning about the editor retention data the Wikimedia Foundation
> has been collecting and analyzing. I was discussing it with Ryan Kaldari
> and he noted that while the data was available at the national level, it
> was not yet available at the state level.
>
> How difficult would it be to implement state-level analysis? Would it just
> be a matter of changing the geolocation lookup code, or would it be
> a very expensive change that would benefit relatively few people? For
> Wikimedia DC's sake I am interested in data for the District of Columbia,
> Maryland, Delaware, Virginia, and West Virginia (our defined chapter
> region).
>
> Regards,
> James Hare
> _______________________________________________
> Analytics mailing list
> Analytics(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics