Hi,
You may have followed the discussion on Wikimedia-l (and enwiki-l).
Out of mere intellectual curiosity, I would like to know why hashing the IPs
with a varying salt won't work.
Wouldn't that provide a way to obfuscate IP addresses while maintaining
uniqueness (i.e., a given IP always gets hashed to the same hash)?
Tim said in a message on enwiki-l that he has looked into the matter but
hasn't found any satisfying solution.
So what's the problem with salted hashes?
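
To make the question concrete, here is a minimal Python sketch of the kind of scheme I have in mind. The function name `hash_ip`, the choice of SHA-256 and the example salts are my own assumptions for illustration; this is not something Tim or anyone else has proposed.

```python
import hashlib
import ipaddress


def hash_ip(ip: str, salt: bytes) -> str:
    """Return a hex digest standing in for the IP address (hypothetical helper)."""
    # Normalise the textual address to its packed binary form first, so that
    # equivalent spellings of the same address hash identically.
    packed = ipaddress.ip_address(ip).packed
    return hashlib.sha256(salt + packed).hexdigest()


if __name__ == "__main__":
    salt_a = b"example-salt-A"  # made-up salts, for illustration only
    salt_b = b"example-salt-B"
    ip = "192.0.2.1"  # address from the documentation range

    # Same salt, same IP: the identifier is stable, so edits stay linkable.
    print(hash_ip(ip, salt_a) == hash_ip(ip, salt_a))
    # Different salt, same IP: the identifier changes.
    print(hash_ip(ip, salt_a) == hash_ip(ip, salt_b))
```

As far as I understand it, the identifier stays the same only as long as the salt does, so "varying" the salt would have to mean changing it at some interval rather than per edit.
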
Note: I have read something about hashing but I am far from being an
expert, please assume I am the classical layman.
Thanks in advance to anyone who will take the time to explain.
C
---------- Forwarded message ----------
From: "Lila Tretikov" <lila(a)wikimedia.org>
Date: 05/Apr/2015 11:30
Subject: Re: [Wikimedia-l] Announcing: The Wikipedia Prize!
To: "Wikimedia Mailing List" <wikimedia-l(a)lists.wikimedia.org>
Cc:
All,
As Tim mentioned, we are seriously looking at
privacy/identity/security/anonymity issues, specifically as they pertain to
IP address exposure -- from both a legal and a technical standpoint. This won't
happen overnight as we need to get people to work on this and there are a
lot of asks, but this is on our radar.
On a related note, let's skip the sarcasm and treat each other with
straightforward honesty. And for non-English speakers -- who are also (if
not more) in need of this -- sarcasm can be very confusing.
Thanks,
Lila
On Fri, Apr 3, 2015 at 4:02 PM, Cristian Consonni <kikkocristian(a)gmail.com>
wrote:
Hi Brian,
2015-03-30 0:25 GMT+02:00 Brian <reflection(a)gmail.com>:
> Although the initial goal of the Netflix Prize was to design a
> collaborative filtering algorithm, it became notorious when the data was
> used to de-anonymize Netflix users. Researchers proved that given just a
> user's movie ratings on one site, you can plug those ratings into another
> site, such as the IMDB. You can then take that information, and with some
> Google searches and optionally a bit of cash (for websites that sell user
> information, including, in some cases, their SSN) figure out who they are.
> You could even drive up to their house and take a selfie with them, or
> follow them to work and meet their boss and tell them about their views on
> the topics they were editing.
Somewhat tangentially, and to bring this topic back to a more
scientific setting, I would like to point out that there has already
been research on this in the past.
I highly recommend reading the following paper:
Lieberman, Michael D., and Jimmy Lin. "You Are Where You Edit:
Locating Wikipedia Contributors through Edit Histories." ICWSM. 2009.
(PDF: <http://www.pensivepuffin.com/dwmcphd/syllabi/infx598_wi12/papers/wikipedia/…>)
For those of you that don't want to read the whole paper, you can find
a recap of the most relevant findings in this presentation by Maurizio
Napolitano:
<http://www.slideshare.net/napo/social-geography-wikipedia-a-quick-overwiew>
The main idea is to associate spatial coordinates with Wikipedia
articles where possible; these articles are called "geopages". Then you
extract from the article histories the users who have edited a
geopage. If you plot the geopages edited by a given contributor, you
can see that they tend to cluster, so you can define an "edit area".
The study finds that 30-35% of contributors concentrate their edits in
an edit area smaller than 1 deg^2 (~12,362 km^2, approximately the
area of Connecticut or Northern Ireland[1] (thanks, Wikipedia!)).
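
To give a rough feel for the numbers, here is a small Python sketch. It is my own simplification, not the paper's method: the "edit area" is taken as the naive latitude/longitude bounding box of a contributor's geopages (ignoring that longitude degrees shrink away from the equator, and ignoring the antimeridian), and the names `edit_areas` and `km2_per_square_degree`, as well as the sample coordinates, are made up for illustration. The Earth is assumed to be a sphere with a surface of 510 million km^2.

```python
import math
from collections import defaultdict

# Assumption: Earth modelled as a sphere with this total surface area (km^2).
EARTH_SURFACE_KM2 = 510_000_000
# Square degrees on a whole sphere: 4*pi steradians times (180/pi)^2.
SQUARE_DEGREES_ON_SPHERE = 4 * math.pi * (180 / math.pi) ** 2  # about 41,253


def km2_per_square_degree() -> float:
    """Average ground area covered by one square degree, in km^2."""
    return EARTH_SURFACE_KM2 / SQUARE_DEGREES_ON_SPHERE


def edit_areas(edits):
    """Map each contributor to the lat/lon bounding-box area (deg^2) of their edits.

    `edits` is an iterable of (user_id, latitude, longitude) tuples, one per
    edit to a geopage. The user_id can be any stable identifier.
    """
    by_user = defaultdict(list)
    for user, lat, lon in edits:
        by_user[user].append((lat, lon))

    areas = {}
    for user, points in by_user.items():
        lats = [lat for lat, _ in points]
        lons = [lon for _, lon in points]
        areas[user] = (max(lats) - min(lats)) * (max(lons) - min(lons))
    return areas


if __name__ == "__main__":
    sample = [
        ("user-1", 45.0, 9.0),
        ("user-1", 45.5, 9.8),
        ("user-2", 10.0, 10.0),
        ("user-2", 13.0, 12.0),
    ]
    print(round(km2_per_square_degree()))
    for user, area in edit_areas(sample).items():
        print(user, "below 1 deg^2" if area < 1 else "above 1 deg^2")
```

Note that `user_id` only needs to be stable, not meaningful: a hashed IP would work just as well as a username here.
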
For another free/libre project with a geographic focus, like
OpenStreetMap, this is even more marked; check out, for example, the
tool «“Your OSM Heat Map” (aka Where did you contribute?)»[2] by
Pascal Neis.
This, of course, is not a straightforward de-anonymization, but these
methods work in principle for every contributor even if you obfuscate
their IP or username (provided that you can still assign all the edits
from a given user to a single, unambiguous identifier).
C
[1] https://en.wikipedia.org/wiki/Square_degree
[2a] http://yosmhm.neis-one.org/
[2b] http://neis-one.org/2011/08/yosmhm/
_______________________________________________
Wikimedia-l mailing list, guidelines at:
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l(a)lists.wikimedia.org
Unsubscribe:
https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
<mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>