but maybe browser and preferences fingerprinting would be more effective anyway, since: tor.
Probably not as effective as straight-up blocking Tor as we do now? :P (Although seriously, I would love it if we didn't block Tor like we do now. However, you can't abuse the site with Tor when you can't use Tor at all.)
I'm somewhat doubtful about fingerprinting (without having done any research on it, so I may be out of tune here). We have millions of users, mostly using commodity software. I'm doubtful we would be able to get a fingerprint specific enough to uniquely identify a single user. Not to mention that a sophisticated attacker would probably be able to easily modify their fingerprint, especially if the fingerprint criteria are open source [OTOH, a sophisticated attacker can get around an IP block too].
The cryptolog approach: this has the property that there's a specific time at which all anon identifiers suddenly change (e.g. midnight every day in the setup cryptolog uses). Having an arbitrary point in time where identifiers suddenly shift is probably an unwanted property. (Although maybe it doesn't matter that much in practice? Someone who actually deals with abuse on wiki would be better able to answer that.)
I suppose a related approach could be something like:
* If this is the first time the IP edits (recently), make a (pseudo?) random salt for that IP and throw it in memcached with an expiry time of a week
* Hash the IP with the salt
* Next time the IP edits, if the salt can be accessed from memcached, use that, and update the expiry time so that it expires a week from this edit; otherwise start over with a new salt
This would have the property that if an IP is continuously editing, its identifier doesn't change, but if it stops editing for a week, the identifier switches. It still has the downside that in order to effectively make a range block, someone would have to have checkuser rights (although perhaps one could make a checkuser-lite right that just exposes the IPs of anons, which normal admins get access to). Also, it would be much harder for admins to notice patterns, such as a specific subnet dealing out similar abuse, or a specific IP having been blocked once a month for the last 2 years.
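For concreteness, the sliding-expiry salt idea above could be sketched roughly as follows. This is only an illustration: `ExpiringStore` is a hypothetical in-memory stand-in for memcached, and `anon_identifier`, the SHA-256 hash, and the 16-character truncation are all assumptions, not anything MediaWiki does.

```python
import hashlib
import secrets
import time

SALT_TTL = 7 * 24 * 3600  # one week, as in the steps above


class ExpiringStore:
    """Tiny in-memory stand-in for memcached (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}

    def get(self, key, now):
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            # Missing or expired: behave like a cache miss.
            self._data.pop(key, None)
            return None
        return entry[0]

    def set(self, key, value, ttl, now):
        self._data[key] = (value, now + ttl)


def anon_identifier(store, ip, now=None):
    """Return a pseudonymous identifier for `ip` and refresh its salt's expiry."""
    now = time.time() if now is None else now
    salt = store.get(ip, now)
    if salt is None:
        # First (recent) edit from this IP: start over with a fresh salt.
        salt = secrets.token_bytes(16)
    # Re-store the salt so it expires a week after *this* edit.
    store.set(ip, salt, SALT_TTL, now)
    return hashlib.sha256(salt + ip.encode("ascii")).hexdigest()[:16]
```

Note that in this sketch the cache is keyed by the raw IP, so the cache itself holds plain IPs for up to a week after the last edit.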
--bawolff
On 7/29/14, Adam Wight awight@wikimedia.org wrote:
++the EFF for more ideas, they are actively doing great work on so-called perfect forward secrecy.
There are simple things we could do to achieve a better balance between privacy and sockpantsing, such as cryptolog [1], in which IP addresses are hashed using a salt that changes every day. In theory, nobody can reverse the function to reveal the IP, but you can still correlate all of an address's edits for the day, week, or whatever, making CheckUser possible.
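As a rough sketch of the daily-salt idea (not cryptolog's actual code), something like the following shows how identifiers stay stable within a day and change when the salt rotates. The class name `DailySaltHasher`, the use of HMAC-SHA256, and the 16-character truncation are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
from datetime import date


class DailySaltHasher:
    """Hash IPs with a random salt that is replaced whenever the day changes."""

    def __init__(self):
        self._day = None
        self._salt = None

    def _salt_for(self, day):
        if day != self._day:
            # New day: forget the old salt entirely and generate a fresh one.
            self._day = day
            self._salt = secrets.token_bytes(32)
        return self._salt

    def identifier(self, ip, day=None):
        day = date.today() if day is None else day
        salt = self._salt_for(day)
        return hmac.new(salt, ip.encode("ascii"), hashlib.sha256).hexdigest()[:16]
```

Because the old salt is kept only in memory and overwritten at rotation, yesterday's identifiers cannot be recomputed from an IP afterwards, assuming the salt was never written anywhere else.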
IP range blocking obviously needs to happen up-front, before the IP is mangled. I have no suggestions, but maybe browser and preferences fingerprinting would be more effective anyway, since: tor.
-Adam
[1] https://git.eff.org/?p=cryptolog.git;a=summary
On Fri, Jul 11, 2014 at 8:45 AM, Chris Steipp csteipp@wikimedia.org wrote:
On Friday, July 11, 2014, Daniel Kinzler daniel@brightbyte.de wrote:
On 11.07.2014 17:19, Tyler Romeo wrote:
Most likely, we would encrypt the IP with AES or something using a configuration-based secret key. That way checkusers can still reverse the hash back into normal IP addresses without having to store the mapping in the database.
There are two problems with this, I think.
- No forward secrecy. If that key is ever leaked, all IPs become "plain". And it will be, sooner or later. This would probably not be obvious, so this feature would instill a false sense of security.
This is probably the biggest issue. Even if we HMAC it, it's trivial to brute-force the entire IPv4 range (and, with intelligent assumptions about generation, most of the IPv6 range) in seconds, if the key was ever known.
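To illustrate the brute-force concern, here is a small sketch that recovers an address from its keyed hash once the key is known, by enumerating candidates. It scans only a single network to stay quick; the function names, the HMAC-SHA256 choice, and the example key are all assumptions for illustration.

```python
import hashlib
import hmac
import ipaddress


def hmac_ip(key, ip):
    """Keyed hash of an IP address string (HMAC-SHA256, hex-encoded)."""
    return hmac.new(key, ip.encode("ascii"), hashlib.sha256).hexdigest()


def recover_ip(key, target_digest, network):
    """Try every address in `network` until one hashes to `target_digest`."""
    for addr in ipaddress.ip_network(network):
        if hmac.compare_digest(hmac_ip(key, str(addr)), target_digest):
            return str(addr)
    return None


# Hypothetical leaked key and a stored identifier for an example address.
leaked_key = b"example-leaked-key"
stored = hmac_ip(leaked_key, "198.51.100.77")
recovered = recover_ip(leaked_key, stored, "198.51.100.0/24")
```

The same loop over all of IPv4 is about 4.3 billion hash computations, which is the enumeration the paragraph above is worried about.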
- No range blocks. It's often quite useful to be able to block a range of IPs. This is an important tool in the fight against spammers; taking it away would be a problem.
Range blocks, I imagine, would continue working the same way they do. Someone would have to identify the correct range (which is very difficult when administrators can't see IPs), but on submission, we have the IP address to check against the blocks. (Unless someone proposes to store block ranges as hashes, which would definitely get rid of range blocks.)
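A minimal sketch of that submission-time check, assuming block ranges are stored in plain CIDR form and the raw IP is only seen at request time. The function name `is_blocked` and the example ranges are hypothetical.

```python
import ipaddress


def is_blocked(ip, blocked_ranges):
    """Check the submitting IP against CIDR block ranges before it is hashed."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in blocked_ranges)


# Example block list using documentation-reserved ranges.
blocks = ["203.0.113.0/24", "2001:db8::/32"]
```

The check needs only the block list and the incoming address, so it can run before any hashing or encryption is applied to the IP that gets stored.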
-- daniel
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l