Does this mean:
1. Same IP, different hashes; 2. Different IPs, same hash; 3. Both?
(I imagine just 1, MD5 isn't /that/ crap at collision resistance, but.)
On 15 September 2015 at 15:50, Dan Andreescu dandreescu@wikimedia.org wrote:
When we process Event Logging events, we hash the origin IP address and add it to the event as part of the "capsule". We salt the hash function and rotate the salt frequently for security, but within each salt period the same IP would always get hashed to the same value, and some people depended on that.
We recently made the Event Logging processor parallel, and we accidentally forgot to make this hashing consistent across all the parallel instances. So from September 10, 2015 until we fix the bug, client IPs will not be hashed consistently.
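To make the failure mode concrete, here is a minimal sketch of how this kind of inconsistency can arise. It is not the actual Event Logging code: the helper name `make_hasher`, the use of HMAC-MD5 as the salted hash, and the 16-byte random salt are all assumptions for illustration. If each parallel instance draws its own salt, the same IP maps to different hashes depending on which instance handled the event (case 1 in the question above); sharing one salt across instances restores consistency.

```python
import hashlib
import hmac
import os


def make_hasher(salt: bytes):
    """Return a function that hashes an IP string with the given salt.

    Hypothetical helper: HMAC-MD5 is an assumption, not the confirmed scheme.
    """
    def hash_ip(ip: str) -> str:
        return hmac.new(salt, ip.encode("utf-8"), hashlib.md5).hexdigest()
    return hash_ip


ip = "192.0.2.1"  # documentation-range address, used only as sample input

# Inconsistent setup: every parallel instance generates its own salt.
instance_a = make_hasher(os.urandom(16))
instance_b = make_hasher(os.urandom(16))
inconsistent = instance_a(ip) != instance_b(ip)

# Consistent setup: one salt is generated once and shared by all instances.
shared_salt = os.urandom(16)
instance_c = make_hasher(shared_salt)
instance_d = make_hasher(shared_salt)
consistent = instance_c(ip) == instance_d(ip)
```

In a sketch like this, any analysis that groups events by hashed IP would see one client split into as many apparent clients as there are instances, until the salt is shared.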
We are tracking this issue here: https://phabricator.wikimedia.org/T112688
If you have some data crunching that's affected by this, come talk to us. We are already adding a temporary fix to the scripts that generate the edit-analysis dashboard [1].
[1] https://edit-analysis.wmflabs.org/compare/
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics