Does this mean:
1. Same IP, different hashes;
2. Different IPs, same hash;
3. Both?
(I imagine just 1, MD5 isn't /that/ crap at collision resistance, but.)
On 15 September 2015 at 15:50, Dan Andreescu <dandreescu(a)wikimedia.org> wrote:
When we process Event Logging events, we hash the origin IP address and add
it to the event as part of the "capsule". We salt the hash function and
rotate the salt frequently for security, but within those periods of time
the same IP would get hashed to the same hash, and some people depended on
that.
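
To make the property described above concrete, here is a minimal sketch of salted IP hashing with a rotating salt. It is not the actual Event Logging code: the function name, the salt values, and the choice of HMAC-SHA256 are all assumptions made for illustration.

```python
import hashlib
import hmac


def hash_ip(ip: str, salt: bytes) -> str:
    """Return a keyed hash of the IP address.

    The same (ip, salt) pair always yields the same digest, so hashes are
    comparable only while the salt stays unchanged.
    """
    return hmac.new(salt, ip.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical salts for two consecutive rotation periods.
salt_this_period = b"example-salt-A"
salt_next_period = b"example-salt-B"

same_period_1 = hash_ip("192.0.2.1", salt_this_period)
same_period_2 = hash_ip("192.0.2.1", salt_this_period)
next_period = hash_ip("192.0.2.1", salt_next_period)

print(same_period_1 == same_period_2)  # same salt, same IP
print(same_period_1 == next_period)    # salt rotated, same IP
```

Within one rotation period the two digests match, which is what downstream analyses relied on; once the salt rotates, the same IP maps to an unrelated digest.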
We recently made the Event Logging processor parallel, and we accidentally
forgot to make this hashing consistent across all the parallel instances.
So from September 10, 2015 until we fix the bug, client IPs will not be
hashed consistently.
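
One way such an inconsistency could arise, purely as an assumption (the message does not describe the actual cause), is each parallel instance holding its own salt. The sketch below uses a hypothetical `Worker` class to contrast that with instances that share a single salt.

```python
import hashlib
import hmac
import os


class Worker:
    """Hypothetical processor instance that hashes IPs with its own salt."""

    def __init__(self, salt=None):
        # Without an explicit salt, each instance generates a private random one.
        self.salt = salt if salt is not None else os.urandom(16)

    def hash_ip(self, ip: str) -> str:
        return hmac.new(self.salt, ip.encode("utf-8"), hashlib.sha256).hexdigest()


ip = "192.0.2.1"

# Each worker picks its own salt: the same IP gets a different hash per worker.
independent = [Worker() for _ in range(3)]
inconsistent = {w.hash_ip(ip) for w in independent}

# All workers share one salt: the same IP gets one hash regardless of worker.
shared_salt = os.urandom(16)
shared = [Worker(shared_salt) for _ in range(3)]
consistent = {w.hash_ip(ip) for w in shared}

print(len(inconsistent), len(consistent))
```

Under this assumption the symptom would be the same IP producing different hashes depending on which instance handled the event, rather than different IPs colliding on one hash.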
We are tracking this issue here:
https://phabricator.wikimedia.org/T112688
If you have some data crunching that's affected by this, come talk to us.
We are already adding a temporary fix to the scripts that generate the
edit-analysis dashboard [1].
[1] https://edit-analysis.wmflabs.org/compare/
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Oliver Keyes
Count Logula
Wikimedia Foundation