We've deployed the change to bucketing, but we are still seeing the same issue in the collected data.
Again, we generate a unique 64-bit random number when the user gets to the page, yet we are seeing the same 64-bit number reported by multiple IP addresses.
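Since deploying the new schema number with the updated bucket selection, we have seen 13 distinct tokens coming from 42 distinct IP addresses. This shouldn't be possible: each token is generated for a single page view, so it should only be reported from one IP address.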
mysql:research@analytics-store.eqiad.wmnet [log]> select count(distinct clientIp) from CompletionSugges
+--------------------------+
| count(distinct clientIp) |
+--------------------------+
+--------------------------+
mysql:research@analytics-store.eqiad.wmnet [log]> select count(distinct event_pageViewToken) from CompletionSuggestions_13630018;
+-------------------------------------+
| count(distinct event_pageViewToken) |
+-------------------------------------+
+-------------------------------------+
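To see which tokens are affected rather than just the totals, a per-token breakdown may help. This is only a sketch: it assumes the same table and column names as the queries above (CompletionSuggestions_13630018, event_pageViewToken, clientIp) and has not been run against the live data.

```sql
-- List each page view token that was reported by more than one IP address,
-- along with how many distinct IPs reported it.
-- Table and column names are taken from the queries above.
SELECT event_pageViewToken,
       COUNT(DISTINCT clientIp) AS ip_count
FROM CompletionSuggestions_13630018
GROUP BY event_pageViewToken
HAVING COUNT(DISTINCT clientIp) > 1
ORDER BY ip_count DESC;
```

If collection were working as expected, this should return no rows.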
My best guess at this point is that something has changed in the way the clientIp values are collected, and that the recorded values are incorrect.