IF proxy_ip does not start with "127.0" or "192.168" or "10." or "169.254":
There are other special/private IP blocks that you'll probably want to filter (172.16.0.0/12, for example, is not covered by the prefixes above). The full lists are in these RFCs:
IPv4: https://tools.ietf.org/html/rfc5735#section-4
IPv6: https://tools.ietf.org/html/rfc5156
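As a rough illustration of filtering by address block rather than by string prefix, here is a minimal Python sketch built on the standard `ipaddress` module. The names `SPECIAL_V4` and `is_special_ipv4` are made up for this example, and the block list is transcribed from the summary table in RFC 5735 section 4, so it should be checked against the RFC before being relied on. IPv6 is not handled here.

```python
import ipaddress

# Special-use IPv4 blocks, transcribed from the RFC 5735 section 4 summary
# table (assumption: verify against the RFC before relying on this list).
SPECIAL_V4 = [
    ipaddress.ip_network(block)
    for block in (
        "0.0.0.0/8", "10.0.0.0/8", "127.0.0.0/8", "169.254.0.0/16",
        "172.16.0.0/12", "192.0.0.0/24", "192.0.2.0/24", "192.88.99.0/24",
        "192.168.0.0/16", "198.18.0.0/15", "198.51.100.0/24",
        "203.0.113.0/24", "224.0.0.0/4", "240.0.0.0/4",
    )
]


def is_special_ipv4(text):
    """Return True if `text` is an IPv4 address inside a special-use block.

    Raises ValueError if `text` is not a valid IPv4 address.
    """
    addr = ipaddress.IPv4Address(text.strip())
    return any(addr in block for block in SPECIAL_V4)
```

Matching on networks also catches addresses that a prefix test misses, such as 127.1.2.3 (loopback, but it does not start with "127.0") and anything in 172.16.0.0/12.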
On Tue, Jan 20, 2015 at 9:03 AM, Ananth RK <ananthrk@ymxdata.com> wrote:
We currently have a method in the Geo UDF that takes the IP address from the remote host header together with the X-Forwarded-For value and tries to identify the originating client IP address using this simple algorithm:
GetClientIP(remote_host, X-Forwarded-For)
    IF X-Forwarded-For value is not valid:
        return remote_host
    ELSE:
        FOR EACH valid IP address, proxy_ip, in the comma-separated X-Forwarded-For value:
            IF proxy_ip does not start with "127.0" or "192.168" or "10." or "169.254":
                return proxy_ip
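For reference, the pseudocode could be written in Python roughly as follows. This is only a sketch and not the actual Geo UDF code: `get_client_ip`, `_is_valid_ip`, and `IGNORED_PREFIXES` are illustrative names, "not valid" is assumed to mean that the header contains no parseable IP address, and the final fallback to `remote_host` is an assumption, since the pseudocode does not say what happens when every entry is filtered out.

```python
import ipaddress

# The four string prefixes from the pseudocode, kept as-is.
IGNORED_PREFIXES = ("127.0", "192.168", "10.", "169.254")


def _is_valid_ip(text):
    """Return True if `text` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def get_client_ip(remote_host, x_forwarded_for):
    # Split the comma-separated header and keep only entries that parse as IPs.
    entries = [part.strip() for part in (x_forwarded_for or "").split(",")]
    valid = [entry for entry in entries if _is_valid_ip(entry)]

    # Assumption: a header with no valid IP address counts as "not valid".
    if not valid:
        return remote_host

    # Return the first entry that does not match an ignored prefix.
    for proxy_ip in valid:
        if not proxy_ip.startswith(IGNORED_PREFIXES):
            return proxy_ip

    # Assumption: the pseudocode has no explicit fallback for this case.
    return remote_host
```

With this version, `get_client_ip("1.2.3.4", "10.0.0.1, 8.8.8.8")` returns `"8.8.8.8"`, while a header such as `"172.16.0.9"` is returned unchanged even though it is a private address, which is the kind of gap the question is about.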
How can this naive algorithm be improved? For example, is it better to maintain a separate list of IP addresses to ignore (currently only four prefixes)? If so, how do we ensure that the list is exhaustive? Are there any other improvements?
Thanks, Ananth
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics