IF proxy_ip does not start with "127.0" or "192.168" or "10." or "169.254":

There are other special/private IP blocks that you'll probably want to filter as well. RFC 5735 contains the full list for IPv4:
https://tools.ietf.org/html/rfc5735#section-4
RFC 5156 is the equivalent for IPv6:
https://tools.ietf.org/html/rfc5156
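
As a rough illustration of matching on address blocks rather than string prefixes, here is a minimal Python sketch. The names SPECIAL_NETWORKS and is_special are hypothetical, and the block list is a partial selection that should be checked against the two RFCs rather than treated as exhaustive.

```python
import ipaddress

# Partial, illustrative list of special-purpose blocks. Verify it against
# RFC 5735 section 4 (IPv4) and RFC 5156 (IPv6) before relying on it.
SPECIAL_NETWORKS = [ipaddress.ip_network(block) for block in (
    "0.0.0.0/8",        # "this" network
    "10.0.0.0/8",       # private use
    "127.0.0.0/8",      # loopback
    "169.254.0.0/16",   # link local
    "172.16.0.0/12",    # private use
    "192.0.2.0/24",     # documentation (TEST-NET-1)
    "192.168.0.0/16",   # private use
    "198.18.0.0/15",    # benchmarking
    "198.51.100.0/24",  # documentation (TEST-NET-2)
    "203.0.113.0/24",   # documentation (TEST-NET-3)
    "224.0.0.0/4",      # multicast
    "240.0.0.0/4",      # reserved
    "::1/128",          # IPv6 loopback
    "fe80::/10",        # IPv6 link local
    "fc00::/7",         # IPv6 unique local
    "2001:db8::/32",    # IPv6 documentation
    "ff00::/8",         # IPv6 multicast
)]


def is_special(ip_text):
    """Return True if ip_text falls inside any block in SPECIAL_NETWORKS."""
    address = ipaddress.ip_address(ip_text)  # raises ValueError on bad input
    # Membership is False when the address and network versions differ,
    # so IPv4 and IPv6 blocks can share one list.
    return any(address in network for network in SPECIAL_NETWORKS)
```

A check like this catches addresses that the four string prefixes miss, for example 172.16.0.0/12 and loopback addresses such as 127.1.2.3.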

On Tue, Jan 20, 2015 at 9:03 AM, Ananth RK <ananthrk@ymxdata.com> wrote:
We currently have a method in the Geo UDF that takes the IP address from the remote host header and the X-Forwarded-For value, and attempts to identify the originating client IP address using the following simple algorithm:

GetClientIP(remote_host, X-Forwarded-For)

IF X-Forwarded-For value is not valid:
    return remote_host
ELSE:
    FOR EACH valid IP address, proxy_ip, in the comma-separated X-Forwarded-For value:
        IF proxy_ip does not start with "127.0" or "192.168" or "10." or "169.254":
            return proxy_ip
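
For illustration only, the pseudocode above might look roughly like this in Python. This is a sketch under assumptions, not the Geo UDF's actual code: the name get_client_ip is hypothetical, the standard library's is_global property stands in for the four string prefixes (so it rejects more addresses than the original check does), and falling back to remote_host when no entry qualifies is assumed because the pseudocode does not say what happens in that case.

```python
import ipaddress


def get_client_ip(remote_host, x_forwarded_for):
    """Return the first globally routable X-Forwarded-For entry, else remote_host."""
    if not x_forwarded_for:
        return remote_host
    for token in x_forwarded_for.split(","):
        candidate = token.strip()
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # skip entries that are not valid IP addresses
        # Assumption: is_global replaces the prefix test; it is False for
        # loopback, private, and link-local addresses, among others.
        if address.is_global:
            return candidate
    # Assumption: no usable entry was found, so fall back to the remote host.
    return remote_host
```

For example, get_client_ip("1.2.3.4", "10.0.0.1, 8.8.8.8") skips the private 10.0.0.1 entry and returns "8.8.8.8".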

How can this naive algorithm be improved? For example, is it better to maintain a separate list of IP address prefixes to ignore (currently only four)? If so, how do we ensure that the list is exhaustive? Are there any other improvements we should consider?

Thanks,
Ananth

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics