Hi,
As part of my first assignment, I'll recompute our historical webrequest
dataset, adding client_ip and geocoded information.
While it seems correct to compute the historical client_ip from the
existing ip and x_forwarded_for fields, using the current state of the
MaxMind geocoding data to geocode historical requests is more error-prone,
since the mapping from IP addresses to locations changes over time.
I can either compute it anyway, knowing that there'll be some errors, or
put null values for data older than a given point in time.
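
To make the two options concrete, here is a rough sketch in Python. It is
not the actual recompute script: the trusted proxy range, the cutoff date,
and all function names below are assumptions made up for illustration.

```python
import ipaddress
from datetime import datetime, timezone

# Hypothetical values, for illustration only.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]
GEO_CUTOFF = datetime(2015, 1, 1, tzinfo=timezone.utc)


def is_trusted(ip):
    """Return True if ip parses and falls inside an assumed trusted proxy range."""
    try:
        addr = ipaddress.ip_address(ip.strip())
    except ValueError:
        return False
    return any(addr in net for net in TRUSTED_PROXIES)


def client_ip(ip, x_forwarded_for):
    """Derive client_ip from ip and x_forwarded_for.

    If the request did not come through a trusted proxy, keep ip.
    Otherwise walk the x_forwarded_for chain from right to left and
    return the first hop that is not a trusted proxy.
    """
    if not is_trusted(ip) or not x_forwarded_for or x_forwarded_for == "-":
        return ip
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            return ip  # malformed chain: fall back to the connecting ip
        if not is_trusted(hop):
            return hop
    return ip


def geocoded_or_null(ts, ip, geocode):
    """Second option: geocode only rows newer than the cutoff, else null."""
    return geocode(ip) if ts >= GEO_CUTOFF else None
```

With the first option, geocode(ip) would simply be applied to every row,
whatever its timestamp.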
I'll launch the script to recompute the data as soon as max(a consensus is
found on this matter, operations gives me the rights to run the script) :)
Thanks
--
*Joseph Allemandou*
Data Engineer @ Wikimedia Foundation
IRC: joal