On 25/02/16 21:14, Daniel Berger wrote:
Nuria, thank you for pointing out that exporting a save flag for each request will be complicated. I wasn't aware of that.
It would be very interesting to learn how the previous data set's save flag was exported back in 2007.
As I suspected in my offlist post, the save flag was set using the HTTP response code. Here are the files as they were when they were first committed to version control in 2012. I think they were the same in 2007 except for the IP address filter:
vu.awk:
function savemark(url, code) {
    if (url ~ /action=submit$/ && code == "TCP_MISS/302")
        return "save"
    return "-"
}

$5 !~ /^(145.97.39.|66.230.200.|211.115.107.)/ {
    print $3, $9, savemark($9, $6)
}
urjc.awk:
function savemark(url, code) {
    if (url ~ /action=submit$/ && code == "TCP_MISS/302")
        return "save"
    return "-"
}

$5 !~ /^(145.97.39.|66.230.200.|211.115.107.)/ {
    print $3, $9, savemark($9, $6), $4, $8
}
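
For anyone who does not read awk, the logic of the two scripts can be sketched in Python. This is only an illustrative port, not part of the original files. The names filter_line and urjc are hypothetical, and the field meanings given in the comments (timestamp, client IP, and so on) are assumptions inferred from how the scripts use the fields, not something the scripts themselves state. The IP pattern is copied verbatim, including its unescaped dots.

```python
import re

# Copied verbatim from the awk filter. The dots are unescaped there too,
# so each one matches any character, not only a literal ".".
EXCLUDED_IP = re.compile(r"^(145.97.39.|66.230.200.|211.115.107.)")


def savemark(url, code):
    """Return "save" for a submit URL answered with TCP_MISS/302, else "-"."""
    if url.endswith("action=submit") and code == "TCP_MISS/302":
        return "save"
    return "-"


def filter_line(line, urjc=False):
    """Port of vu.awk (default) or urjc.awk (urjc=True) for one log line.

    Returns the output line, or None if the line is filtered out.
    """
    fields = line.split()
    # awk treats missing fields as empty strings, so pad to nine fields.
    fields += [""] * (9 - len(fields))
    # Assumed meanings: $3 timestamp, $4 service time, $5 client IP,
    # $6 cache result/HTTP status, $8 method, $9 URL.
    f3, f4, f5, f6, f8, f9 = (fields[i] for i in (2, 3, 4, 5, 7, 8))
    if EXCLUDED_IP.match(f5):
        return None
    out = [f3, f9, savemark(f9, f6)]
    if urjc:
        out += [f4, f8]
    return " ".join(out)
```

The point to note is that the save flag is derived purely from the URL suffix and the response code, so a request is marked as a save only when an action=submit URL is answered with a 302 redirect that was not served from cache.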
-- Tim Starling