Forwarding to the analytics list for reference.
---------- Forwarded message ---------
From: Ho Chung <chungho4865(a)gmail.com>
Date: Mon, Mar 15, 2021 at 11:45 AM
Subject: Re: [Analytics] About: refine_webrequest.hql
To: Joseph Allemandou <jallemandou(a)wikimedia.org>
Hello,
Thanks for your reply.
I asked because I was researching your Analytics team's public discussion
history and the Wikitech pages about the webrequest timestamp:
https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest
https://phabricator.wikimedia.org/T212529
I was in doubt at the time: you used Java technology, but your Hive
version did not support Java before October 2018.
The wmf.webrequest table is located in Hive.
My question was whether, when readers' private data is collected, the
timestamp comes from the reader's computer system clock rather than the
Wikipedia server clock at the moment the page is read and browsed.
Now it is clearer to me. On your team's public discussion page, Ottomata
said that all times are in UTC.
It's just that you technicians don't unify the way the timestamp format is
expressed, but in fact all of them use UTC.
On Mon, Mar 15, 2021 at 16:14, Joseph Allemandou <jallemandou(a)wikimedia.org> wrote:
Hi,
the `dt` field is the time in UTC (no timezone specified) at which the
request ends being processed by Varnish.
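
As a rough illustration of the point above (a sketch, not code from the refinery repository): a `dt` string with no zone marker can be read as UTC by attaching the zone explicitly after parsing. The helper names and the sample value are hypothetical, and the Python format string is assumed to match the Hive pattern "yyyy-MM-dd'T'HH:mm:ss".

```python
from datetime import datetime, timezone

# Assumed Python equivalent of the Hive pattern "yyyy-MM-dd'T'HH:mm:ss".
DT_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_dt_as_utc(dt: str) -> datetime:
    """Parse a zone-less `dt` string and label it explicitly as UTC."""
    naive = datetime.strptime(dt, DT_FORMAT)  # no timezone information yet
    return naive.replace(tzinfo=timezone.utc)  # attach UTC, do not shift the clock time


def to_epoch_seconds(dt: str) -> int:
    """Seconds since 1970-01-01T00:00:00 UTC for a zone-less `dt` string."""
    return int(parse_dt_as_utc(dt).timestamp())


if __name__ == "__main__":
    sample = "2021-03-15T08:36:00"  # hypothetical value, not taken from real data
    print(parse_dt_as_utc(sample).isoformat())
    print(to_epoch_seconds(sample))
```

Because the zone is attached explicitly, the result does not depend on the clock or locale settings of the machine that runs the code.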
Cheers
Joseph
On Mon, Mar 15, 2021 at 8:36 AM Luca Toscano <ltoscano(a)wikimedia.org>
wrote:
+A mailing list for the Analytics Team at WMF and everybody who has an
interest in Wikipedia and analytics. <analytics(a)lists.wikimedia.org>
Hi!
I added the Analytics mailing list in Cc so other people can chime in,
this is the canonical way to follow up with us and the community, please
avoid direct email if possible :)
Thanks!
Luca
On Sat, Mar 13, 2021 at 10:57 PM Ho Chung <chungho4865(a)gmail.com> wrote:
Hello,
I have some questions about refine_webrequest.hql.
Does the timestamp in this file use UTC?
Does this file connect wmf_raw.webrequest and wmf.webrequest?
I ask because I can't see anything in the code that adds Z or a +/- time
zone offset:
-- Hack to get a correct timestamp because of hive inconsistent conversion
CAST(unix_timestamp(dt, "yyyy-MM-dd'T'HH:mm:ss") * 1.0 as timestamp) as ts,
https://github.com/wikimedia/analytics-refinery/blob/master/oozie/webreques…
I emailed the wiki legal team about this over 3 months and they are not
sure. Can you give me a clear answer?
If it does not use UTC, does it use your server clock or my computer clock?
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Joseph Allemandou (joal) (he / him)
Staff Data Engineer
Wikimedia Foundation