Hi all,
My thoughts on the entry-point definition.
While our long-term plan is to provide 'on-the-fly per-query computation', for now we pre-aggregate every dataset we want to serve and store it in Cassandra to be exposed by RESTBase. This means we can't easily provide aggregation over variable start/end dates.
We could either:
- send every dataset between the start and end dates for a given time granularity level (could be big!), or
- use an entry point such as '/top/{project}/{access}/{year}/{month}/{day}', with the possibility of skipping the 'day' parameter to get the full month.
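To make the two options concrete, here is a rough sketch in Python. The helper names, the zero-padded date segments, and the sample project/access values are only assumptions for illustration; nothing here is a settled API.

```python
from datetime import date, timedelta
from typing import List, Optional


def top_path(project: str, access: str, year: int, month: int,
             day: Optional[int] = None) -> str:
    """Build a '/top/...' path; leaving out `day` asks for the full month.

    Zero-padding of year/month/day is an assumption, not a decided format.
    """
    parts = ["top", project, access, f"{year:04d}", f"{month:02d}"]
    if day is not None:
        parts.append(f"{day:02d}")
    return "/" + "/".join(parts)


def daily_paths(project: str, access: str, start: date, end: date) -> List[str]:
    """First option: one pre-aggregated daily dataset per day in [start, end]."""
    n_days = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(n_days))
    return [top_path(project, access, d.year, d.month, d.day) for d in days]


# Hypothetical project/access values, for illustration only.
month_path = top_path("en.wikipedia", "all-access", 2015, 9)
september = daily_paths("en.wikipedia", "all-access",
                        date(2015, 9, 1), date(2015, 9, 30))
```

With the second option a full month is a single request, whereas the first option needs one dataset per day in the range, which is where the size concern comes from.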
*@Thomas*:
- As Andrew said, the data we have is pre-aggregated at the hour level so far.
- The data is tagged in the UTC timezone, and we planned for requests to use that timezone by default.
- As said in this message, we are thinking of ways to provide better access to the data (on-the-fly computation, lower time granularity and others), and this involves both technical and privacy concerns. That will be for the future :)
Joseph
On Sun, Sep 13, 2015 at 5:39 PM, Andrew Gray andrew.gray@dunelm.org.uk wrote:
On 13 September 2015 at 16:26, Thomas Steiner tomac@google.com wrote:
I mean that somehow I could express getting data in an exact given period of time, say, exactly the day September 11, 2015 in the time zone CET (that day started at 3pm relative to PDT or 11pm relative to UTC). Without time zone support, I would get data “outside” of my desired local time zone. Hope this makes sense and is clear.
A cautious note on time zones...
If you're holding everything in one-hour bins, as we currently do with the aggregated data, then it's easy enough to switch from UTC to CET to EST and so forth.
But not all time zones differ by one-hour increments. Most notably, India is on UTC+5:30, and a handful of other places also differ by 30 minutes from the standard (or in the case of Nepal, 45). I'm not sure you could display these without regenerating the underlying data, which would be a lot of added complexity.
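As a rough illustration of the point above, the sketch below re-bins hourly UTC counts into local calendar days for a fixed UTC offset. The function name and the data layout (a dict keyed by the UTC start of each one-hour bin) are assumptions made for the example, not the actual storage format. Fixed offsets are used instead of named time zones, so daylight saving is ignored.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def rebin_to_local_days(hourly_utc, offset_minutes):
    """Sum one-hour UTC bins into local calendar days for a fixed UTC offset.

    `hourly_utc` maps the timezone-aware UTC start of each bin to a count.
    Raises ValueError for offsets that are not whole hours: local midnight
    would then fall inside a bin, and the bin cannot be split after the fact.
    """
    if offset_minutes % 60 != 0:
        raise ValueError("offset is not a whole number of hours; "
                         "hourly bins cannot be aligned to local days")
    local_tz = timezone(timedelta(minutes=offset_minutes))
    daily = defaultdict(int)
    for bin_start, count in hourly_utc.items():
        daily[bin_start.astimezone(local_tz).date()] += count
    return dict(daily)


# Made-up data: 72 hourly bins with one view each, starting 2015-09-10 00:00 UTC.
first_bin = datetime(2015, 9, 10, tzinfo=timezone.utc)
hourly = {first_bin + timedelta(hours=h): 1 for h in range(72)}

utc_plus_1 = rebin_to_local_days(hourly, 60)  # whole-hour offset (UTC+1): works
try:
    rebin_to_local_days(hourly, 330)          # UTC+5:30: cannot be aligned
    half_hour_ok = True
except ValueError:
    half_hour_ok = False
```

For UTC+1 the local day 2015-09-11 is covered by exactly 24 whole bins, while for UTC+5:30 every local midnight cuts a bin in half, so the counts would have to be regenerated from finer-grained data.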
--
- Andrew Gray andrew.gray@dunelm.org.uk
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics