Hi all,
My thoughts and opinion around entry-point definition.
While we have as a long-term plan to provide 'on-the-fly per-query computation', for now we pre-aggregate every dataset we want serve, and store it in cassandra to be exposed by restbase.
It means we can't easily provide variable start/end aggregation easily.
We could either
- send every dataset in between the start and end date for a given time granularity level (could be big !).
- use '/top/{project}/{access}/{year}/{month}/{day}' entrypoint for instance, with possibility to skip the 'day' parameter to have full month.
@Thomas:
- As Andrew said, the data we have is pre-aggregated at hour level so far.
- The data is tagged in UTC timezone and we planned that requests would be using that timezone dy default.
- As said in this message, we are thinking of ways to provide better access to data (on the fly computation, lower time granularity and others), and this involves both technical and privacy concern. It will be for future :)
Joseph