Really good summary of the situation, Neil. I'm bookmarking this and will reuse it when people ask :)

On Thu, Mar 22, 2018 at 7:07 AM, Neil Patel Quinn <nquinn@wikimedia.org> wrote:
On 22 March 2018 at 13:41, Neil Patel Quinn <nquinn@wikimedia.org> wrote:

Both the edit data and pageview data that you're talking about come from the Hadoop-based Analytics Data Lake. However, because of limitations in the underlying MediaWiki application databases that Hive pulls edit data from, the data requires some complex reconstruction and denormalization that takes several days to a week.


Sorry, I garbled that a little. It's more correct to say: "because of limitations in the underlying MediaWiki application databases that are the source of the edit data, the data requires..."
