On 22 March 2018 at 13:41, Neil Patel Quinn <nquinn@wikimedia.org> wrote:

Both the edit data and pageview data that you're talking about come from the Hadoop-based Analytics Data Lake. However, because of limitations in the underlying MediaWiki application databases that Hive pulls edit data from, the data requires some complex reconstruction and denormalization that takes several days to a week.

Sorry, I garbled that a little. It's more correct to say: "because of limitations in the underlying MediaWiki application databases that are the source of the edit data, the data requires..."