Yes, our Hadoop cluster that processes this data for publishing is under heavy load. We are working on it, but we don't yet know when things will improve. Unless something goes badly wrong, we should be able to recover from the lag without losing data.

On Wed, Apr 15, 2015 at 11:49 AM, Antony Jerome <antony.jerome@gmail.com> wrote:
Hi Analytics Team,

First off, thank you for the pagecount-all-sites data that you've been publishing. I've been using it for some visualization experiments and have found it quite helpful.

Prior to yesterday, I'd observed that the pagecount data was generated within about 1.5 to 2 hours (e.g., the timestamp on pagecounts-20150413-000000.gz shows 13-Apr-2015 01:38).
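
For anyone who wants to reproduce this measurement, here is a minimal Python sketch of the arithmetic. It assumes that the timestamp embedded in the filename and the timestamp shown in the directory listing are in the same timezone, and that the lag is simply the difference between the two. The function name publication_lag and the listing date format are illustrative assumptions, not part of any published tooling.

```python
import re
from datetime import datetime, timedelta

# Matches names like pagecounts-20150413-000000.gz (date, then time of day).
FILENAME_RE = re.compile(r"pagecounts-(\d{8})-(\d{6})\.gz$")


def publication_lag(filename: str, listed: str) -> timedelta:
    """Return how long after its nominal timestamp a file appeared.

    `listed` is the timestamp from the directory listing, assumed to look
    like "13-Apr-2015 01:38" and to share a timezone with the filename.
    Note that %b parsing depends on an English-language locale.
    """
    match = FILENAME_RE.search(filename)
    if match is None:
        raise ValueError(f"unrecognized pagecounts filename: {filename!r}")
    nominal = datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")
    published = datetime.strptime(listed, "%d-%b-%Y %H:%M")
    return published - nominal


lag = publication_lag("pagecounts-20150413-000000.gz", "13-Apr-2015 01:38")
print(lag)  # 1:38:00
```

Running the same function over each file in the listing gives the lag series that can then be plotted over time.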

However, this lag appears to have been rising since yesterday, and currently it stands at around 7.5 hours.

http://i.imgur.com/B9C1UgM.png

Would you know what might be causing this? Thanks in advance for your help.

Regards
Antony
