Hi again,
This bug caused yet another delay in EventLogging data in Hive, and I
thought I should let you know about it. A few hours of data from December
17th were not refined until today. If you have any non-Oozie jobs built on this data,
you may need to check up on them in case they need to be rerun.
More info here:
https://phabricator.wikimedia.org/T213602. Sorry for the
inconvenience.
-Andrew Otto
Systems Engineer, WMF
On Fri, Dec 14, 2018 at 3:48 PM Andrew Otto <otto(a)wikimedia.org> wrote:
Hi all,
A bug in the code that imports EventLogging data into Hive caused 3
top-level EventCapsule <https://meta.wikimedia.org/wiki/Schema:EventCapsule>
fields to be set to NULL in all Hive EventLogging tables
since 2018-11-29T17:00:00. The affected fields were recvFrom, seqId, and
(more importantly) userAgent.
We've fixed the bug and are backfilling the data now.
https://phabricator.wikimedia.org/T211833 has more info.
Sorry for the inconvenience! Follow the Phabricator ticket to get updates
on when backfilling has completed.
-Andrew Otto
Systems Engineer, WMF