A short post-mortem of the problem with missing notification data. tl;dr: the drop is not a real one and is being fixed.
As some of you noticed, we had a sudden drop in Echo notification counts across all wikis for some categories of events as of November 7 [1]. The drop was only in the EventLogging data, not in the actual delivery of notifications on site or by mail.
Here’s what happened:
* a change was made to Echo’s instrumentation to log the revid associated with a notification [2]
* when the change got merged, the schema ID was not updated in the config file
* as a result, events started getting validated against a stale schema, which caused most types of notifications (anything except for page-linked and welcome) to fail validation
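To make the failure mode concrete, here is a minimal sketch of how a new field gets rejected by a stale schema that forbids additional properties. The two schemas and the validate() helper are simplified, hypothetical stand-ins, not the real Schema:Echo revisions or the validator EventLogging uses.

```python
# Hypothetical, simplified schemas; the real Schema:Echo has more fields.
STALE_SCHEMA = {
    "properties": {"notificationType": str, "deliveryMethod": str},
    "additionalProperties": False,
}
CURRENT_SCHEMA = {
    "properties": {
        "notificationType": str,
        "deliveryMethod": str,
        "revisionId": int,
    },
    "additionalProperties": False,
}


def validate(event, schema):
    """Return a list of validation errors (empty if the event is valid)."""
    errors = []
    allowed = schema["properties"]
    for name, value in event.items():
        if name not in allowed:
            # Unknown field: only an error when the schema forbids extras.
            if not schema["additionalProperties"]:
                errors.append(
                    "Additional properties are not allowed "
                    "(%r was unexpected)" % name
                )
        elif not isinstance(value, allowed[name]):
            errors.append("%r has the wrong type" % name)
    return errors


# An event carrying the newly added field (values are made up).
event = {
    "notificationType": "reverted",
    "deliveryMethod": "web",
    "revisionId": 12345,
}
# An event without the new field (values are made up).
event_without_revid = {"notificationType": "welcome", "deliveryMethod": "web"}

stale_errors = validate(event, STALE_SCHEMA)
current_errors = validate(event, CURRENT_SCHEMA)
unaffected_errors = validate(event_without_revid, STALE_SCHEMA)

print(stale_errors)
print(current_errors)
print(unaffected_errors)
```

In this sketch an event that does not carry the new field still validates against the stale schema, which would be consistent with some notification types being unaffected.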
These are the next steps we’re considering:
* Benny wrote a patch to bump the schema ID [3]. It hasn’t been pushed to production yet, but once it is, we will have valid data in the MySQL log DB again (in a table called Echo_6081131).
* We have a copy of the raw events stored as JSON on stat1; we may want to restore this data and import it into the DB.
* After the new log is in production, we will update the scripts generating the data dumps to query a union of tables (across multiple schema IDs). This is one of the potential limitations of using SQL as a store from which to extract EventLogging data to generate dashboards.
* We will need to figure out whether passing additional fields that are not required and not specified in a schema should invalidate an event, or whether we should relax the validation criteria.
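As a rough illustration of the "union of tables" step, the sketch below counts notifications per type across two per-revision tables. It uses an in-memory SQLite database as a stand-in for the MySQL log DB. Echo_6081131 is the table named above; the old table name (Echo_0000001), the column names and the rows are hypothetical placeholders.

```python
import sqlite3

OLD_TABLE = "Echo_0000001"  # hypothetical placeholder for the previous revision
NEW_TABLE = "Echo_6081131"  # table name mentioned in the post-mortem

conn = sqlite3.connect(":memory:")

# Hypothetical column layout; the new revision adds a revision ID column.
conn.execute(
    "CREATE TABLE %s (timestamp TEXT, event_notificationType TEXT)" % OLD_TABLE
)
conn.execute(
    "CREATE TABLE %s (timestamp TEXT, event_notificationType TEXT, "
    "event_revisionId INTEGER)" % NEW_TABLE
)

# Made-up sample rows.
conn.executemany(
    "INSERT INTO %s VALUES (?, ?)" % OLD_TABLE,
    [("20131106", "reverted"), ("20131106", "mention")],
)
conn.executemany(
    "INSERT INTO %s VALUES (?, ?, ?)" % NEW_TABLE,
    [("20131122", "reverted", 1)],
)

# Select only the columns both revisions share, then aggregate over the union.
query = """
SELECT event_notificationType, COUNT(*)
FROM (
    SELECT event_notificationType FROM {old}
    UNION ALL
    SELECT event_notificationType FROM {new}
) AS all_events
GROUP BY event_notificationType
ORDER BY event_notificationType
""".format(old=OLD_TABLE, new=NEW_TABLE)

counts = conn.execute(query).fetchall()
print(counts)
```

UNION ALL is used rather than UNION so that identical rows from the two tables are not collapsed before counting.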
Dario
[1] http://ee-dashboard.wmflabs.org/graphs/frwiki_echo_category
[2] https://bugzilla.wikimedia.org/show_bug.cgi?id=46045
[3] https://gerrit.wikimedia.org/r/#/c/96901/
On Nov 21, 2013, at 2:51 PM, Dario Taraborelli dtaraborelli@wikimedia.org wrote:
yep, the majority of events are failing to validate against Schema:Echo since Nov 7. I’ll send a summary of our findings in a moment.
On Nov 21, 2013, at 2:00 PM, Ori Livneh ori@wikimedia.org wrote:
On Nov 21, 2013, at 12:50 PM, Steven Walling swalling@wikimedia.org wrote:
On Thu, Nov 21, 2013 at 12:49 PM, Jan Ainali jan.ainali@wikimedia.se wrote:
What happened with the Notifications statistics on Swedish WP on Nov 8?
http://ee-dashboard.wmflabs.org/dashboards/svwiki-features
No thanks, mentions, talk, reverts or review notifications since then, only link and system.
This problem is not unique to svwiki, you can see it also in de, it, etc.
Dario will know more I think?
On Thu, Nov 21, 2013 at 12:58 PM, Dario Taraborelli dtaraborelli@wikimedia.org wrote:
good catch, I’ll look into this
ERROR:root:Unable to decode: 159611 EventLogging {"event":{"version":"1.5","eventId":1641964,"notificationType":"reverted","notificationGroup":"negative","sender":"XXX","recipientUserId":XXX,"recipientEditCount":XXX,"deliveryMethod":"web","revisionId":XXX},"schema":"Echo","revision":XXX,"clientValidated":false,"wiki":"frwiki","recvFrom":"mw1163","timestamp":1385071015,"webHost":"fr.wikipedia.org"} ValidationError: Additional properties are not allowed (u'revisionId' was unexpected)
(XXX = redacted)
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Thanks, Dario!
We really appreciate your help in investigating this issue and coming up with this reasonable plan to resolve it.
We live and learn -- and pay tuition :)
Thanks again for making this important service possible ...
Fabrice
On Nov 21, 2013, at 4:26 PM, Dario Taraborelli wrote:
EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee
_______________________________
Fabrice Florin Product Manager Wikimedia Foundation
also, this is a good reminder that we need to set alerts for events that fail to pass validation at an anomalous rate.
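One possible shape for such an alert, as a minimal sketch: compare the share of events failing validation in a time window against an expected baseline. The function names, the baseline rate, the multiplier and the minimum sample size are all hypothetical choices, not existing EventLogging settings.

```python
def failure_rate(valid, invalid):
    """Fraction of events in a window that failed validation."""
    total = valid + invalid
    return invalid / total if total else 0.0


def should_alert(valid, invalid, baseline_rate=0.01, factor=10.0, min_events=100):
    """Return True when the failure rate is anomalously high.

    baseline_rate, factor and min_events are illustrative defaults:
    alert only if at least min_events were seen in the window and the
    failure rate exceeds factor times the expected baseline rate.
    """
    total = valid + invalid
    if total < min_events:
        # Too few events to judge; avoid noisy alerts on quiet schemas.
        return False
    return failure_rate(valid, invalid) > baseline_rate * factor


# Made-up counts for a single schema over one window.
print(should_alert(valid=990, invalid=10))   # failure rate at the baseline
print(should_alert(valid=200, invalid=800))  # most events failing
```

A per-schema check along these lines would presumably have flagged Schema:Echo shortly after Nov 7, when the majority of events started failing.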
the patch went live today thanks to Ori and Benny and we’re now correctly counting notifications for each category.
I am tempted to just reset all dashboards (including enwiki, dewiki etc) to use data from the new log unless people feel strongly about preserving the entire historical series (which would also require restoring data from the raw JSON log).
Please let me know if there’s any concern with this suggestion.
Dario