EventLogging backlog

Oliver Keyes

12 Jan 2016

9:01 p.m.

Hey yo,

Just a note that EventLogging had replication problems and needed to be backfilled yesterday. This means that if you had scripts running early this morning over EventLogging data from yesterday or the last few days, you should check whether they need rerunning; they probably do.

-- Oliver Keyes Count Logula Wikimedia Foundation

Oliver Keyes

12 Jan 2016

9:06 p.m.

Clarification; it's backfilling from the database consumer's POV, but no data actually got dropped. It was just replication lag :)

Oliver Keyes

13 Jan 2016

10:16 p.m.

Update: still backlogged. Be aware of this if you're relying on EL for day-to-day events.

Oliver Keyes

15 Jan 2016

9:37 p.m.

Update: partial resolution thus far. Schemas that accumulate fewer than 1,000 events before the replication script gets to them (i.e. most of the smaller ones) are now working again. The others still have lag. Basically, you should check your tables.

Many thanks to Nuria and Mr Otto for resolving so much of the problem; it's a very FUD-like process and their ability to cut through it with clarity is most admirable :).
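
For the table check suggested above, here is a minimal sketch in Python. It assumes you have already fetched the newest event timestamp of a table yourself (for example with a MAX(timestamp) query) and that it is a 14-digit YYYYMMDDHHMMSS string in UTC. That format, the function names and the one-hour threshold are all assumptions made for illustration, not anything EventLogging provides.

```python
from datetime import datetime, timedelta, timezone


def parse_event_timestamp(ts: str) -> datetime:
    """Parse a 14-digit YYYYMMDDHHMMSS string, assumed to be UTC."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def replication_lag(latest_ts: str, now: datetime) -> timedelta:
    """How far the newest row in a table is behind the given 'now'."""
    return now - parse_event_timestamp(latest_ts)


def needs_recheck(latest_ts: str, now: datetime,
                  max_lag: timedelta = timedelta(hours=1)) -> bool:
    """True if the table looks lagged beyond the (hypothetical) threshold."""
    return replication_lag(latest_ts, now) > max_lag
```

A low-traffic schema can legitimately go a long time without new events, so a large gap is a hint to look closer, not proof of lag.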

Oliver Keyes

18 Jan 2016

11:37 p.m.

Monday update!

Jaime is looking into the problem and you can see the commentary and regular updates at https://phabricator.wikimedia.org/T123634. It looks like a large number of long-running queries are gradually building up the lag, and Faidon's commentary on the Ops list was accurate. So, please keep your queries short, or run them on Quarry, if you possibly can.

In the long-term I suspect we want a second box, so that we have "all the databases up to date" to draw from for reporting and "all the databases maybe a bit lagged" for the queries that take a while to run, but we shall see what we shall see. Thanks to Andrew and Nuria for keeping on this and Jaime for jumping right back in so soon after returning from holiday.
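
One way to keep queries short, sketched in Python under stated assumptions: split a long time range into small half-open windows and run one bounded query per window instead of a single long-running one. The helper names are hypothetical, and the 14-digit YYYYMMDDHHMMSS bound format is an assumption about how the tables store timestamps.

```python
from datetime import datetime, timedelta
from typing import Iterator, List, Tuple


def time_windows(start: datetime, end: datetime,
                 step: timedelta) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive half-open [lo, hi) windows covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi


def window_bounds(start: datetime, end: datetime,
                  step: timedelta) -> List[Tuple[str, str]]:
    """Format each window as (lower, upper) 14-digit timestamp strings."""
    fmt = "%Y%m%d%H%M%S"
    return [(lo.strftime(fmt), hi.strftime(fmt))
            for lo, hi in time_windows(start, end, step)]
```

Each pair can then bound a single query (timestamp >= lower AND timestamp < upper), so no individual query runs for long.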

Madhumitha Viswanathan

11:39 p.m.

Thanks for keeping the list updated on this, Oliver. You are awesome :)

-- --Madhu :)

analytics@lists.wikimedia.org

5 comments

2 participants

tags (0)

participants (2)

Madhumitha Viswanathan
Oliver Keyes