Hey all,
it seems the dispatcher stopped working about 17-18 hours ago (i.e. Friday 5pm Berlin / 8am SF). Could we please, as fast as possible investigate and get it running again?
(DNS changes have been suggested as the culprits, I have no clue)
Cheers, Denny
Denny and I are discussing it on IRC. It seems all the dispatch operations started fatal-ing at around 15:10 yesterday.
Fatal error: Cannot use object of type Diff\DiffOpChange as array in /usr/local/apache/common-local/php-1.22wmf9/extensions/Wikibase/lib/maintenance/dispatchChanges.php on line 858
From: Denny Vrandečić [mailto:denny.vrandecic@wikimedia.de] Sent: 13 July 2013 09:34 To: Wikidata-intern; Wikidata tech; Sam Reed Subject: Dispatch script stopped working? [urgent]
Hey all,
it seems the dispatcher stopped working about 17-18 hours ago (i.e. Friday 5pm Berlin / 8am SF). Could we please, as fast as possible investigate and get it running again?
(DNS changes have been suggested as the culprits, I have no clue)
Cheers,
Denny
For now it seems running again, let's see if it will finish the backlog and not choke again. We manually bumped up the last change ID by 5000. A full chat log is attached.
This is not a post mortem yet, we need
Outstanding actions: * try to figure out why the dispatcher choked in the first place ** fix that, if we can figure it out * try to make the Dispatcher script more robust in such and similar cases
Current effects: * about 5000 edits were not dispatched, they were skipped * there is currently a dispatch log of 18 hours. It is expected to go down quickly, if the dispatcher does not choke again * Language links and pages using Wikidata data in the Wikipedias might be outdated by this time * Current Wikidata editing activity is rather low right now (about 100 edits per minute), so this should help with getting the dispatching back in time
Cheers, Denny
Denny_WMDE
the dispatcher seems to have stopped working yesterday
10:48
Denny_WMDE
i am just looking at the stats, and it is not moving, and seems to has not since yesterday evening
10:48
Denny_WMDE
https://www.wikidata.org/wiki/Special:DispatchStats
10:49
Denny_WMDE
i don't know what got changed, or what happened, but this means that all updates to the wikipedias are currently stuck
10:49
Denny_WMDE
i saw reports on that from the community
10:49
Denny_WMDE
i am not sure how to proceed
10:49
Reedy
I'll look at the logs for it
10:50
Denny_WMDE
thank you
10:50
Reedy
Why is there wikibase-dispatch-changes and wikibase-dispatch-changes2?
10:50
Reedy
10:50
Reedy
They look identicle
10:50
Reedy
identical
10:50
Denny_WMDE
i don't know, sorry. first time i hear of that.
10:50
Denny_WMDE
aude and danielk are both not here right now.
10:51
Reedy
Denny_WMDE: It keeps fatal-ing
10:52
Reedy
Fatal error: Cannot use object of type Diff\DiffOpChange as array in /usr/local/apache/common-local/php-1.22wmf9/extensions/Wikibase/lib/maintenance/dispatchChanges.php on line 858
10:52
Denny_WMDE
what?
10:52
Denny_WMDE
has there been a deployment yesterday?
10:52
Reedy
Not obviously
10:53
Reedy
I did a wikidata folder sync on the 11th
10:53
Denny_WMDE
yes, but the errors started on the 12th?
10:54
Denny_WMDE
i am checking the code at that place
10:54
Reedy
Need to uncompress some log files to check
10:54
Reedy
Denny_WMDE: Yup. Was fine at the start of the 12th, not by the end
10:57
Denny_WMDE
thank you
10:57
Denny_WMDE
so it keeps throwing these fatals right now?
10:58
Reedy
Yup
10:58
Denny_WMDE
hmm, i have an idea about the culprit
10:58
Denny_WMDE
could someone else have made a deploy yesterday? that looks like code that was deployed to stop the aliases of being deleted
10:59
Reedy
https://wikitech.wikimedia.org/wiki/Server_Admin_Log
11:00
Reedy
I can't see a scap/full code tree sync
11:00
Reedy
Started at 171252
11:01
Reedy
Uhh, that's a line number
11:01
Denny_WMDE
??
11:01
Reedy
Hmm
11:02
Reedy
15:09:18 Posted 200 changes to snwiki, up to ID 55754436, timestamp 20130712150916. Lag is 2 seconds. Next ID is 55754436.
11:02
Reedy
Fatal error: Cannot use object of type Diff\DiffOpChange as array in /usr/local/apache/common-local/php-1.22wmf9/extensions/Wikibase/lib/maintenance/dispatchChanges.php on line 858
11:02
Reedy
15:09:19 Posted 200 changes to diqwiki, up to ID 55754366, timestamp 20130712150907. Lag is 12 seconds. Next ID is 55754366.
11:02
Reedy
Then there was a few more working
11:03
Reedy
then it all died
11:03
Reedy
15:10 or so
11:03
Reedy
15:20 Krinkle: Resetting commons:User:CommonsDelinker account e-mailaddress (bug 51016)
11:03
Reedy
10:43 mark: Sending European wikidata and wikivoyage traffic to esams
11:03
Reedy
Neither are relevant
11:03
Denny_WMDE
no
11:03
Denny_WMDE
i am trying to figure out a few things
11:04
Reedy
Denny_WMDE: Ooh. I see a profiling bug
11:09
Denny_WMDE
weird, that diff looks pretty normal
11:10
Denny_WMDE
a profiling bug?
11:10
Reedy
Missing wfProfileOut
11:10
Reedy
I'll fix and commit
11:10
Denny_WMDE
ah it does not profileout
11:10
Denny_WMDE
ok thanks
11:10
Denny_WMDE
but that's not the culprit, right?
11:10
Reedy
I hope not!
11:12
Denny_WMDE
it is this change it is choking on right now http://www.wikidata.org/w/index.php?title=Q11076586&diff=prev&oldid=...
11:12
Denny_WMDE
which does not make any sense
11:13
Denny_WMDE
but we were working yesterday on some changes to the aliases
11:13
Denny_WMDE
that might have changed some diffops, but there was no plan to deploy that afaik
11:13
gerrit-wm
New patchset: Reedy; "Add missing wfProfileOut() call" [mediawiki/extensions/Wikibase] (master) - https://gerrit.wikimedia.org/r/73562
11:13
gerrit-wm
Change merged: Denny Vrandecic; [mediawiki/extensions/Wikibase] (master) - https://gerrit.wikimedia.org/r/73562
11:14
zz_YuviPanda is now known as YuviPanda
11:20
Denny_WMDE
the involved code has been changed in march resp. in may the last time
11:21
Denny_WMDE
11:21
Reedy
The comment is also seemingly wrong.. * @return \Diff\MapDiff|bool
11:23
Reedy
Bug getting Diff\DiffOpChang
11:24
Reedy
*But
11:24
Denny_WMDE
yes, i am looking into this
11:26
Denny_WMDE
meh, i have a stupid idea which doesn't fix anything probably, so be free to say no — is it possible to simply bump by this change and go to the next? just see if it is choking on this diff for some reason, or if something else is going on
11:32
Denny_WMDE
Reedy: ^
11:33
Denny_WMDE
JeroenDeDauw: ping
11:33
Reedy
Presuambly..
11:33
Reedy
Where does it need removing from or whatever?
11:33
Denny_WMDE
the queue for each wiki needs to be bumped by 1
11:33
Reedy
UPDATE wb_changes_dispatch SET chd_seen = chd_seen +1;
11:35
Reedy
?
11:35
Denny_WMDE
uhmmm… (blush)
11:36
Denny_WMDE
i do not know...
11:36
Reedy
chd_seen INT NOT NULL DEFAULT 0, -- last change ID examined (dispatch state)
11:36
Denny_WMDE
sounds good
11:36
Denny_WMDE
i would try...
11:36
Reedy
Query OK, 287 rows affected (0.00 sec)
11:36
Reedy
Rows matched: 287 Changed: 287 Warnings: 0
11:36
Denny_WMDE
and see if the fatals change accordingly, or if they even dsisappear
11:36
Denny_WMDE
sounds good
11:36
Reedy
Guess we might have to wait upto 5 minutes or so
11:37
Denny_WMDE
ok
11:37
Reedy
well, just over 2 minutes based on the current time
11:37
Denny_WMDE
they are bumped up
11:39
Reedy
09:40:01 Starting loop for unlimited passes or 900 seconds
11:40
Reedy
Fatal error: Cannot use object of type Diff\DiffOpChange as array in /usr/local/apache/common-local/php-1.22wmf9/extensions/Wikibase/lib/maintenance/dispatchChanges.php on line 858
11:40
Denny_WMDE
the last idea is to bump it up by 5000
11:40
Reedy
I can try that
11:40
Reedy
Ahh
11:42
Reedy
Done that, now running it manually..
11:42
Reedy
Over 66000 seconds of lag
11:42
Denny_WMDE
so it runs again?
11:42
Reedy
yup
11:42
Denny_WMDE
ok i am noting down the number range
11:43
Denny_WMDE
of diffs
11:43
Denny_WMDE
we have nailed it down to 5000 diffs, can't be that hard to find the relevant one…
11:43
Reedy
heh
11:44
Denny_WMDE
thank you very much for your help, let's see if it chokes again or if it catches up now
2013/7/13 Sam Reed reedy@wikimedia.org
Denny and I are discussing it on IRC. It seems all the dispatch operations started fatal-ing at around 15:10 yesterday.****
Fatal error: Cannot use object of type Diff\DiffOpChange as array in /usr/local/apache/common-local/php-1.22wmf9/extensions/Wikibase/lib/maintenance/dispatchChanges.php on line 858****
*From:* Denny Vrandečić [mailto:denny.vrandecic@wikimedia.de] *Sent:* 13 July 2013 09:34 *To:* Wikidata-intern; Wikidata tech; Sam Reed *Subject:* Dispatch script stopped working? [urgent]****
Hey all,****
it seems the dispatcher stopped working about 17-18 hours ago (i.e. Friday 5pm Berlin / 8am SF). Could we please, as fast as possible investigate and get it running again?****
(DNS changes have been suggested as the culprits, I have no clue)****
Cheers,****
Denny
-- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ****
Do we have a bug ticket for this?
After investigating on toollabs with the copy of the wikidata database there, it looks like change id 55754448 is the odd one:
http://www.wikidata.org/w/index.php?title=Q13360583&diff=57098298&ol...
For that change, DiffChange->getDiff() (and getSiteLinkDiff) are returning a DiffOpChange object. For all other changes, it is an ItemDiff object.
The site link diff involves changing the order of the site links in the array, but otherwise a null change for site links:
Diff\DiffOpChange::__set_state(array( 'newValue' => array ( 'svwiki' => 'Projekt 661 Antjar', 'dewiki' => 'Papa-Klasse', 'frwiki' => 'Classe Papa', 'cswiki' => 'Projekt 661', ), 'oldValue' => array ( 'cswiki' => 'Projekt 661', 'dewiki' => 'Papa-Klasse', 'frwiki' => 'Classe Papa', 'svwiki' => 'Projekt 661 Antjar', ), ))
Cheers, Katie
On Sat, Jul 13, 2013 at 11:53 AM, Denny Vrandečić < denny.vrandecic@wikimedia.de> wrote:
For now it seems running again, let's see if it will finish the backlog and not choke again. We manually bumped up the last change ID by 5000. A full chat log is attached.
This is not a post mortem yet, we need
Outstanding actions:
- try to figure out why the dispatcher choked in the first place
** fix that, if we can figure it out
- try to make the Dispatcher script more robust in such and similar cases
Current effects:
- about 5000 edits were not dispatched, they were skipped
- there is currently a dispatch log of 18 hours. It is expected to go down
quickly, if the dispatcher does not choke again
- Language links and pages using Wikidata data in the Wikipedias might be
outdated by this time
- Current Wikidata editing activity is rather low right now (about 100
edits per minute), so this should help with getting the dispatching back in time
Cheers, Denny
Denny_WMDE
the dispatcher seems to have stopped working yesterday
10:48
Denny_WMDE
i am just looking at the stats, and it is not moving, and seems to has not since yesterday evening
10:48
Denny_WMDE
https://www.wikidata.org/wiki/Special:DispatchStats
10:49
Denny_WMDE
i don't know what got changed, or what happened, but this means that all updates to the wikipedias are currently stuck
10:49
Denny_WMDE
i saw reports on that from the community
10:49
Denny_WMDE
i am not sure how to proceed
10:49
Reedy
I'll look at the logs for it
10:50
Denny_WMDE
thank you
10:50
Reedy
Why is there wikibase-dispatch-changes and wikibase-dispatch-changes2?
10:50
Reedy
10:50
Reedy
They look identicle
10:50
Reedy
identical
10:50
Denny_WMDE
i don't know, sorry. first time i hear of that.
10:50
Denny_WMDE
aude and danielk are both not here right now.
10:51
Reedy
Denny_WMDE: It keeps fatal-ing
10:52
Reedy
Fatal error: Cannot use object of type Diff\DiffOpChange as array in /usr/local/apache/common-local/php-1.22wmf9/extensions/Wikibase/lib/maintenance/dispatchChanges.php on line 858
10:52
Denny_WMDE
what?
10:52
Denny_WMDE
has there been a deployment yesterday?
10:52
Reedy
Not obviously
10:53
Reedy
I did a wikidata folder sync on the 11th
10:53
Denny_WMDE
yes, but the errors started on the 12th?
10:54
Denny_WMDE
i am checking the code at that place
10:54
Reedy
Need to uncompress some log files to check
10:54
Reedy
Denny_WMDE: Yup. Was fine at the start of the 12th, not by the end
10:57
Denny_WMDE
thank you
10:57
Denny_WMDE
so it keeps throwing these fatals right now?
10:58
Reedy
Yup
10:58
Denny_WMDE
hmm, i have an idea about the culprit
10:58
Denny_WMDE
could someone else have made a deploy yesterday? that looks like code that was deployed to stop the aliases of being deleted
10:59
Reedy
https://wikitech.wikimedia.org/wiki/Server_Admin_Log
11:00
Reedy
I can't see a scap/full code tree sync
11:00
Reedy
Started at 171252
11:01
Reedy
Uhh, that's a line number
11:01
Denny_WMDE
??
11:01
Reedy
Hmm
11:02
Reedy
15:09:18 Posted 200 changes to snwiki, up to ID 55754436, timestamp 20130712150916. Lag is 2 seconds. Next ID is 55754436.
11:02
Reedy
Fatal error: Cannot use object of type Diff\DiffOpChange as array in /usr/local/apache/common-local/php-1.22wmf9/extensions/Wikibase/lib/maintenance/dispatchChanges.php on line 858
11:02
Reedy
15:09:19 Posted 200 changes to diqwiki, up to ID 55754366, timestamp 20130712150907. Lag is 12 seconds. Next ID is 55754366.
11:02
Reedy
Then there was a few more working
11:03
Reedy
then it all died
11:03
Reedy
15:10 or so
11:03
Reedy
15:20 Krinkle: Resetting commons:User:CommonsDelinker account e-mailaddress (bug 51016)
11:03
Reedy
10:43 mark: Sending European wikidata and wikivoyage traffic to esams
11:03
Reedy
Neither are relevant
11:03
Denny_WMDE
no
11:03
Denny_WMDE
i am trying to figure out a few things
11:04
Reedy
Denny_WMDE: Ooh. I see a profiling bug
11:09
Denny_WMDE
weird, that diff looks pretty normal
11:10
Denny_WMDE
a profiling bug?
11:10
Reedy
Missing wfProfileOut
11:10
Reedy
I'll fix and commit
11:10
Denny_WMDE
ah it does not profileout
11:10
Denny_WMDE
ok thanks
11:10
Denny_WMDE
but that's not the culprit, right?
11:10
Reedy
I hope not!
11:12
Denny_WMDE
it is this change it is choking on right now http://www.wikidata.org/w/index.php?title=Q11076586&diff=prev&oldid=...
11:12
Denny_WMDE
which does not make any sense
11:13
Denny_WMDE
but we were working yesterday on some changes to the aliases
11:13
Denny_WMDE
that might have changed some diffops, but there was no plan to deploy that afaik
11:13
gerrit-wm
New patchset: Reedy; "Add missing wfProfileOut() call" [mediawiki/extensions/Wikibase] (master) - https://gerrit.wikimedia.org/r/73562
11:13
gerrit-wm
Change merged: Denny Vrandecic; [mediawiki/extensions/Wikibase] (master)
11:14
zz_YuviPanda is now known as YuviPanda
11:20
Denny_WMDE
the involved code has been changed in march resp. in may the last time
11:21
Denny_WMDE
11:21
Reedy
The comment is also seemingly wrong.. * @return \Diff\MapDiff|bool
11:23
Reedy
Bug getting Diff\DiffOpChang
11:24
Reedy
*But
11:24
Denny_WMDE
yes, i am looking into this
11:26
Denny_WMDE
meh, i have a stupid idea which doesn't fix anything probably, so be free to say no — is it possible to simply bump by this change and go to the next? just see if it is choking on this diff for some reason, or if something else is going on
11:32
Denny_WMDE
Reedy: ^
11:33
Denny_WMDE
JeroenDeDauw: ping
11:33
Reedy
Presuambly..
11:33
Reedy
Where does it need removing from or whatever?
11:33
Denny_WMDE
the queue for each wiki needs to be bumped by 1
11:33
Reedy
UPDATE wb_changes_dispatch SET chd_seen = chd_seen +1;
11:35
Reedy
?
11:35
Denny_WMDE
uhmmm… (blush)
11:36
Denny_WMDE
i do not know...
11:36
Reedy
chd_seen INT NOT NULL DEFAULT 0, -- last change ID examined (dispatch state)
11:36
Denny_WMDE
sounds good
11:36
Denny_WMDE
i would try...
11:36
Reedy
Query OK, 287 rows affected (0.00 sec)
11:36
Reedy
Rows matched: 287 Changed: 287 Warnings: 0
11:36
Denny_WMDE
and see if the fatals change accordingly, or if they even dsisappear
11:36
Denny_WMDE
sounds good
11:36
Reedy
Guess we might have to wait upto 5 minutes or so
11:37
Denny_WMDE
ok
11:37
Reedy
well, just over 2 minutes based on the current time
11:37
Denny_WMDE
they are bumped up
11:39
Reedy
09:40:01 Starting loop for unlimited passes or 900 seconds
11:40
Reedy
Fatal error: Cannot use object of type Diff\DiffOpChange as array in /usr/local/apache/common-local/php-1.22wmf9/extensions/Wikibase/lib/maintenance/dispatchChanges.php on line 858
11:40
Denny_WMDE
the last idea is to bump it up by 5000
11:40
Reedy
I can try that
11:40
Reedy
Ahh
11:42
Reedy
Done that, now running it manually..
11:42
Reedy
Over 66000 seconds of lag
11:42
Denny_WMDE
so it runs again?
11:42
Reedy
yup
11:42
Denny_WMDE
ok i am noting down the number range
11:43
Denny_WMDE
of diffs
11:43
Denny_WMDE
we have nailed it down to 5000 diffs, can't be that hard to find the relevant one…
11:43
Reedy
heh
11:44
Denny_WMDE
thank you very much for your help, let's see if it chokes again or if it catches up now
2013/7/13 Sam Reed reedy@wikimedia.org
Denny and I are discussing it on IRC. It seems all the dispatch operations started fatal-ing at around 15:10 yesterday.****
Fatal error: Cannot use object of type Diff\DiffOpChange as array in /usr/local/apache/common-local/php-1.22wmf9/extensions/Wikibase/lib/maintenance/dispatchChanges.php on line 858****
*From:* Denny Vrandečić [mailto:denny.vrandecic@wikimedia.de] *Sent:* 13 July 2013 09:34 *To:* Wikidata-intern; Wikidata tech; Sam Reed *Subject:* Dispatch script stopped working? [urgent]****
Hey all,****
it seems the dispatcher stopped working about 17-18 hours ago (i.e. Friday 5pm Berlin / 8am SF). Could we please, as fast as possible investigate and get it running again?****
(DNS changes have been suggested as the culprits, I have no clue)****
Cheers,****
Denny
-- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ****
-- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
wikidata-tech@lists.wikimedia.org