Hello!
I'll be in the "Writing a Self-Review" workshop today. You said so
many good things about it that I can't miss it. It conflicts with our
standup, so here is my status:
* codfw elasticsearch cluster was behaving erratically yesterday.
This is probably related to a copy/paste error (my own) in the
unicast configuration (I mixed up eqiad and codfw). That does not
completely explain why we started seeing issues only after the change
had been deployed to all but one server for more than 12 hours. In the
end, a full cluster restart fixed the issue, but we still don't really
understand it.
* started the restart of the eqiad elasticsearch cluster (everyone,
please cross your fingers!)
* WDQS data reload: in progress. I did not properly check that the
latest version was deployed before starting the data import, so we
actually loaded it with the wrong version (so geo indexing is not yet
enabled). Another data reload will be required (already in progress on
wdqs1002, will do it afterward on wdqs1001).
* Spent quite some time going through all the phabricator issues I am
subscribed to and gerrit changes, and did some cleanup.
* new elasticsearch servers are almost ready to be installed. I spent
some time understanding how installing them actually works. Will
probably start that tonight or tomorrow.
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation