Greetings,
This is the weekly update from the Search Platform team for the weeks
starting 2019-03-25 and 2019-04-01.
As always, feedback and questions are welcome.
== Discussions ==
=== Search ===
* Elasticsearch upgrade to v6:
** incident [0]
* Trey finished a deep dive into the performance of language
identification for cross-wiki searching [1] (example [2]) and
punctuation-related problems, and found that things are working pretty
well overall, but the Chinese language model is a bit off.
* Erik noticed that the inlabel / incaption keywords should highlight
the label/caption but did not [3]
* David worked on fixing Elasticsearch 6 deprecations: nested_path and
nested_filter are deprecated [4], as is _retry_on_conflict [5]
* We worked on migrating mjolnir to stdout/syslog/cee logging output [6]
* The team worked on the upgrade to elasticsearch 6.5.4 for cirrus in
codfw [7] and in eqiad [8]
* Erik worked on the implementation and testing of glent m0
integration with wmf infrastructure [9]
* David did a lot of work to update the mw-config to use the psi and
omega elastic clusters [10]
* David found that auto_generate_phrase_queries is deprecated and
ineffective [11]
* The team fixed an old bug where we were getting fatal errors -
"cannot perform this operation with arrays" from
CirrusSearch/ElasticaWrite (using JobQueueDB) [12]
* Gehel worked to make spicerack more robust when unfreezing writes to
elasticsearch / cirrus [13] as well as creating a cookbook to reset
frozen write state on elasticsearch / cirrus [14]
* Stas moved the WikibaseLexeme search code to the
WikibaseLexemeCirrusSearch extension [15]
* We noticed that Elasticsearch indices went read-only, causing a huge lag [16]
* We also found that search exception handling was printing response
information on the screen [17]
* The team fixed an issue where mwgrep was not working [18]
* We also silenced Elasticsearch 6 deprecation warnings to avoid
logspam [19]
* We needed to create extra elasticsearch clusters in the beta cluster [20]
* We also needed some alerts so we know if mjolnir starts misbehaving [21]
* We also converted check_elasticsearch.py icinga plugin to py3 [22]
* We needed to start using a local nginx reverse proxy for connection reuse [23]
* The version of curator that we were using (5.2.0) isn't compatible
with elasticsearch 6, which causes issues in a few cron jobs on the
logstash servers. Version 5.6.0 supports both elasticsearch 5 and 6,
so we updated it [24]
* We also did some cleanup of the reprepro configuration for
elasticsearch-curator [25]
* Getting a centralized way to inspect the content of the search
profiles might be helpful when investigating search behaviors. In the
same vein as other dump debug APIs (mapping/settings/cirrusdoc) David
suggested that we should add a new simple API to dump the profiles
(cirrus-profiles-dump) [26]
* David also found and fixed a "Call to a member function toArray() on
a non-object (null)" error in
vendor/ruflin/elastica/lib/Elastica/Client.php:736 [27]
[0] https://wikitech.wikimedia.org/wiki/Incident_documentation/20190327-elastic… report
[1] https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Review_of_Language_I…
[2] https://en.wikipedia.org/w/index.php?search=%D0%93%D0%B0%D1%80%D1%80%D0%B8+…
[3] https://phabricator.wikimedia.org/T217809
[4] https://phabricator.wikimedia.org/T219266
[5] https://phabricator.wikimedia.org/T219265
[6] https://phabricator.wikimedia.org/T218833
[7] https://phabricator.wikimedia.org/T218878
[8] https://phabricator.wikimedia.org/T218879
[9] https://phabricator.wikimedia.org/T218164
[10] https://phabricator.wikimedia.org/T210381
[11] https://phabricator.wikimedia.org/T219267
[12] https://phabricator.wikimedia.org/T124196
[13] https://phabricator.wikimedia.org/T219640
[14] https://phabricator.wikimedia.org/T219638
[15] https://phabricator.wikimedia.org/T216206
[16] https://phabricator.wikimedia.org/T219364
[17] https://phabricator.wikimedia.org/T216959
[18] https://phabricator.wikimedia.org/T219162
[19] https://phabricator.wikimedia.org/T219269
[20] https://phabricator.wikimedia.org/T213940
[21] https://phabricator.wikimedia.org/T214494
[22] https://phabricator.wikimedia.org/T215439
[23] https://phabricator.wikimedia.org/T215491
[24] https://phabricator.wikimedia.org/T218991
[25] https://phabricator.wikimedia.org/T216235
[26] https://phabricator.wikimedia.org/T218682
[27] https://phabricator.wikimedia.org/T217402
----
Subscribe to receive on-wiki (or opt-in email) notifications of the
Discovery weekly update.
https://www.mediawiki.org/wiki/Newsletter:Discovery_Weekly
The archive of all past updates can be found on
MediaWiki.org:
https://www.mediawiki.org/wiki/Discovery/Status_updates
Interested in getting involved? See tasks marked as "Easy" or
"Volunteer needed" in Phabricator.
[1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R
[2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R
Yours,
Chris Koerner (he/him)
Community Relations Specialist
Wikimedia Foundation