Hello,
This is the weekly update from the Search Platform team for the week starting 2019-08-26 until 2019-09-30.
This is the final weekly update from the team. Since starting in March of 2016 we have published over 135 issues of this newsletter. Thank you for reading.
Work continues, however, and interested community members can follow the progress of the Search Platform team through the Scrum of Scrums weekly notes. [0]
== Discussions ==
=== Search ===
* An older bug caused unpredictable behavior depending on the order of Special:Search parameters. We had worked on it previously, but David added a new patch that adds morelikethis, a non-greedy version of the morelike keyword, and it was deployed this week on the train [1]
* David and Tgr fixed the vagrant wikibase cirrus role, which was not working, and updated Cirrus to index P1 and P2 as statements [2]
* Cloudelastic JVMs were suffering from odd GC behavior that slowed down the whole cluster and therefore slowed consumption of the production MW JobQueues; Mathew and Gehel added the alerts it needed [3]
* David discovered that create_timestamp was not present on production index mappings for some wikis and fixed it [4]
* Several folks worked on an issue where the elasticsearch systemd unit sets PrivateTmp=true, which prevents jstack, jmap, etc. from connecting to the JVM [5]
* A review of the logs turned up Elasticsearch OOM errors in MW vagrant, which were fixed by increasing Xmx to 512m [6]
* Tgr found a bug where CirrusSearch on Vagrant throws "mapper_parsing_exception: analyzer [aa_plain] not found for field [plain]" on provision, and David fixed it with a patch that always enables WBCS [7]
* We needed to normalize deepcat inputs, as it was found that deepcat was case sensitive on the first letter of the category name [8]
* Icinga reported read timeout errors for some checks on the cloudelastic cluster, so after some team conversation we added the option separator for elastic shard size alerts [9]
* David found an issue where EventBusMonologHandler was producing malformed UTF-8 characters, possibly because they were incorrectly encoded, which resulted in the send being aborted; this is now fixed by normalizing the request param name [10]
* The team wrote several patches to adjust the mjolnir bulk_daemon to import glent swift uploads as desired [11]
* We found many correctable memory errors (EDAC) on elastic1029 that needed reviewing. The original issue seems to have gone away, but more help from SRE will be needed to get the server working properly (a new ticket will be created) [12]
* Stas and Igor worked on an error where a ConcurrentModificationException is thrown on a non-grouping query with aggregates in SELECT [
* There was a request to update Blazegraph where a normalized exception was happening with a particular query; Stas and Igor collaborated on it, adding support for uncertainVars in ServiceNode and fixing the NME on a variable bound both by LabelService and some other clause [13] [14] [15]
* There was also a query showing that HAVING in a named subquery results in a “non-aggregate variable in select expression” error; Igor and Stas collaborated again to fix it [16]
* More Blazegraph fixes: SELECT * on a query with no variables and a property path results in NotMaterializedException [17], and UnsupportedOperationException on a property path in EXISTS [18]
* A bug was discovered in the search results page where the Commons images weren't showing up anymore (on all wikis other than enwiki); David found the issue and fixed it [19] [20]
* The Discernatron tool for labeling Wikipedia search results for relevance testing used to be available but started returning a '502' error; Erik restarted the container and it's working again [21]
* David worked on making sure search engines can control extract interfaces and base classes from SearchResultSet and SearchResult [22]
* As part of our support for the Structured Data on Commons work: hascaption (including hascaption:*) currently returns all files that ever had a caption, even if that caption has been removed via reversion or edit. This needs to change so that when indexing occurs (and data is removed), hascaption/inlabel/incaption reflect those changes [23]
* David worked on adding a debugging API to dump the explanation of the completion suggester scores [24]
* David also added support for OR in the hastemplate keyword using | (pipe) [25]
* The team worked on (and finished) migrating WDQS to the new logging pipeline [26]
* A bug was filed where subpageof will sometimes display results which are not subpages of the page that we limited the search to (it should indicate that it matched against a redirect) [27]
[0] https://www.mediawiki.org/wiki/Scrum_of_scrums
[1] https://phabricator.wikimedia.org/T159321
[2] https://phabricator.wikimedia.org/T228503
[3] https://phabricator.wikimedia.org/T231516
[4] https://phabricator.wikimedia.org/T230990
[5] https://phabricator.wikimedia.org/T230774
[6] https://phabricator.wikimedia.org/T211362
[7] https://phabricator.wikimedia.org/T230018
[8] https://phabricator.wikimedia.org/T228633
[9] https://phabricator.wikimedia.org/T230366
[10] https://phabricator.wikimedia.org/T228496
[11] https://phabricator.wikimedia.org/T227364
[12] https://phabricator.wikimedia.org/T214283
[13] https://phabricator.wikimedia.org/T159723
[14] https://phabricator.wikimedia.org/T170704
[15] https://phabricator.wikimedia.org/T168876
[16] https://phabricator.wikimedia.org/T165559
[17] https://phabricator.wikimedia.org/T168741
[18] https://phabricator.wikimedia.org/T173243
[19] https://phabricator.wikimedia.org/T232032
[20] https://www.mediawiki.org/wiki/Topic:V6dtxvwtk9nchcbx
[21] https://phabricator.wikimedia.org/T231980
[22] https://phabricator.wikimedia.org/T228626
[23] https://phabricator.wikimedia.org/T231038
[24] https://phabricator.wikimedia.org/T230919
[25] https://phabricator.wikimedia.org/T232078
[26] https://phabricator.wikimedia.org/T232184
[27] https://phabricator.wikimedia.org/T187548
----
The archive of all past updates can be found on MediaWiki.org:
https://www.mediawiki.org/wiki/Discovery/Status_updates
Interested in getting involved? See tasks marked as "Easy" [1] or "Volunteer needed" [2] in Phabricator.
[1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R
[2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R
Yours,
Chris Koerner (he/him)
Community Relations Specialist
Wikimedia Foundation