Hello, It's been a busy week in Discoveryland. Here are the updates from the Discovery team for last week. As always, feedback and questions are welcome.
Reminder: There is a new way to follow these weekly updates. You can subscribe to receive on-wiki (or opt-in email) notifications of the Discovery weekly update. Subscribe to be notified!
https://www.mediawiki.org/wiki/Newsletter:Discovery_Weekly
==Highlights==
* The explore similar language links A/B test has been completed and analyzed. Unfortunately, as the report documents, we recorded only one clickthrough to an article written in a different language (displayed in the new language links). We will not be going forward with this feature. [0]
** However, users who want the language link script added to their logged-in account can follow these instructions. [1]
** The full explore similar script (which displays related articles, categories and language links) can also be enabled for logged-in users; see the instructions. [2]
* This latest A/B test (noted directly above) effectively closes out the additional features that the Discovery Department was exploring as possible additions to the search engine results page (SERP). Additional details can be read online [3]; overall A/B testing details [4] and self-guided testing instructions [5] can be found on mw.org.
==Discussions==
=== Search ===
* After successfully testing and deploying the machine learning to rank model on English Wikipedia [6], we have deployed a new test to 18 other wikis that have >1% of traffic this week. [7]
* For the relevance survey, Erik developed backend infrastructure to support lots of queries and lots of results per query [8]. The third running of the test was turned off this week [9]; analysis will be detailed in [10]
* The Chinese wiki was re-indexed [11], allowing multi-hyphen tokens to be enabled in production [12]
* The Hebrew language wikis were also re-indexed [13] and the HebMorph plugin was deployed [14]
* We updated Vagrant to include new language plugins (Polish, Ukrainian, Chinese and Hebrew) [15]
* After some exhaustive investigation, we've resolved the recent load spikes on the Elasticsearch cluster in eqiad [16]
* Jan has nearly finished the first Selenium test re-written from Ruby to Node.js and has learned a lot in the process. This first test will help pave the way for the rest of the tests that need to be re-written [17] [18]
* We've completed testing for adding support of interleaved search results [19] and are currently wrapping up the analysis of the test [20]
* Fixed an issue with mixed versions of the ltr plugin being deployed on elastic1020 [21]
* Erik created a few bash scripts to send from terbium when reindexing the default namespaces [xx] (which were moved from general to content indices); this will go into effect when we reindex the wikis again [22]
* The second running of the explore similar A/B test for language links was completed on Thursday [23] and analysis is complete [24]; the report can be read online. [25]
=== Analysis ===
* Chelsy finalized her work on a (mostly) automated and parameterized report template for the Search Platform team's A/B tests [26]
* Chelsy also completed an additional breakdown of API usage (internal vs. external) on the metrics dashboard [27] [28]
* Chelsy also finalized a new method to keep data that isn't in a dashboard for longer, by adding reports into golden (/srv/published-datasets/discovery) [29]
* Mikhail created a dashboard to track the prevalence of sister project search results on fulltext search result pages on desktop, broken up by language. For example, it turns out that nearly 80% of fulltext searches show sister projects on enwiki. [30]
=== Portal ===
* Jan has been working on updating the Wikipedia portal to adjust the languages used for Chinese translations [31]
=== Maps ===
* Gehel cleared up some VM space on Horizon by deleting 4 unused maps-team instances [32]
* The map service has been upgraded to Node.js 6.11 [33]
* Map traffic has been enabled for active/active service (serving map tiles from both data centers) [34]
[0] https://analytics.wikimedia.org/datasets/discovery/reports/Explore_Similar_L...
[1] https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/explore...
[2] https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/self-gu...
[3] https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements
[4] https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/Testing
[5] https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/self-gu...
[6] https://phabricator.wikimedia.org/T175772
[7] https://phabricator.wikimedia.org/T175771
[8] https://phabricator.wikimedia.org/T174387
[9] https://phabricator.wikimedia.org/T175047
[10] https://phabricator.wikimedia.org/T174106
[11] https://phabricator.wikimedia.org/T173464
[12] https://phabricator.wikimedia.org/T172653
[13] https://phabricator.wikimedia.org/T167058
[14] https://phabricator.wikimedia.org/T167057
[15] https://phabricator.wikimedia.org/T164367
[16] https://phabricator.wikimedia.org/T169498
[17] https://gerrit.wikimedia.org/r/#/c/378688/
[18] https://phabricator.wikimedia.org/T174103
[19] https://phabricator.wikimedia.org/T150032
[20] https://phabricator.wikimedia.org/T171215
[21] https://phabricator.wikimedia.org/T175951
[22] https://phabricator.wikimedia.org/T176397
[23] https://phabricator.wikimedia.org/T175649
[24] https://phabricator.wikimedia.org/T175650
[25] https://analytics.wikimedia.org/datasets/discovery/reports/Explore_Similar_L...
[26] https://phabricator.wikimedia.org/T131795
[27] http://discovery.wmflabs.org/metrics/#referer_breakdown
[28] https://phabricator.wikimedia.org/T172452
[29] https://phabricator.wikimedia.org/T172453
[30] https://discovery.wmflabs.org/metrics/#sister_search_prevalence
[31] https://phabricator.wikimedia.org/T171647
[32] https://phabricator.wikimedia.org/T175998
[33] https://phabricator.wikimedia.org/T171707
[34] https://phabricator.wikimedia.org/T162362
----
The archive of all past updates can be found on MediaWiki.org:
https://www.mediawiki.org/wiki/Discovery/Status_updates
Interested in getting involved? See tasks marked as "Easy" or "Volunteer needed" in Phabricator.
[1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R
[2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R
Yours,
Chris Koerner
Community Liaison
Wikimedia Foundation
Chris Koerner, 25/09/2017 23:32:
- Mikhail created a dashboard to track the prevalence of sister project search results on fulltext search result pages on desktop, broken up by language. For example, it turns out that nearly 80% of fulltext searches show sister projects on enwiki. [30]
[30] https://discovery.wmflabs.org/metrics/#sister_search_prevalence
Interesting. There's probably some underlying pattern to analyse that would tell us something about the relative development of various Wikimedia projects in several languages.
Nemo