Discovery October 2016

discovery@lists.wikimedia.org

10 participants
16 discussions

How to measure disagreement between human judges in discernatron?
by Erik Bernhardson 02 Nov '16

02 Nov '16

For a little backstory, in discernatron multiple judges provide scores in from 0 to 3 for results. Typically we only request a single query to be reviewed by two judges. We would like to measure the level of disagreement between these two judges, and if it crosses some threshold get two more scores, so we can then measure disagreement in the group of 4. Somehow though, we need to define how to measure that level of disagreement and what the threshold for needing more scores is. Some specialized concerns: * It is probably important to include not just that the users gave different values, but also how far apart they are. The difference between a 3 and a 2 is much smaller than between a 2 and a 0. * If the users agree that 80% of the results are all 0, but disagree on the last 20%, even though the average disagreement is low it's probably still important? Might be worthwhile to take all the agreements about irrelevant results and remove them before calculating disagreement? Not sure... I know we have a few math nerds here on the list, so hoping someone has a few ideas.

4 8

Re: [discovery] [Engineering] Scheduled reboot of terbium on Thursday
by Guillaume Lederrey 31 Oct '16

31 Oct '16

The title suggest indices cron were disabled in preparation for this maintenance. No re-indexing happened this night. Since we plan the reboot for tomorrow, it does not make sense to re-enable them right now. We will run on slightly outdated indices until Wednesday. If the reboot is delayed again tomorrow, I'll manually run the indexing. If all goes to plan, I'll just re-enable the cron jobs. On Mon, Oct 31, 2016 at 8:46 AM, Moritz Mühlenhoff <mmuhlenhoff(a)wikimedia.org> wrote: >> I'll make a new attempt on Monday (31st) at around 7am UTC. If there's something >> which is safe to abort and which you know will still be running, please send >> me an email. > > Due to a bigger reindexing job which didn't complete in time, this > will be moved to tomorrow > 1st Nov at 7am UTC. > > Cheers, > Moritz > > _______________________________________________ > Engineering mailing list > Engineering(a)lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/engineering -- Guillaume Lederrey Operations Engineer, Discovery Wikimedia Foundation UTC+2 / CEST

1 0

Discovery Weekly Update for the week starting 2016-10-24
by Chris Koerner 29 Oct '16

29 Oct '16

Hello, Here is the Discovery status update for the week starting 24 October. Feedback and questions are welcome. == Discussions == * Show globe icon 🌐 next to all <maplinks> links [0] * Request for Comment - All map location names should be shown in the user's language [1] * Merge Kartographer with commons-on-osm [2] * Discussion and recommendations for the overall usability of the new search engine results page with the Design team [3] == Search == * Don't show the "results from other projects" box if empty [4] * Check to see if popularity_score is wrong in a lot of articles [5] * Disable highlighting on insource queries if experimental highlighter in not turned on [6] * Reindex search cluster for BM25 test [7] * Search works incorrectly when the query contains words used as namespace names and a colon (:) [8] * ask for translations for 'showing results from' (Polish, Dutch, Arabic and Chinese) [9] * Configure varnish to include wdqs nodes in codfw [10] * Started second BM25 A/B test (ja, zh, th) [11] * Turn on second BM25 test [12] * Fix in_array() expects parameter 2 to be an array or collection in /srv/mediawiki/php-1.28.0-wmf.22/extensions/CirrusSearch/includes/Search/RescoreBuilders.php on line 160 [13] * Communicate to community the upcoming BM25 release [14] * [Recurring task] CirrusSearch: what is updated during re-indexing [15] * Review and fix the description of autocomplete profiles (completion suggester) in the Search preference tab [16] * Enable cross-wiki backend search functionality on it.wikipedia - we made it faster [17] * Enable GC (garbage collection) logs on Elasticsearch JVM [18] * Argument 2 passed to SearchEngine::processCompletionResults() must be an instance of SearchSuggestionSet, Elastica\Response in SearchEngine class [19] * [bug] cirrussearch fatal in 1.28.0-wmf.23 - [20] == Analysis == * Compare ZRR for query features across other search engines [21] * Investigate what we'd need to do to ignore double quotes in search queries [22] * Maps Dashboard: fix and update [23] * Analyze the variance of user-agent's, country, and other useful metrics of google referred traffic with and without a search query available in referrer [24] * Add a PaulScore approximation to discovery.wmflabs.org [25] == Portal == '''''Portal updates that got code reviews after rebasing - thanks! ''''' * update logo for new branding guidelines [26] * add information for Wikipedia / Wikimedia apps availability [27] == Interactive == * Deprecate <slippymap> provided by MapSources in favor of Kartographer's <mapframe> [28] * Illegal and unheralded removal of MapSourcesHooks::wfgeoLink function hook on dewikivoyage [29] * More details button should appear when map is loaded [30] * maps1002 does not run latest kartotherian [31] * Snapshot service not working with empty groups param [32] * Kartographer no longer displays external data credits [33] * Graph behaves differently on PHP7 - was fixed automagically [34] * <mapframe>/<maplink>: Invalid GeoJSON code in wgKartographerLiveData prevents drawing of maps at Wikivoyage [35] * Don't pollute history with map hashes [36] * Fix Governor's demo - party affiliations (needed a fix with WDQS SPARQL query - thanks, Stas!) [37] * Show wikidata query results as map markers [38] * Support geomask externaldata in Kartotherian [39] * Geoshape service should return map mask polygon with holes [40] '''''Interactive tickets completed this week, but awaiting next train in order to be deployed: Nov 1, 2016:''''' * Use maps snapshot service until user interacts (click/mouseover?) [41] * Geoshape service should return map mask polygon with holes [42] * Show preview with static maps is broken [43] * [Bug] GeoData\Searcher fatal error in 1.28-wmf.23 [44] * Support data.get() with optional language param [45] * Remove static background from mapframes after going dynamic [46] == Interactive Ops Tasks == * WV layers dropdown bug on vem.wmflabs.org [47] * Maps - move traffic to eqiad instead of codfw [48] * Configure new maps servers in eqiad [49] * Unmet dependencies around postgis apt packages on maps* servers [50] * increase replication factor for system_auth keyspace on maps / cassandra [51] * Upgrade PostGIS on maps servers to version 2.2+ [52] * Maps-test was created with incorrect initial encoding [53] * Update Node on Maps to v4.6.0 [54] * elastic2020 is powered off and does not want to restart [55] * Maps - error when doing initial tiles generation: "Error: could not create converter for SQL_ASCII"" [56] [0] https://phabricator.wikimedia.org/T145176 [1] https://phabricator.wikimedia.org/T112948 [2] https://phabricator.wikimedia.org/T149280 [3] https://phabricator.wikimedia.org/T139310#2748175 [4] https://phabricator.wikimedia.org/T96881 [5] https://phabricator.wikimedia.org/T148136 [6] https://phabricator.wikimedia.org/T148107 [7] https://phabricator.wikimedia.org/T147498 [8] https://phabricator.wikimedia.org/T148344 [9] https://phabricator.wikimedia.org/T143354 [10] https://phabricator.wikimedia.org/T146158 [11] https://phabricator.wikimedia.org/T147495 [12] https://phabricator.wikimedia.org/T147497 [13] https://phabricator.wikimedia.org/T148840 [14] https://phabricator.wikimedia.org/T147509 [15] https://phabricator.wikimedia.org/T147505: [16] https://phabricator.wikimedia.org/T148010 [17] https://phabricator.wikimedia.org/T146179 [18] https://phabricator.wikimedia.org/T134853 [19] https://phabricator.wikimedia.org/T148978 [20] https://phabricator.wikimedia.org/T149254 [21] https://phabricator.wikimedia.org/T136377 [22] https://phabricator.wikimedia.org/T149143 [23] https://phabricator.wikimedia.org/T149127 [24] https://phabricator.wikimedia.org/T128146 [25] https://phabricator.wikimedia.org/T144424 [26] https://phabricator.wikimedia.org/T144834 [27] https://phabricator.wikimedia.org/T137495 [28] https://phabricator.wikimedia.org/T136768 [29] https://phabricator.wikimedia.org/T149288 [30] https://phabricator.wikimedia.org/T148882 [31] https://phabricator.wikimedia.org/T148881 [32] https://phabricator.wikimedia.org/T149145 [33] https://phabricator.wikimedia.org/T149154 [34] https://phabricator.wikimedia.org/T148250 [35] https://phabricator.wikimedia.org/T148971 [36] https://phabricator.wikimedia.org/T144880 [37] https://phabricator.wikimedia.org/T149281 [38] https://phabricator.wikimedia.org/T134816 [39] https://phabricator.wikimedia.org/T149151 [40] https://phabricator.wikimedia.org/T146582 [41] https://phabricator.wikimedia.org/T148070 [42] https://phabricator.wikimedia.org/T146582 [43] https://phabricator.wikimedia.org/T149070 [44] https://phabricator.wikimedia.org/T149159 [45] https://phabricator.wikimedia.org/T149149 [46] https://phabricator.wikimedia.org/T149155 [47] https://phabricator.wikimedia.org/T148865 [48] https://phabricator.wikimedia.org/T145758 [49] https://phabricator.wikimedia.org/T138092 [50] https://phabricator.wikimedia.org/T147780 [51] https://phabricator.wikimedia.org/T149074: [52] https://phabricator.wikimedia.org/T144763 [53] https://phabricator.wikimedia.org/T148114 [54] https://phabricator.wikimedia.org/T148661 [55] https://phabricator.wikimedia.org/T149006 [56] https://phabricator.wikimedia.org/T148031 ---- The full update, and archive of past updates, can be found on MediaWiki.org: https://www.mediawiki.org/wiki/Discovery/Status_updates Interested in getting involved? See tasks marked as Easy or Volunteer needed in Phabricator. [1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R [2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R Yours, Chris Koerner Community Liaison - Discovery Wikimedia Foundation 57 links‽ That has to be some kind of record.

1 0

StrepHit & Wikidata Primary Sources tool: call for partnership
by Marco Fossati 27 Oct '16

27 Oct '16

Hi everyone, This is Marco, the project leader of StrepHit: https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Va… StrepHit is a Web agent that reads authoritative sources and feeds Wikidata with references, thus improving the data quality of Wikidata. The first 6 months of work have been funded through an Individual Engagement Grant and have led to the release of the first version. Its datasets are now in the Wikidata *primary sources tool*: https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Semi-automatic_… I am contacting you since I believe you are a strategic partner that could contribute to the radical uplift of the tool: https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Va… If you have any thoughts for improvement, feel free to comment on the renewal proposal via its talk page. If you agree that the primary sources tool requires a boost, please consider an endorsement of the StrepHit renewal! https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Va… Best, Marco

1 0

Latest search updates
by Deborah Tankersley 26 Oct '16

26 Oct '16

After extensive testing over the last several months using a new search query scoring method called BM25 (Best Matching) [1], we recently completed a limited production release to the following top languages: English, German, Spanish, Russian, Portuguese, French, Italian, Polish, Dutch and Arabic. This new release is replacing the older search method called tf-idf (term frequency-inverse document frequency) [2]. We have additional testing to do [3,4] to figure out if BM25 will work in languages that don’t use spaces in-between their words , i.e.: Japanese, Chinese, etc. The Discovery team announces much of our completed work in weekly status updates [5 , 6 ], but some of the work isn’t actually obvious to anyone who uses our search engine - t hat is because it isn’t actually ‘live’ until a complete re-index of the servers occur. We’ve created a recurring ticket in Phabricator [ 7 ] to keep track of the work that goes live in production after a re-index, such as the one we’ve also just completed. A few highlights of the recent re-index are implementing ascii-folding for the French language and fixing several bugs for French ÿ, and Russian ’Е’ and 'Ё' when those characters are entered in a search query. Cheers from the Discovery Search Team! [1] https://en.wikipedia.org/wiki/Okapi_BM25 [2] https://en.wikipedia.org/wiki/Tf%E2%80%93idf [3] https://phabricator.wikimedia.org/T147495 [4] https://phabricator.wikimedia.org/T147501 [5] https://www.mediawiki.org/wiki/Wikimedia_Discovery#Updates [ 6 ] https://www.mediawiki.org/wiki/Discovery/Status_updates [ 7 ] https://phabricator.wikimedia.org/T147505 -- deb tankersley Product Manager, Discovery irc: debt Wikimedia Foundation

1 0

Re: [discovery] [Translators-l] Translations requested: "showing results from"
by Deborah Tankersley 25 Oct '16

25 Oct '16

Fantastic, thanks so much! :) -- deb tankersley Product Manager, Discovery irc: debt Wikimedia Foundation On Sun, Oct 23, 2016 at 11:56 AM, Romaine Wiki <romaine.wiki(a)gmail.com> wrote: > Hi Deborah, > > I have now translated all the (missing) phrases for Dutch. > > Romaine > > 2016-09-16 23:46 GMT+02:00 Deborah Tankersley <dtankersley(a)wikimedia.org>: > >> Hi, >> >> The Discovery team has finished up another language in our quest to >> enable language detection for better search results. We now need your help >> in translating the phrase "*showing results from" *in Dutch for these >> languages: English, Chinese, Arabic, Korean, Greek, Hebrew, Japanese, >> and Russian. >> >> It would be great if we can get these new translations into >> translatewiki using the following message keys (and this message group >> link): https://translatewiki.net/wiki/Special:Translate?grou >> p=ext-wikimediainterwikisearchresults: >> >> Dutch (Nederlands) >> <https://translatewiki.net/w/i.php?title=Special:Translate&group=ext-wikimed…> >> [1]: >> >> search-interwiki-results-enwiki >> search-interwiki-results-ruwiki >> search-interwiki-results-hewiki >> search-interwiki-results-jawiki >> search-interwiki-results-arwiki >> search-interwiki-results-zhwiki >> search-interwiki-results-kowiki >> search-interwiki-results-elwiki >> >> >> Cheers and thanks for your time! >> >> >> [1] https://translatewiki.net/w/i.php?title=Special:Translat >> e&group=ext-wikimediainterwikisearchresults&language=nl& >> filter=&action=translate >> >

1 0

Incident report - maps October 21st
by Guillaume Lederrey 25 Oct '16

25 Oct '16

Hello! I just finished writing a short incident report on the maps issue of last Friday [1]. Basically, I was stupid and reinitialized Cassandra on the wrong node. This should have had almost no effect, except that the system_auth keyspace of Cassandra is configured with a replication factor of 1. Loosing that node means that we also lost authentication. Thanks a lot to Brandon for the help in mitigating this! [1] https://wikitech.wikimedia.org/wiki/Incident_documentation/20161021-Maps -- Guillaume Lederrey Operations Engineer, Discovery Wikimedia Foundation UTC+2 / CEST

1 0

Discovery Weekly Update for the week starting 2016-10-17
by Chris Koerner 22 Oct '16

22 Oct '16

Hello, Here is the Discovery status update for the week starting 17 October. Feedback and questions are welcome. == Discussions == * Added a Request for Comment (RfC) on refactoring Kartotherian to use an established plugin system [0] * Added more information in order to estimate the hardware requirements for ordering new servers for Elasticsearch [1] * Had a discussion about needing variant options for Chinese in Language Settings [2] * Discussed the preliminary results on running Paulscore with BM25 on zh, ja, th [3] * Started the discussion on the roll out plan/path for Maps [4] * Added a Request for Comment (RfC) on <maplink> support for link text vs fullscreen caption text [5] === Search === * Enable sub-phrase completion suggester on wikitech, mediawiki.org and wikisource [6] * Enable document versioning support via wikimedia-extra extension in production [7] * Remove custom analysis chains from vagrant [8] * Map modifier letter apostrophes to straight or curly quotes in the French Elasticsearch analysis chain [9] Search tickets completed this week, but awaiting next train in order to be deployed: Oct 25, 2016: * Put SITENAME in search box placeholder "searchsuggest-search" like in MobileFrontend's "mobile-frontend-placeholder [10] * Integrate "did you mean" data collection into search satisfaction schema [11] * Improve processing of the apostrophe by the search engine in Ukrainian [12] * Enable Latvian and Lithuanian analyzers [13] * CirrusSearch SQL query for locating pages for reindex performs poorly [14] * MySQL chooses poor query plan for link counting query [15] * When showing search results for different language, exclude images from Commons [16] === Analysis === Analysis tickets completed this week, but awaiting next train in order to be deployed: Oct 25, 2016: * Add a PaulScore approximation to discovery.wmflabs.org (waiting on phab:T138087 to be deployed on the train) [17] [18] === Portal === Portal updates that need code review - PLEASE HELP: * update logo for new branding guidelines [19] * add information for Wikipedia / Wikimedia apps availability [20] === Interactive === * Implement static kartotherian service with geojson layer [21] * Document how to use maps directly from JavaScript [22] * Maplinks external groups are applied multiples times [23] * Perpetration for depreciating <slippymap> is done thanks to RolandUnger, who handled updating most of the templates [24] Interactive tickets completed this week, but awaiting next train in order to be deployed: Oct 25, 2016: * Enable double-click zooming in full screen mode [25] * Coordinates do not fit in the detail box on one line [26] * Wikivoyage groups incorrectly show external layers [27] * Opening VisualEditor on en.wp always loads Mapbox [28] * Maplink breaks when showing undefined groups [29] == Other Noteworthy Stuff == * Decommissioned deployment-elastic08.deployment-prep.eqiad.wmflabs [30] [0] https://phabricator.wikimedia.org/T148605 [1] https://phabricator.wikimedia.org/T148559 [2] https://phabricator.wikimedia.org/T134967 [3] https://phabricator.wikimedia.org/T147501 [4] https://phabricator.wikimedia.org/T148666 [5] https://phabricator.wikimedia.org/T148706 [6] https://phabricator.wikimedia.org/T146208 [7] https://phabricator.wikimedia.org/T146210 [8] https://phabricator.wikimedia.org/T147502 [9] https://phabricator.wikimedia.org/T146804 [10] https://phabricator.wikimedia.org/T144640 [11] https://phabricator.wikimedia.org/T138087 [12] https://phabricator.wikimedia.org/T146358 [13] https://phabricator.wikimedia.org/T148052 [14] https://phabricator.wikimedia.org/T147957 [15] https://phabricator.wikimedia.org/T143932 [16] https://phabricator.wikimedia.org/T145628 [17] https://phabricator.wikimedia.org/T144424 [18] https://phabricator.wikimedia.org/T138087 [19] https://phabricator.wikimedia.org/T144834 [20] https://phabricator.wikimedia.org/T137495 [21] https://phabricator.wikimedia.org/T124134 [22] https://phabricator.wikimedia.org/T138054 [23] https://phabricator.wikimedia.org/T148687 [24] https://phabricator.wikimedia.org/T136768 [25] https://phabricator.wikimedia.org/T148671 [26] https://phabricator.wikimedia.org/T147943 [27] https://phabricator.wikimedia.org/T148346 [28] https://phabricator.wikimedia.org/T148511 [29] https://phabricator.wikimedia.org/T148810 [30] https://phabricator.wikimedia.org/T147777 ---- The full update, and archive of past updates, can be found on MediaWiki.org: https://www.mediawiki.org/wiki/Discovery/Status_updates Interested in getting involved? See tasks marked as Easy or Volunteer needed in Phabricator. [1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R [2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R Yours, Chris Koerner Community Liaison - Discovery Wikimedia Foundation

1 0

Discovery Weekly Update for the week starting 2016-10-10
by Chris Koerner 15 Oct '16

15 Oct '16

Greetings, This week is chock full of updates from Discovery. As always, feedback and questions are welcome. = Discussions = == Search == * Updated the search results page because the content overflows page boundaries [1] * Add accent squashing to Russian/Cyrillic analyser (with tests) [2] * Fixed Cyrillic 'Е' and 'Ё' equivalence not found by search [3] * Decided on the outcome of the recent BM25 A/B test, what our next steps on using BM25 will be [4] * Fixed issue where the search page content overflows the page boundaries on various browsers [5] * Updated where users prefer pages in their language in multilingual wikis [6] Search tickets completed this week, but awaiting next train in order to be deployed: Oct 25, 2016: * Put SITENAME in search box placeholder "searchsuggest-search" like in MobileFrontend's "mobile-fronted-placeholder [7] * Integrate "did you mean" data collection into search satisfaction schema [8] == Analysis == * Completed analysis of the results of BM25 AB test (final report) [9] [10] * Search results page: how many visitors are on mobile vs desktop [11] - Mikhail ran a hive query for the last couple of weeks and discovered: * Analyze the variance of user-agent's, country, and other useful metrics of google referred traffic with and without a search query available in referrer [12] ** this was to analyze what the difference is between the requests that have a search term in the query and those that do not, to determine if those queries could be used as a representative sample. Turns out, they're not relevant at all. * Investigate recent spike in pageviews on wikipedia.org portal page [13] ** there has been a dramatic increase in the amount of page views since middle of Sep 2016 to the portal--it turns out that this increase is due to various issues: school starting up again, increasing bot traffic, and possibly several agencies/companies that might have loaded 'golden' copies of software onto their employees phones that have wikipedia.org as the default home page of their Android phone browsers. == Portal == * Updated articles by languages [14] * Centered 'the free encyclopedia' phrase on mobile devices [15] * Fixed the language link that broke unevenly [16] * Updated the portal text to use the new, approved shade of blue (partially completed) [17] * Fixed where `.lang-list-button` doesn't feature focus state [18] Portal updates that need code review: * update logo for new branding guidelines [19] * add information for Wikipedia / Wikimedia apps availability [20] == Interactive == * Maps with geoshapes now have a link to Wikidata item or query in the bottom attribution text (i.e.: externalData should show up as data source in credits) [21] * Migrated all map usage from "href" to "service" external data (about 50 pages) [22] * Simplified (and normalized) ExternalData parsing in Kartographer [23] * Geoshapes service URL is hardcoded [24] * Refactor data loader into a separate lib, as both Kartotherian (snapshot) and Kartographer need identical functionality, for a given map, we need to provide a fully expanded (no externalData) GeoJSON [25] * Fixed where failed ExternalData requests result in all maps on page being completely broken [26] * Set "maxzoom" to 14 on osm.tm2source, based on community feedback [27] * Defined KPIs and other interesting maps statistics about how we would like to gather to track and improve maps [28] * Ensure Maps servers can be installed easily (automation + documentation) [29] * maps-test* hosts running low on space [30] * Updated to postgresql 9.4.9 on all maps servers [31] * Created a pipeline for supplying data to static map service (Here is a sample URL) [32] [33] * Error "Maximum zoom is 18" (updated logging to be an 'info' message rather than an error) [34] * Fixed where the Sparql-based externalData maps are broken [35] Interactive tickets completed this week, but awaiting next train in order to be deployed: Oct 25, 2016: * <mapframe> maximize button sometimes doesn't work [36] * buildHtml() should parse messages in parser output language [37] = Other Noteworthy Stuff = * The Search team created a new ticket to note what will be 'live' in production during the next full re-indexing: [Recurring task] CirrusSearch: what is updated during re-indexing [38] * Request for translation sent for Update CirrusSearch documentation [39] [1] https://phabricator.wikimedia.org/T138108 [2] https://phabricator.wikimedia.org/T102298 [3] https://phabricator.wikimedia.org/T124592 [4] https://phabricator.wikimedia.org/T147008 [5] https://phabricator.wikimedia.org/T138108 [6] https://phabricator.wikimedia.org/T68829 [7] https://phabricator.wikimedia.org/T144640 [8] https://phabricator.wikimedia.org/T138087 [9] https://phabricator.wikimedia.org/T143589 [10] https://wikimedia-research.github.io/Discovery-Search-Test-BM25/ [11] https://phabricator.wikimedia.org/T147513 [12] https://phabricator.wikimedia.org/T128146 [13] https://phabricator.wikimedia.org/T146214 [14] https://phabricator.wikimedia.org/T128546 [15] https://phabricator.wikimedia.org/T143241 [16] https://phabricator.wikimedia.org/T144007 [17] https://phabricator.wikimedia.org/T146231 [18] https://phabricator.wikimedia.org/T143127 [19] https://phabricator.wikimedia.org/T144834 [20] https://phabricator.wikimedia.org/T137495 [21] https://phabricator.wikimedia.org/T145109 [22] https://phabricator.wikimedia.org/T147530 [23] https://phabricator.wikimedia.org/T147128 [24] https://phabricator.wikimedia.org/T146494 [25] https://phabricator.wikimedia.org/T147078 [26] https://phabricator.wikimedia.org/T146400 [27] https://phabricator.wikimedia.org/T147698 [28] https://phabricator.wikimedia.org/T144239 [29] https://phabricator.wikimedia.org/T138501 [30] https://phabricator.wikimedia.org/T146848 [31] https://phabricator.wikimedia.org/T138276 [32] https://phabricator.wikimedia.org/T133247 [33] http://maps.wikimedia.org/img/osm-intl,1,0,0,1000x1000.png?domain=www.media… [34] https://phabricator.wikimedia.org/T146805 [35] https://phabricator.wikimedia.org/T148237 [36] https://phabricator.wikimedia.org/T147987 [37] https://phabricator.wikimedia.org/T148082 [38] https://phabricator.wikimedia.org/T147505: [39] https://phabricator.wikimedia.org/T141156 ---- The full update, and archive of past updates, can be found on Mediawiki.org: https://www.mediawiki.org/wiki/Discovery/Status_updates Interested in getting involved? See tasks marked as Easy or volunteer needed in Phabricator. [1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R [2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R -- Yours, Chris Koerner Community Liaison - Discovery Wikimedia Foundation

1 0

Re: [discovery] [Ops] Using Wikidata Query Service in a production context
by Stas Malyshev 13 Oct '16

13 Oct '16

Hi! > 3) Move the service to labs, not providing any firm guarantee of service > level ? I don't see any reason to move it to labs, and I don't think labs has infrastructure to handle it. It can not run on virtual machines - in fact, the only reason why it can't be as stable as we wish is because we don't have hardware to run fully redundant configuration with capacity to serve all cases. I don't see how using labs would help there - does labs have more hardware capacity than production that is available for wdqs use? > I think there was one exception, which is services that needed a lot of > resources so they could not run on vms, but don't we have a prototype of > "labs on real hardware"? I really wouldn't want to run this service on prototype that wasn't tried before and introduce both additional point of failure and unnecessary administration burden. I fail to see any existing issue that this would solve, but it certainly has potential to introduce some. > that you are describing- running easily out of resources (DOS). Even > quarry, which I have publicly complained about in the past, for what you > say, has a better resource management than wqs (30-minute limit > execution, concurrency control, etc.). We have 30-second limit now. People complain about it all the time because it's too short, but see above about the hardware. > icinga. But running an unstable service (wdqs) on top of another > unstable service (wikidata data handling) will never be stable. > Everytime a bot starts writing to wikidata 600 times per second, s5 dbs > shake (that is why we are creating s8) and wqs goes down. :-) > I would suggest using wqs on labs (or anywhere, non-production) with > regular imports rather than real-time updates. Less headaches. I am > literally aiming for that for labsdbs, too. I don't think this is a good scenario. Delayed updates means severely degraded user experience and won't save much performance as the data needs to be moved over anyway. We could save something if we had proper change tracking service that doesn't use recentchages API directly on wikidata (so we get several uncached rc API calls per each edit) but has some faster and more efficient intermediary, but AFAIK we don't have that still. That would be a direction to look if updating is too resource-consuming. -- Stas Malyshev smalyshev(a)wikimedia.org

1 0

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

Discovery October 2016