Hey folks,
In this update, I'm changing a few things to make it easier for you to read. The biggest change you'll notice is that I've broken up the [#] references by section, so each section's links follow it directly. I hope that saves you some scrolling and confusion. You'll also notice that I have changed the subject line from "Revision scoring" to "Scoring Platform" because it's now clear that, come July, I'll be leading a new team with that name at the Wikimedia Foundation. There'll be an announcement about that once our budget is finalized. I'll try to keep this subject consistent for the foreseeable future so that your email clients will continue to group the updates into one big thread.
*Deployments & maintenance:*
In this cycle, we've gotten better at tracking our deployments and noting what changes go out with each deployment. You can click on the phab task for a deployment and look at its sub-tasks to find out what was deployed. We've had three deployments for ORES since mid-March[1,2,3]. We've had two deployments to Wikilabels[4,5] and we've added a maintenance notice for a short period of downtime that's coming up on April 21st[6,7].
1. https://phabricator.wikimedia.org/T160279 -- Deploy ores in prod (Mid-March)
2. https://phabricator.wikimedia.org/T160638 -- Deploy ORES late march
3. https://phabricator.wikimedia.org/T161748 -- Deploy ORES early April
4. https://phabricator.wikimedia.org/T161002 -- Late march wikilabels deployment
5. https://phabricator.wikimedia.org/T163016 -- Deploy Wikilabels mid-April
6. https://phabricator.wikimedia.org/T162888 -- Add header to Wikilabels that warns of upcoming maintenance.
7. https://phabricator.wikimedia.org/T162265 -- Manage wikilabels for labsdb1004 maintenance
*Making ORES better:*
We've been working to make ORES easier to extend and more useful. ORES now reports its relevant versions at https://ores.wikimedia.org/versions[8]. We've also reduced the complexity of our "precaching" system that scores edits before you ask for them[9,10]. We're taking advantage of logstash to store and query our logs[11]. We've also implemented some nice abstractions for requests and responses in ORES[13] that allowed us to improve our metrics tracking substantially[12].
8. https://phabricator.wikimedia.org/T155814 -- Expose version of the service and its dependencies
9. https://phabricator.wikimedia.org/T148714 -- Create generalized "precache" endpoint for ORES
10. https://phabricator.wikimedia.org/T162627 -- Switch `/precache` to be a POST end point
11. https://phabricator.wikimedia.org/T149010 -- Send ORES logs to logstash
12. https://phabricator.wikimedia.org/T159502 -- Exclude precaching requests from cache_miss/cache_hit metrics
13. https://phabricator.wikimedia.org/T161526 -- Implement ScoreRequest/ScoreResponse pattern in ORES
*New functionality:*
In the last month and a half, we've added basic support for Korean Wikipedia[14,15]. Props to Revi for helping us work through a bunch of issues with our Korean language support[16,17,18].
We've also gotten the ORES Review tool deployed to Hebrew Wikipedia[19,20,21,22] and Estonian Wikipedia[23,24,25]. We're also working with the Collaboration team to implement the threshold test statistics that they need to tune their new Edit Review interface[26], and we're working towards making this kind of work self-serve so that the product team and other tool developers won't have to wait on us to implement these threshold stats in the future[27].
14. https://phabricator.wikimedia.org/T161617 -- Deploy reverted model for kowiki
15. https://phabricator.wikimedia.org/T161616 -- Train/test reverted model for kowiki
16. https://phabricator.wikimedia.org/T160752 -- Korean generated word lists are in chinese
17. https://phabricator.wikimedia.org/T160757 -- Add language support for Korean
18. https://phabricator.wikimedia.org/T160755 -- Fix tokenization for Korean
19. https://phabricator.wikimedia.org/T161621 -- Deploy ORES Review Tool for hewiki
20. https://phabricator.wikimedia.org/T130284 -- Deploy edit quality models for hewiki
21. https://phabricator.wikimedia.org/T160930 -- Train damaging and goodfaith models for hewiki
22. https://phabricator.wikimedia.org/T130263 -- Complete hewiki edit quality campaign
23. https://phabricator.wikimedia.org/T159609 -- Deploy ORES review tool to etwiki
24. https://phabricator.wikimedia.org/T130280 -- Deploy edit quality models for etwiki
25. https://phabricator.wikimedia.org/T129702 -- Complete etwiki edit quality campaign
26. https://phabricator.wikimedia.org/T162377 -- Implement additional test_stats in editquality
27. https://phabricator.wikimedia.org/T162217 -- Implement "thresholds", deprecate "pile of tests_stats"
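For anyone wondering what "threshold test statistics" means in practice, here is a rough sketch of the idea. This is not the editquality code; the function name, the input format, and the exact set of statistics are my own assumptions for illustration. Given a pile of scored edits with known labels, it reports what happens if you flag everything at or above a given score.

```python
def threshold_stats(scored_edits, threshold):
    """Hypothetical helper: summarize a classifier at one score threshold.

    scored_edits is assumed to be a list of (probability, is_damaging)
    pairs, where probability is the model's "damaging" score and
    is_damaging is the human label.
    """
    tp = fp = fn = tn = 0
    for probability, is_damaging in scored_edits:
        flagged = probability >= threshold
        if flagged and is_damaging:
            tp += 1
        elif flagged and not is_damaging:
            fp += 1
        elif not flagged and is_damaging:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    return {
        # Of the edits we flag, how many are really damaging?
        "precision": tp / (tp + fp) if (tp + fp) else None,
        # Of the damaging edits, how many do we flag?
        "recall": tp / (tp + fn) if (tp + fn) else None,
        # What share of edits falls below the threshold (not flagged)?
        "filter_rate": (tn + fn) / total if total else None,
    }


# Made-up example data, not real ORES scores.
sample = [(0.9, True), (0.8, False), (0.6, True), (0.3, False), (0.1, False)]
stats = threshold_stats(sample, 0.5)
```

Raising the threshold generally trades recall for precision, which is the kind of trade-off an interface designer would want to tune without waiting on us.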
*ORES training / labeling campaigns:*
Thanks to a lot of networking at the Wikimedia Conference and some help from Ijon (Asaf Bartov), we've found a bunch of new collaborators to help us deploy ORES to new wikis. A critical step in this process is deploying labeling campaigns so that Wikipedians can help us train ORES.
We've got new editquality labeling campaigns deployed to Albanian[28], Finnish[29], Latvian[30], Korean[31], and Turkish[32] Wikipedias.
We've also been working on a new type of model: "Item quality" in Wikidata. We've deployed, labeled, and analyzed a pilot[33], fixed some critical bugs that came up[34,35], and we've finally launched a 5k item campaign which is already 17% done[36]! See https://www.wikidata.org/wiki/Wikidata:Item_quality_campaign if you'd like to help us out.
28. https://phabricator.wikimedia.org/T161981 -- Edit quality campaign for Albanian Wikipedia
29. https://phabricator.wikimedia.org/T161905 -- Edit quality campaign for Finnish Wikipedia
30. https://phabricator.wikimedia.org/T162032 -- Edit quality campaign for Latvian Wikipedia
31. https://phabricator.wikimedia.org/T161622 -- Deploy editquality campaign in Korean Wikipedia
32. https://phabricator.wikimedia.org/T161977 -- Start v2 editquality campaign for trwiki
33. https://phabricator.wikimedia.org/T159570 -- Deploy the pilot of Wikidata item quality campaign
34. https://phabricator.wikimedia.org/T160256 -- Wikidata items render badly in Wikilabels
35. https://phabricator.wikimedia.org/T162530 -- Implement "unwanted pages" filtering strategy for Wikidata
36. https://phabricator.wikimedia.org/T157493 -- Deploy Wikidata item quality campaign
*Bug fixing:*
As usual, we had a few weird bugs that got in our way. We needed to move to a bigger virtual machine in "Beta Labs" because our models take up a bunch of hard drive space[37]. We found that Wikilabels wasn't removing expired tasks correctly and that this was making it difficult to finish labeling campaigns[38]. We also had a lot of right-to-left issues when we did an upgrade of OOjs UI[39]. There was also an old bug with https://translatewiki.net in one of our message keys[40].
37. https://phabricator.wikimedia.org/T160762 -- deployment-ores-redis /srv/ redis is too small (500MBytes)
38. https://phabricator.wikimedia.org/T161521 -- Wikilabels is not cleaning up expired tasks for Wikidata item quality campaign
39. https://phabricator.wikimedia.org/T161533 -- Fix RTL issues in Wikilabels after OOjs UI upgrade
40. https://phabricator.wikimedia.org/T132197 -- qqq for a wiki-ai message cannot be loaded
-Aaron
Principal Research Scientist
Head of the Scoring Platform Team