Hey folks,

In this update, I'm going to change some things up to try and make this update easier for you to consume. The biggest change you'll notice is that I've broken up the [#] references in each section. I hope that saves you some scrolling and confusion. You'll also notice that I have changed the subject line from "Revision scoring" to "Scoring Platform" because it's now clear that, come July, I'll be leading a new team with that name at the Wikimedia Foundation. There'll be an announcement about that coming once our budget is finalized. I'll try to keep this subject consistent for the foreseeable future so that your email clients will continue to group the updates into one big thread.

Deployments & maintenance:

In this cycle, we've gotten better at tracking our deployments and noting what changes do out with each deployment. You can click on the phab task for a deployment and observe the sub-tasks to find out what was deployed. We had 3 deployments for ORES since mid-march[1,2,3]. We've had two deployments to Wikilabels[4,5] and we've added a maintenance notices for a short period of downtime that's coming up on April 21st[6,7].

1. https://phabricator.wikimedia.org/T160279 -- Deploy ores in prod (Mid-March)

2. https://phabricator.wikimedia.org/T160638 -- Deploy ORES late march

3. https://phabricator.wikimedia.org/T161748 -- Deploy ORES early April

4. https://phabricator.wikimedia.org/T161002 -- Late march wikilabels deployment

5. https://phabricator.wikimedia.org/T163016 -- Deploy Wikilabels mid-April

6. https://phabricator.wikimedia.org/T162888 -- Add header to Wikilabels that warns of upcoming maintenance.

7. https://phabricator.wikimedia.org/T162265 -- Manage wikilabels for labsdb1004 maintenance

Making ORES better:

We've been working to make ORES easier to extend and more useful. ORES now reports it's relevant versions at https://ores.wikimedia.org/versions[8]. We've also reduced the complexity of our "precaching" system that scores edits before you ask for them[9,10]. We're taking advantage of logstash to store and query our logs[11]. We've also implemented some nice abstractions for requests and responses in ORES[12] that allowed us to improve our metrics tracking substantially[13].

8. https://phabricator.wikimedia.org/T155814 -- Expose version of the service and its dependencies

9. https://phabricator.wikimedia.org/T148714 -- Create generalized "precache" endpoint for ORES

10. https://phabricator.wikimedia.org/T162627 -- Switch `/precache` to be a POST end point

11. https://phabricator.wikimedia.org/T149010 -- Send ORES logs to logstash

12. https://phabricator.wikimedia.org/T159502 -- Exclude precaching requests from cache_miss/cache_hit metrics

13. https://phabricator.wikimedia.org/T161526 -- Implement ScoreRequest/ScoreResponse pattern in ORES

New functionality:

In the last month and a half, we've added basic support to Korean Wikipedia[14,15]. Props to Revi for helping us work through a bunch of issues with our Korean language support[16,17,18].

We've also gotten the ORES Review tool deployed to Hebrew Wikipedia[19,20,21,22] and Estonian Wikipedia[23,24,25]. We're also working with the Collaboration team to implement the threshold test statistics that they need to tune their new Edit Review interface[26] and we're working towards making this kind of work self-serve so that that product team and other tool developers won't have to wait on us to implement these threshold stats in the future[27].

14. https://phabricator.wikimedia.org/T161617 -- Deploy reverted model for kowiki

15. https://phabricator.wikimedia.org/T161616 -- Train/test reverted model for kowiki

16. https://phabricator.wikimedia.org/T160752 -- Korean generated word lists are in chinese

17. https://phabricator.wikimedia.org/T160757 -- Add language support for Korean

18. https://phabricator.wikimedia.org/T160755 -- Fix tokenization for Korean

19. https://phabricator.wikimedia.org/T161621 -- Deploy ORES Review Tool for hewiki

20. https://phabricator.wikimedia.org/T130284 -- Deploy edit quality models for hewiki

21. https://phabricator.wikimedia.org/T160930 -- Train damaging and goodfaith models for hewiki

22. https://phabricator.wikimedia.org/T130263 -- Complete hewiki edit quality campaign

23. https://phabricator.wikimedia.org/T159609 -- Deploy ORES review tool to etwiki

24. https://phabricator.wikimedia.org/T130280 -- Deploy edit quality models for etwiki

25. https://phabricator.wikimedia.org/T129702 -- Complete etwiki edit quality campaign

26. https://phabricator.wikimedia.org/T162377 -- Implement additional test_stats in editquality

27. https://phabricator.wikimedia.org/T162217 -- Implement "thresholds", deprecate "pile of tests_stats"

ORES training / labeling campaigns:

Thanks to a lot of networking at Wikimedia Conference and some help from Ijon (Asaf Batrov), we've found a bunch of new collaborators to help us deploy ORES to new wikis. As is critcial in this process, we need to deploy labeling campaigns so that Wikipedians can help us train ORES.

We've got new editquality labeling campaigns deployed to Albanian[28], Finnish[29], Latvian[30], Korean[31], and Turkish[21] Wikipedias.

We've also been working on a new type of model: "Item quality" in Wikidata. We've deployed, labeled, and analyzed a pilot[33], fixed some critical bugs that came up[34,35], and we've finally launched a 5k item campaign which is already 17% done[36]! See https://www.wikidata.org/wiki/Wikidata:Item_quality_campaign if you'd like to help us out.

28. https://phabricator.wikimedia.org/T161981 -- Edit quality campaign for Albanian Wikipedia

29. https://phabricator.wikimedia.org/T161905 -- Edit quality campaign for Finnish Wikipedia

30. https://phabricator.wikimedia.org/T162032 -- Edit quality campaign for Latvian Wikipedia

31. https://phabricator.wikimedia.org/T161622 -- Deploy editquality campaign in Korean Wikipedia

32. https://phabricator.wikimedia.org/T161977 -- Start v2 editquality campaign for trwiki

33. https://phabricator.wikimedia.org/T159570 -- Deploy the pilot of Wikidata item quality campaign

34. https://phabricator.wikimedia.org/T160256 -- Wikidata items render badly in Wikilabels

35. https://phabricator.wikimedia.org/T162530 -- Implement "unwanted pages" filtering strategy for Wikidata

36. https://phabricator.wikimedia.org/T157493 -- Deploy Wikidata item quality campaign

Bug fixing:

As usual, we have a few weird bug that got in our way. We needed to move to a bigger virtual machine in "Beta Labs" because our models take up a bunch of hard drive space[37]. We found that Wikilabels wasn't removing expired tasks correctly and that this was making it difficult to finish labeling campaigns[38]. We also had a lot of right-to-left issues when we did an upgrade of OOjs UI[38]. There was an old bug we had with https://translatewiki.net in one of our message keys[39].

37. https://phabricator.wikimedia.org/T160762 -- deployment-ores-redis /srv/ redis is too small (500MBytes)

38. https://phabricator.wikimedia.org/T161521 -- Wikilabels is not cleaning up expired tasks for Wikidata item quality campaign

39. https://phabricator.wikimedia.org/T161533 -- Fix RTL issues in Wikilabels after OOjs UI upgrade

40. https://phabricator.wikimedia.org/T132197 -- qqq for a wiki-ai message cannot be loaded

-Aaron

Principal Research Scientist

Head of the Scoring Platform Team