Hey folks,
In this update, I'm going to change some things up to try and make it
easier for you to consume. The biggest change you'll notice is that
I've broken up the [#] references so that they follow each section. I hope that saves you
some scrolling and confusion. You'll also notice that I have changed the
subject line from "Revision scoring" to "Scoring Platform" because it's now
clear that, come July, I'll be leading a new team with that name at the
Wikimedia Foundation. There'll be an announcement about that coming once
our budget is finalized. I'll try to keep this subject consistent for the
foreseeable future so that your email clients will continue to group the
updates into one big thread.
*Deployments & maintenance:*
In this cycle, we've gotten better at tracking our deployments and noting
what changes go out with each deployment. You can click on the phab task
for a deployment and look at its sub-tasks to find out what was deployed.
We've had 3 deployments for ORES since mid-March[1,2,3]. We've had two
deployments to Wikilabels[4,5] and we've added a maintenance notice for a
short period of downtime that's coming up on April 21st[6,7].
1. https://phabricator.wikimedia.org/T160279 -- Deploy ores in prod (Mid-March)
2. https://phabricator.wikimedia.org/T160638 -- Deploy ORES late march
3. https://phabricator.wikimedia.org/T161748 -- Deploy ORES early April
4. https://phabricator.wikimedia.org/T161002 -- Late march wikilabels deployment
5. https://phabricator.wikimedia.org/T163016 -- Deploy Wikilabels mid-April
6. https://phabricator.wikimedia.org/T162888 -- Add header to Wikilabels that warns of upcoming maintenance.
7. https://phabricator.wikimedia.org/T162265 -- Manage wikilabels for labsdb1004 maintenance
*Making ORES better:*
We've been working to make ORES easier to extend and more useful. ORES now
reports its relevant versions at https://ores.wikimedia.org/versions[8].
We've also reduced the complexity of our "precaching" system that scores
edits before you ask for them[9,10]. We're taking advantage of logstash to
store and query our logs[11]. We've also implemented some nice
abstractions for requests and responses in ORES[13] that allowed us to
improve our metrics tracking substantially[12].
8. https://phabricator.wikimedia.org/T155814 -- Expose version of the service and its dependencies
9. https://phabricator.wikimedia.org/T148714 -- Create generalized "precache" endpoint for ORES
10. https://phabricator.wikimedia.org/T162627 -- Switch `/precache` to be a POST end point
11. https://phabricator.wikimedia.org/T149010 -- Send ORES logs to logstash
12. https://phabricator.wikimedia.org/T159502 -- Exclude precaching requests from cache_miss/cache_hit metrics
13. https://phabricator.wikimedia.org/T161526 -- Implement ScoreRequest/ScoreResponse pattern in ORES
*New functionality:*
In the last month and a half, we've added basic support for Korean
Wikipedia[14,15]. Props to Revi for helping us work through a bunch of
issues with our Korean language support[16,17,18].
We've also gotten the ORES Review tool deployed to Hebrew
Wikipedia[19,20,21,22] and Estonian Wikipedia[23,24,25]. We're also
working with the Collaboration team to implement the threshold test
statistics that they need to tune their new Edit Review interface[26] and
we're working towards making this kind of work self-serve so that the
product team and other tool developers won't have to wait on us to
implement these threshold stats in the future[27].
14. https://phabricator.wikimedia.org/T161617 -- Deploy reverted model for kowiki
15. https://phabricator.wikimedia.org/T161616 -- Train/test reverted model for kowiki
16. https://phabricator.wikimedia.org/T160752 -- Korean generated word lists are in chinese
17. https://phabricator.wikimedia.org/T160757 -- Add language support for Korean
18. https://phabricator.wikimedia.org/T160755 -- Fix tokenization for Korean
19. https://phabricator.wikimedia.org/T161621 -- Deploy ORES Review Tool for hewiki
20. https://phabricator.wikimedia.org/T130284 -- Deploy edit quality models for hewiki
21. https://phabricator.wikimedia.org/T160930 -- Train damaging and goodfaith models for hewiki
22. https://phabricator.wikimedia.org/T130263 -- Complete hewiki edit quality campaign
23. https://phabricator.wikimedia.org/T159609 -- Deploy ORES review tool to etwiki
24. https://phabricator.wikimedia.org/T130280 -- Deploy edit quality models for etwiki
25. https://phabricator.wikimedia.org/T129702 -- Complete etwiki edit quality campaign
26. https://phabricator.wikimedia.org/T162377 -- Implement additional test_stats in editquality
27. https://phabricator.wikimedia.org/T162217 -- Implement "thresholds", deprecate "pile of tests_stats"
*ORES training / labeling campaigns:*
Thanks to a lot of networking at Wikimedia Conference and some help from
Ijon (Asaf Bartov), we've found a bunch of new collaborators to help us
deploy ORES to new wikis. A critical step in this process is deploying
labeling campaigns so that Wikipedians can help us train ORES.
We've got new editquality labeling campaigns deployed to Albanian[28],
Finnish[29], Latvian[30], Korean[31], and Turkish[32] Wikipedias.
We've also been working on a new type of model: "Item quality" in
Wikidata. We've deployed, labeled, and analyzed a pilot[33], fixed some
critical bugs that came up[34,35], and we've finally launched a 5k item
campaign which is already 17% done[36]! See
https://www.wikidata.org/wiki/Wikidata:Item_quality_campaign if you'd like
to help us out.
28. https://phabricator.wikimedia.org/T161981 -- Edit quality campaign for Albanian Wikipedia
29. https://phabricator.wikimedia.org/T161905 -- Edit quality campaign for Finnish Wikipedia
30. https://phabricator.wikimedia.org/T162032 -- Edit quality campaign for Latvian Wikipedia
31. https://phabricator.wikimedia.org/T161622 -- Deploy editquality campaign in Korean Wikipedia
32. https://phabricator.wikimedia.org/T161977 -- Start v2 editquality campaign for trwiki
33. https://phabricator.wikimedia.org/T159570 -- Deploy the pilot of Wikidata item quality campaign
34. https://phabricator.wikimedia.org/T160256 -- Wikidata items render badly in Wikilabels
35. https://phabricator.wikimedia.org/T162530 -- Implement "unwanted pages" filtering strategy for Wikidata
36. https://phabricator.wikimedia.org/T157493 -- Deploy Wikidata item quality campaign
*Bug fixing:*
As usual, we had a few weird bugs that got in our way. We needed to move
to a bigger virtual machine in "Beta Labs" because our models take up a
bunch of hard drive space[37]. We found that Wikilabels wasn't removing
expired tasks correctly and that this was making it difficult to finish
labeling campaigns[38]. We also had a lot of right-to-left issues when we
did an upgrade of OOjs UI[39]. There was also an old bug we had with
https://translatewiki.net in one of our message keys[40].
37. https://phabricator.wikimedia.org/T160762 -- deployment-ores-redis /srv/ redis is too small (500MBytes)
38. https://phabricator.wikimedia.org/T161521 -- Wikilabels is not cleaning up expired tasks for Wikidata item quality campaign
39. https://phabricator.wikimedia.org/T161533 -- Fix RTL issues in Wikilabels after OOjs UI upgrade
40. https://phabricator.wikimedia.org/T132197 -- qqq for a wiki-ai message cannot be loaded
-Aaron
Principal Research Scientist
Head of the Scoring Platform Team