---------- Forwarded message ----------
From: Aaron Halfaker <aaron.halfaker@gmail.com>
Date: Thu, Mar 16, 2017 at 2:14 PM
Subject: Re: [Wikitech-l] The Revision Scoring weekly update
To: Application of Artificial Intelligence and other advanced computing strategies to Wikimedia Projects <ai@lists.wikimedia.org>
Cc: wikitech-l <wikitech-l@lists.wikimedia.org>
Hey folks!
I should really stop calling this a weekly update because it's getting a
bit silly at this point. :) But if it were a weekly update, it would
cover the weeks of 42 - 46.
*Highlights:*
- 3 new models: Finnish Wikipedia (reverted) and Estonian Wikipedia
(damaging & goodfaith)
- We estimated and agreed on funding for ORES servers in the next year
with Operations
- We published a paper about vandalism detection in Wikidata and a blog
post about the massive effect of some initiatives on coverage of Women
Scientists in Wikipedia.
*New development:*
- We added recall-based threshold metrics to the new draftquality model,
which should help tool devs know which new page creations to highlight
for review[1]
- We added optional notices for ORES pages which will help us visually
distinguish our experimental install in WMFlabs from the Prod install
(ores.wikimedia.org)[2]
- We added basic language support for Finnish (Thanks 4shadoww)[3] and
deployed a 'reverted' model[4]
- We led a discussion in Wikidata about "item quality" that resulted in
a Wikipedia 1.0-like scale for Wikidata quality[5,6] and designed a
Wikilabels form to capture the gist of it[7]
- We enabled the ORES Review Tool on Czech Wikipedia[8]
- We configured ChangeProp to use our new minified JSON output to save
bandwidth[9]
- We extended the Estonian language assets (Thanks Cumbril)[10] and
deployed the 'damaging' and 'goodfaith' models[11,12]
- We enabled a testing model for 'goodfaith' on the Beta Cluster to make
it easier for the Collaboration team to run tests with their new filter
interface[13]
- We created a new "precache" endpoint that will allow us to
de-duplicate configuration with ChangeProp and handle all routing in ORES
locally[14]
*Resourcing:*
- We completed a 2-year estimate of ORES resource needs and discussed
funding (capital expenditure) for ORES in the coming fiscal year[15]. This
will allow us to continue to grow ORES both in number of models and in
scoring capacity.
*Communications:*
- Amir improved the KDD paper based on review feedback[16] and got it
published[17]
- We published a blog post about our measurements of WikiProject Women
Scientists[18,19] -- "The Keilana Effect"
- Thanks to Cumbril's work, the Estonian labeling campaign was
finished[20]
*Deployments:*
- In early February, we deployed a new set of translations to Wikilabels
(specifically targeting Romanian Wikipedia)[21]
- In mid-February, we deployed some fixes to ORES documentation and
response formatting[22]
- In mid-March, we deployed 3 new scoring models and ORES notices[23]
*Maintenance and robustness:*
- We fixed a serious issue in the "mwoauth" library that Wikilabels
depends on[24]
- We reduced the number of revisions per request that we could receive
via api.php[25]
- We investigated a scap issue that broke ORES deployment[26]
- We fixed a minor issue with JSON minification behavior[27] and
hard-coding of the location of ORES in the documentation[28]
- We improved performance of ORES filters on MediaWiki[29]
- We improved the language describing ORES behavior on
Special:Contributions[30]
- We added a notice to the Wikipages that Dexbot maintains about its
behavior[31]
- We added notices to ores.wmflabs.org about its experimental nature[32]
- We fixed some issues with testing Finnish language assets[33]
- We fixed some styling issues that resulted from an upgrade of OOJS
UI[34]
1. https://phabricator.wikimedia.org/T157454 -- Add recall based thresholds to draftquality model
2. https://phabricator.wikimedia.org/T150962 -- Add an optional notice to ORES main and ui pages
3. https://phabricator.wikimedia.org/T158587 -- Add language support for Finnish
4. https://phabricator.wikimedia.org/T160228 -- Train/test reverted model for fiwiki
5. https://phabricator.wikimedia.org/T157489 -- [Discuss] item quality in Wikidata
6. https://www.wikidata.org/wiki/Wikidata:Item_quality
7. https://phabricator.wikimedia.org/T155828 -- Design item_quality form for Wikidata
8. https://phabricator.wikimedia.org/T151611 -- Enable ORES Review Tool on Czech Wikipedia
9. https://phabricator.wikimedia.org/T157693 -- Use minified JSON format in ChangeProp
10. https://phabricator.wikimedia.org/T160193 -- Extend estonian language assets from Wiki page
11. https://phabricator.wikimedia.org/T159608 -- Train/test damaging/goodfaith models for etwiki
12. https://phabricator.wikimedia.org/T130280 -- Deploy edit quality models for etwiki
13. https://phabricator.wikimedia.org/T160467 -- Enable 'goodfaith' on testwiki on Beta Cluster
14. https://phabricator.wikimedia.org/T148714 -- Create generalized "precache" endpoint for ORES
15. https://phabricator.wikimedia.org/T157222 -- Estimate ORES capex for FY2017-18
16. https://phabricator.wikimedia.org/T148443 -- Improve the KDD paper based on the review
17. https://arxiv.org/abs/1703.03861
18. https://phabricator.wikimedia.org/T160078 -- Blog post about wp10 measurements of Women Scientists
19. https://blog.wikimedia.org/2017/03/07/the-keilana-effect/
20. https://phabricator.wikimedia.org/T129702 -- Complete etwiki edit quality campaign
21. https://phabricator.wikimedia.org/T157580 -- Deploy Romanian translations for Wiki labels
22. https://phabricator.wikimedia.org/T157842 -- Prod deployment of ORES
23. https://phabricator.wikimedia.org/T160279 -- Deploy ores in prod (Mid-March)
24. https://phabricator.wikimedia.org/T157858 -- mwoauth is broken
25. https://phabricator.wikimedia.org/T157983 -- Reduce the number of revisions that can be requested in one batch
26. https://phabricator.wikimedia.org/T157623 -- Investigate failed ORES deployment
27. https://phabricator.wikimedia.org/T157721 -- Investigate default JSON minification behavior in production
28. https://phabricator.wikimedia.org/T157723 -- ORES swagger is hard-coded for wmflabs
29. https://phabricator.wikimedia.org/T152585 -- rcshow=oresreview is slow
30. https://phabricator.wikimedia.org/T158862 -- Fix message in Special:Contributions
31. https://phabricator.wikimedia.org/T158899 -- Add notice about Dexbot overwriting manual changes to our tracking table
32. https://phabricator.wikimedia.org/T159055 -- Add a notice to ores-wmflabs-deploy about "experimental" nature
33. https://phabricator.wikimedia.org/T160192 -- Fix testing issues in finnish language assets
34. https://phabricator.wikimedia.org/T160258 -- Fix minor styling issues with OOJS-UI in wikilabels
Sincerely,
Aaron from the Scoring Platform team