[AI] The Revision scoring weekly update

10 Nov 2016

Hey,

This is the 29th weekly update that the Revision Scoring team has sent to
this mailing list.

Deployments:

   - We deployed logging changes to ORES that reduce its log verbosity[1]

   - We also deployed revscoring 1.3.0 and new models built with it to WMF
   labs[2].  This won't change anything important from a user's perspective,
   but it paves the way for developing new modeling strategies.

Maintenance and robustness:

   - We fixed puppet so that log file directories are also created on the
   celery worker nodes (affects wmflabs)[3]

   - We fixed our recall_at_fpr metric, which was incorrectly defined, and
   implemented a recall_at_precision metric to take its place[4]
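
For anyone unfamiliar with the idea, recall at precision is the best recall a
model can reach at any score threshold that still meets a minimum precision.
The sketch below only illustrates that definition. It is not the revscoring
implementation, and the function name and arguments are made up for this
example.

```python
def recall_at_precision(scores, labels, min_precision):
    """Best recall over all thresholds whose precision >= min_precision.

    Illustrative only: the name and signature are hypothetical and are
    not taken from revscoring.
    """
    total_positives = sum(1 for label in labels if label)
    if total_positives == 0:
        return 0.0

    # Rank observations from highest to lowest score, then sweep the
    # threshold downward one distinct score at a time.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_recall = 0.0
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every observation tied at this score so that the
        # threshold is well defined.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        precision = true_pos / (true_pos + false_pos)
        if precision >= min_precision:
            best_recall = max(best_recall, true_pos / total_positives)
    return best_recall


# Toy data, invented for this example.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [True, True, False, True, False]
print(recall_at_precision(scores, labels, 0.75))
print(recall_at_precision(scores, labels, 0.9))
```

Raising the minimum precision can only keep recall the same or lower it, so
the metric answers "how much can we catch while staying at least this
precise?"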

New development:

   - We've made a lot of progress on modeling sentences and have just
   started experimenting with a sentence model from featured articles[5]

   - We're reviewing a dataset of spam/vandalism/attack new page creations
   for public release[6].  This dataset will help our collaborators work with
   us on modeling the quality of drafts and supporting new page triage.

1. https://phabricator.wikimedia.org/T149730 -- Deploy logging changes to
ORES
2. https://phabricator.wikimedia.org/T150447 -- Deploy revscoring 1.3.0 and
updated editquality and wikiclass to wmflabs
3. https://phabricator.wikimedia.org/T149925 -- /srv/log/ores/ not created
on worker nodes
4. https://phabricator.wikimedia.org/T149825 -- Implement recall at
precision (and fix FPR metrics)
5. https://phabricator.wikimedia.org/T148867 -- Implement sentences
datasources & experiment with normalization.
6. https://phabricator.wikimedia.org/T150307 -- Create manually vetted
dataset of spam/vandalism/attack pages

Sincerely,
Aaron from the Revision Scoring team
