New models coming to ORES & notes - AI

20 Aug 2016

Hey folks,

We've been working on generating some updated models for ORES.  These
models will behave slightly differently from the models that we currently
have deployed.  Some difference is a natural artifact of retraining, even on
the *exact same data*, because of some random properties of the learning
algorithms.  For the most part, this should be a non-issue for any
tools that use ORES.  However, I wanted to take this opportunity to
highlight some of the facilities ORES provides to help automatically detect
and adjust for these types of changes.

*== Versions ==*
ORES provides information about all of the models.  This information
includes a model version number.  If you are caching ORES scores locally,
we recommend invalidating old scores whenever this version number changes.
For example, https://ores.wikimedia.org/v2/scores/enwiki/damaging/12345678
currently returns:

{
  "scores": {
    "enwiki": {
      "damaging": {
        "scores": {
          "12345678": {
            "prediction": false,
            "probability": {
              "false": 0.7141333465390294,
              "true": 0.28586665346097057
            }
          }
        },
        "version": "0.1.1"
      }
    }
  }
}

This score was generated with the "0.1.1" version of the model.  But once
we deploy the new models, the same request will return:
{
  "scores": {
    "enwiki": {
      "damaging": {
        "scores": {
          "12345678": {
            "prediction": false,
            "probability": {
              "false": 0.8204647324045306,
              "true": 0.17953526759546945
            }
          }
        },
        "version": "0.1.2"
      }
    }
  }
}

Note that the version number changes to "0.1.2" and the probabilities
change slightly.  In this case, we're essentially re-training the same
model in a similar way, so we increment the "patch" number.
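
For tools that cache scores, here is a minimal sketch of the invalidation idea in Python.  It only assumes the response shape shown above; the ScoreCache class and its method names are made up for illustration and are not part of ORES or any client library.

```python
def extract_version(response, wiki, model):
    """Return the model version string from a parsed v2 scores response."""
    return response["scores"][wiki][model]["version"]


class ScoreCache:
    """Hypothetical in-memory cache that drops scores when the version changes."""

    def __init__(self):
        self._versions = {}  # (wiki, model) -> version string last seen
        self._scores = {}    # (wiki, model) -> {rev_id string: score dict}

    def update(self, response, wiki, model):
        """Store the scores in a response, clearing stale ones first."""
        key = (wiki, model)
        version = extract_version(response, wiki, model)
        if self._versions.get(key) != version:
            # New model version: everything cached for this model is stale.
            self._scores[key] = {}
            self._versions[key] = version
        self._scores[key].update(response["scores"][wiki][model]["scores"])

    def get(self, wiki, model, rev_id):
        """Return the cached score for a revision, or None if not cached."""
        return self._scores.get((wiki, model), {}).get(str(rev_id))
```

Calling update() with every parsed response is enough; a score cached under "0.1.1" disappears as soon as a response reporting "0.1.2" arrives for the same wiki and model.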

However, we're switching modeling strategies for the article quality models
(enwiki-wp10, frwiki-wp10 & ruwiki-wp10), so those versions increment the
minor version from "0.3.2" to "0.4.0".  In principle, those models could
show larger changes in prediction probabilities, but a quick spot check
suggests that the changes are not substantial.
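
If a tool wants to react differently to a patch bump than to a minor bump, something like the following sketch could work.  It assumes version strings always consist of three dot-separated integers, as in the examples in this message; that is an assumption on my part, not a documented guarantee, and the function names are illustrative.

```python
def parse_version(version):
    """Split a version string like "0.3.2" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def change_level(old, new):
    """Name the most significant version component that differs."""
    names = ("major", "minor", "patch")
    for name, a, b in zip(names, parse_version(old), parse_version(new)):
        if a != b:
            return name
    return "none"
```

With the versions mentioned here, change_level("0.1.1", "0.1.2") gives "patch" and change_level("0.3.2", "0.4.0") gives "minor".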

*== Test statistics and thresholding ==*
Many tools that use our edit quality models (reverted, damaging and
goodfaith) set thresholds for flagging edits for review.  In order to
support these tools, we produce test statistics that suggest useful
thresholds.

https://ores.wmflabs.org/v2/scores/enwiki/damaging/?model_info=test_stats
produces:

      ...
            "filter_rate_at_recall(min_recall=0.75)": {
              "filter_rate": 0.869,
              "recall": 0.752,
              "threshold": 0.492
            },
            "filter_rate_at_recall(min_recall=0.9)": {
              "filter_rate": 0.753,
              "recall": 0.902,
              "threshold": 0.173
            },
      ...

These two statistics show useful thresholds for detecting damaging edits.
E.g. if you want to be sure that you catch nearly all vandalism (recall of
about 0.90) and are OK with a higher false-positive rate, set the threshold
at 0.173, but if you'd like to catch most vandalism (recall of about 0.75)
with almost no false positives, set the threshold at 0.492.  These fields
can be read automatically by tools so that they do not need to be manually
updated every time that we deploy a new model.
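
As a rough sketch of reading these fields automatically, the Python below looks up a threshold by its statistic name and applies it to a score.  The excerpt above elides where the statistics sit inside the full response, so the sketch takes the already-extracted statistics dict as input.  The function names, and the choice to flag when the "true" probability is greater than or equal to the threshold, are my assumptions.

```python
def threshold_for_recall(test_stats, min_recall):
    """Look up the suggested threshold for a given minimum recall.

    test_stats is the dict that maps statistic names such as
    "filter_rate_at_recall(min_recall=0.9)" to their values.
    """
    key = "filter_rate_at_recall(min_recall={})".format(min_recall)
    return test_stats[key]["threshold"]


def needs_review(score, threshold):
    """Flag an edit when its "true" probability reaches the threshold.

    Using >= rather than > is an assumption of this sketch.
    """
    return score["probability"]["true"] >= threshold
```

With the numbers from this message, the 12345678 score under version "0.1.1" (a "true" probability of about 0.286) would be flagged at the 0.173 threshold but not at 0.492.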

Let me know if you have any questions and happy hacking!

-Aaron