James said:
revision scoring as a service will
not actually categorize the nature of what it is learning.
See https://en.wikipedia.org/wiki/Wikipedia:Labels/Edit_quality

We're almost ready to train and deploy a model with some nuance in its predictions, based on the *reason* that something should be reverted (e.g., damaging/not-damaging and good-faith/bad-faith). We've already completed the labeling campaign for Portuguese Wikipedia, and we're nearly done for Turkish, Persian, and English.
Beyond that work, I think there's a fun clustering project to be done here to discover categories of revert reasons. I'm always looking for collaborators to advise on these types of fun projects. *hint hint*
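To give a rough sense of what that clustering project could look like, here's a toy sketch: a stdlib-only k-means over bag-of-words vectors of revert comments. The comments and the approach are purely illustrative assumptions on my part, not how Revision Scoring actually works -- a real attempt would use proper feature extraction and a real clustering library.

```python
from collections import Counter
import math

# Hypothetical revert comments, for illustration only.
comments = [
    "rv vandalism",
    "reverted vandalism by anon",
    "undo spam link",
    "rv spam",
    "reverted test edit",
    "undo test",
]

def vectorize(text):
    """Bag-of-words vector as a token -> count mapping."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(vectors, k, iters=10):
    # Seed centroids with the first k vectors (deterministic, for the sketch).
    centroids = [Counter(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assign each comment to its most similar centroid.
        assign = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Recompute each centroid as the summed word counts of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = sum(members, Counter())
    return assign

vectors = [vectorize(c) for c in comments]
labels = kmeans(vectors, k=3)
for comment, label in zip(comments, labels):
    print(label, comment)
```

With real data you'd cluster on richer features than raw comment tokens (templates used, diff content, editor history), but even this crude version groups comments that share vocabulary like "vandalism" or "test".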
But when it comes down to it, I think our best measures of value-added won't come from the output of a machine classifier, but from careful work in measurement theory. As Pine hopes ("assigning value to edits or editors; I would still like that project to go forward."), the project is continuing to move forward -- just slower than I had planned. Due to the massive interest in Revision Scoring, I've been putting a lot more of my time there recently.
Again, I'm always looking for collaborators on these projects. I do as much work as I can to get them online, and I have a small team working with me, but we can always use a hand. There are lots of ways to contribute -- you don't need to code. We need help labeling edits, doing outreach on new wikis we'd like to support, and translating our software and docs.
-Aaron