complicated process, so we'll likely not be able to rename it in the
foreseeable future.
Why?
Renaming is usually a bad thing because it often confuses the hell out of
users, but from a technical perspective it is pretty trivial.
--
Bawolff
On Friday, August 4, 2023, Luca Toscano <ltoscano(a)wikimedia.org> wrote:
Hi Amir!
Answering inline:
On Thu, Aug 3, 2023 at 10:11 PM Amir E. Aharoni <
amir.aharoni(a)mail.huji.ac.il> wrote:
The email says that "All ML models currently accessible on ORES are also
currently accessible on Lift Wing", and if I understand correctly, this
means that this feature in Recent Changes will keep working. Do I
understand correctly? :)
Definitely yes, we are working on migrating the ORES extension to Lift
Wing, without any change required for users. The tracking task is
https://phabricator.wikimedia.org/T319170. At the moment all wikis with
the ORES extension enabled, except fi/en/wikidata, are already using models
from Lift Wing.
In addition, I have some followup questions:
1. The MediaWiki extension that implements the frontend in Recent Changes
is itself named "ORES". It's an internal name that isn't seen much by wiki
editors except if they go to Special:Version or to translatewiki.
Nevertheless, as time goes by, seeing the old name may start getting
weird. So what's the plan about it? Will this extension remain as is? Will
it be renamed? Will it be replaced with a new frontend extension in the
foreseeable future?
This is a good question and we don't have a definitive answer at the
moment. Our understanding is that renaming extensions in MediaWiki is a
long and complicated process, so we'll likely not be able to rename it in
the foreseeable future. We would definitely like to add more models to RC
Filters, for example Revert Risk (for the curious, see
https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Language-agnostic_revert_risk),
but we are not sure yet whether it is worth creating a new extension or just
expanding the ORES one. We'll get back to this list as soon as we have a
better plan :)
2. Back when ORES was originally developed and deployed around 2017,
several wiki editors' communities participated in the development by
adapting the product to the needs of their wikis and languages by
translating the ORES extension's user interface and, more importantly, by
labelling a sample of several thousand diffs from their wiki using the
Wikilabels tool. The communities that did that whole process were, more or
less, the communities to which this Recent Changes enhancement was
deployed. Will anything like that have to be done again along with the move
away from ORES?
The first goal of Lift Wing is to provide a more modern and easy-to-use
infrastructure to host models at the WMF, for internal teams and for the
community. The focus of the Machine Learning team is to provide
infrastructure to run models on, so other teams and the community will be
able to propose what to host and we'll vet what is possible and what is not
(following strict criteria like security of data and PII, model
architecture feasibility, etc.). Once a model is deployed on Lift Wing,
there will be a team or a community group owning it, that is, responsible for
its development in terms of features etc. (more info in
https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing#Hosting_a_model).
To summarize:
* All the work done so far with ORES models will be preserved: it is
already available on Lift Wing and anybody can use it. We hope that it is
now easier to play with model servers and improve them (for WMF and the
community), but we are open to any suggestion and feedback about how to
improve it. For the curious, more details are on the Lift Wing Wikitech page (
https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing).
* The future work will be split into two main areas (as I see it):
** The ML team will keep working on improving the infrastructure,
documentation, performance, etc. of Lift Wing, to provide better tools and
data access for any new idea related to models and their usage. We'll
maintain the infrastructure with monitoring, alarms, etc., so the day-to-day
ops will not fall on the model owners (WMF and community), and they
will be able to concentrate on the models and their future
steps.
** Other WMF teams like Research will propose and work on new models that
the community needs, but we'll also focus on improving what is currently
being used. For example, most of the ORES traffic is for the goodfaith and
damaging models, which have worked very well over the years but rely on old
training data and architectures. The Revert Risk models (for example,
https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Language-agnostic_revert_risk)
are an attempt to improve the reliability and performance of the
aforementioned models, using a single score instead of multiple ones.
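
For readers who want to try one of the migrated models, here is a minimal
sketch of what a request might look like. The endpoint URL pattern, the
`:predict` suffix, and the `rev_id` payload are assumptions and should be
checked against the Lift Wing Wikitech page linked above. `build_request` and
`fetch_score` are hypothetical helper names, and the User-Agent value is a
placeholder.

```python
import json
import urllib.request

# Assumed base URL of the public Lift Wing inference endpoint; verify it on
# the Lift Wing Wikitech page before relying on it.
BASE_URL = "https://api.wikimedia.org/service/lw/inference/v1/models"


def build_request(model: str, rev_id: int) -> urllib.request.Request:
    """Build (but do not send) a POST request asking `model` to score `rev_id`."""
    payload = json.dumps({"rev_id": rev_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{model}:predict",
        data=payload,
        headers={
            "Content-Type": "application/json",
            # Placeholder value; replace with something identifying your tool.
            "User-Agent": "liftwing-example/0.1 (placeholder contact)",
        },
        method="POST",
    )


def fetch_score(model: str, rev_id: int, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body (needs network access)."""
    with urllib.request.urlopen(build_request(model, rev_id), timeout=timeout) as resp:
        return json.load(resp)
```

Something like `fetch_score("enwiki-goodfaith", 12345)` would then ask for a
goodfaith score, assuming that model name and revision ID are valid. The
request is built separately from the network call so it can be inspected
without contacting the service.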
3. Will this change open up the possibility of deploying this Recent
Changes enhancement, or a newer version thereof, to more wikis and
languages?
It may be possible in the future to enhance the RC Filters even more. At
the moment we are concentrating on migrating the current ones to Lift Wing,
but after that we'll start figuring out what the next step is. Any
suggestion or advice is really welcome! (see
https://wikitech.wikimedia.org/wiki/ORES#Machine_Learning_contacts)
If you think that my questions show a wrong understanding of something,
please let me know. As I said in the beginning, it's quite possible :)
Thanks a lot for the questions, I hope I answered your doubts, feel free
to follow up if anything is missing!
Luca (on behalf of the ML team)