Hi!
I'll leave the comments related to model architecture and behavior to
others (more expert than me); I'd like to comment on the
process/infrastructure parts :)
On Sat, Sep 23, 2023 at 9:03 PM Strainu <strainu10(a)gmail.com> wrote:
Hi folks,
So glad to see the old and new ML teams have an open discussion about this
subject.
I understand that the team might prefer to have several tickets for
different issues, but the discussion about the general approach to the
different models is of interest to many people and is more easily digested
on email. I would suggest continuing to discuss the merits of the current
strategy (and not necessarily of one model or another) on email.
I proposed Phabricator tasks because I think that they better target
different broad subjects, and they make it easier to involve specific
teams/people and to define the goal of the conversations. In this big email
thread we
started outlining the migration/deprecation of ORES in favor of Lift Wing,
and now we are talking about model architectures and strategies to use for
various use cases in the future. I really like the conversation, but if we
wanted to be strict, a new email thread (with a different subject) should be
created instead of mixing multiple subjects. People interested in the Lift
Wing migration wouldn't be able to add comments, or if they did it would
become difficult to follow all the discussions.
As stated before, I'll clarify the "deprecation" term mentioned on Wikitech
for the various revscoring-based models, but it is not something that is
related to the Lift Wing migration (since all models present on ORES are
also on Lift Wing). It is a long-term and wider project that will happen
over the upcoming months/years, and that requires a broader discussion.
This is why I propose to discuss models on Phabricator, rather than
Wikitech-l :)
In the long run, I believe a single model good enough can be developed
for revert bots. However, it would be great if there were some clear
quality criteria that the community can verify, and if the old models were
maintained for a wiki until we are sure the new model passes those criteria
on that wiki.
Definitely. I just want to make it clear that the ML team has no intention
of forcing any choice on the community; we are just trying to optimize our
infrastructure to serve a wide variety of models, and in the process we have
to choose the best strategy to follow. On Lift Wing we require that every
new model has a model card that explains how it works, how it was trained,
its best use cases, etc. For example, these are the API Portal's pages for
the two Revert Risk models (they contain the links to the model cards):
https://api.wikimedia.org/wiki/Lift_Wing_API/Reference/Get_reverted_risk_la…
https://api.wikimedia.org/wiki/Lift_Wing_API/Reference/Get_reverted_risk_mu…
A change in hosting should not be the guiding force in any team's roadmap,
but the needs of its users.
I hope that we (as ML) didn't describe our intentions in the wrong way,
since our aim is absolutely not to impose anything, but to improve our
infrastructure to better serve users in the future (WMF internal use cases
and the community). ORES served us well over the years: it was a pioneering
project on a topic, ML, that at the time was only discussed in research
papers and in a few futuristic libraries. Big players were already working
on it internally, but there was no clear guidance or standards. Over the
years practices like MLOps took shape, and nowadays they are the de facto
standard for operating ML systems. We are trying to follow those best
practices, because we are convinced that they will improve and ease the
process of building and publishing a model at the WMF.
If you are curious, the ML team worked a lot on documentation, see
https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing. We tried as
best we could to make the transition smooth and to highlight new
features and improvements for every user.
To summarize: we, as the ML team, have created Lift Wing to serve the
community and our internal use cases, and we wouldn't remove support or
dramatically change how the community operates without first offering a
gradual migration path and proposing new solutions. During the migration to
Lift Wing we asked folks to test the Revert Risk models instead of the
goodfaith/damaging ones, and the solution seems to have suited a lot of use
cases. Maybe in the future we'll have a mixture of specialized models for
certain wikis and more "multi-purpose" ones, but finding the right solution
will surely involve community feedback and several tries.
Thanks for the feedback!
Luca