Great, thank you!
On Fri, Aug 4, 2023 at 11:49, Luca Toscano < ltoscano@wikimedia.org> wrote:
Hi Amir!
Answering inline:
On Thu, Aug 3, 2023 at 10:11 PM Amir E. Aharoni < amir.aharoni@mail.huji.ac.il> wrote:
The email says that "All ML models currently accessible on ORES are also currently accessible on Lift Wing", and if I understand correctly, this means that this feature in Recent Changes will keep working. Do I understand correctly? :)
Definitely yes, we are working on migrating the ORES extension to Lift Wing, without any change required for users. The tracking task is https://phabricator.wikimedia.org/T319170. At the moment all wikis with the ORES extension enabled, except fi/en/wikidata, are already using models from Lift Wing.
In addition, I have some followup questions:
- The MediaWiki extension that implements the frontend in Recent Changes
is itself named "ORES". It's an internal name that wiki editors rarely see unless they go to Special:Version or to translatewiki. Nevertheless, as time goes by, seeing the old name may start to feel odd. So what's the plan for it? Will this extension remain as is? Will it be renamed? Will it be replaced with a new frontend extension in the foreseeable future?
This is a good question and we don't have a definitive answer at the moment. Our understanding is that renaming extensions in MediaWiki is a long and complicated process, so we'll likely not be able to rename it in the foreseeable future. We would definitely like to add more models to RC Filters, for example Revert Risk (for the curious, see https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Language-ag...), but we are not sure yet whether it is worth creating a new extension or just expanding the ORES one. We'll get back to this list as soon as we have a better plan :)
- Back when ORES was originally developed and deployed around 2017,
several wiki editors' communities participated in the development by adapting the product to the needs of their wikis and languages: they translated the ORES extension's user interface and, more importantly, labelled a sample of several thousand diffs from their wiki using the Wikilabels tool. The communities that went through that whole process were, more or less, the communities to which this Recent Changes enhancement was deployed. Will anything like that have to be done again along with the move away from ORES?
The first goal of Lift Wing is to provide a more modern and easy-to-use infrastructure to host models at the WMF, for internal teams and for the community. The focus of the Machine Learning team is to provide infrastructure to run models on, so other teams and the community will be able to propose what to host and we'll vet what is possible and what is not (following strict criteria like security of data and PII, model architecture feasibility, etc.). Once a model is deployed on Lift Wing, there will be a team or a community group owning it, that is, responsible for its development in terms of features etc. (more info in https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing#Hosting_a_mode... ). To summarize:
- All the work done so far with ORES models will be preserved: it is
already available on Lift Wing and anybody can use it. We hope that it is now easier to play with model servers and improve them (for WMF and the community), but we are open to any suggestion and feedback about how to improve it. For the curious, more details are in the Lift Wing Wikitech page ( https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing).
- The future work will be split into two main areas (as I see it):
** The ML team will keep improving the infrastructure, documentation, performance, etc. of Lift Wing, to provide better tools and data access for any new idea related to models and their usage. We'll maintain the infrastructure (monitoring, alarms, etc.), so day-to-day operations will not fall on the model owners (WMF and community), who will be able to concentrate on the models and their future steps.
** Other WMF teams like Research will propose and work on new models that the community needs, but we'll also focus on improving what is currently in use. For example, most of the ORES traffic is for the goodfaith and damaging models, which have worked very well over the years but rely on old training data and architectures. The Revert Risk models (for example, https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Language-ag...) are an attempt to improve the reliability and performance of the aforementioned models, using a single score instead of multiple ones.
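For anyone who wants to try the models mentioned above, here is a minimal sketch of how a score request to Lift Wing might be built. It is an illustration, not the ML team's official client: the base URL, the `:predict` path suffix, the `rev_id` payload field and the `enwiki-goodfaith` model name are assumptions taken from the public Lift Wing documentation and should be checked against the Wikitech page; the helper names and the User-Agent string are hypothetical.

```python
import json
import urllib.request

# Assumption: public API gateway path for Lift Wing models.
# Verify against https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing
LIFT_WING_BASE = "https://api.wikimedia.org/service/lw/inference/v1/models"


def build_predict_request(model_name, rev_id,
                          user_agent="example-script/0.1 (you@example.org)"):
    """Build (but do not send) a POST request asking a model to score a revision.

    model_name: assumed naming scheme, e.g. "enwiki-goodfaith".
    rev_id: the revision ID to score.
    user_agent: hypothetical placeholder; replace with your own contact info.
    """
    url = f"{LIFT_WING_BASE}/{model_name}:predict"
    body = json.dumps({"rev_id": rev_id}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "User-Agent": user_agent,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def fetch_score(model_name, rev_id):
    """Send the request and return the decoded JSON response (needs network)."""
    request = build_predict_request(model_name, rev_id)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling `fetch_score("enwiki-goodfaith", rev_id)` with a real revision ID sends the request. Building the request in a separate function makes it easy to inspect the URL and payload before any network traffic happens.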
- Will this change open up the possibility of deploying this Recent
Changes enhancement, or a newer version thereof, to more wikis and languages?
It may be possible to enhance the RC Filters further in the future. At the moment we are concentrating on migrating the current ones to Lift Wing, but after that we'll start figuring out what the next step is. Any suggestion or advice is really welcome! (see https://wikitech.wikimedia.org/wiki/ORES#Machine_Learning_contacts)
If you think that my questions show a wrong understanding of something,
please let me know; as I said in the beginning, it's quite possible :)
Thanks a lot for the questions! I hope I addressed your doubts; feel free to follow up if anything is missing.
Luca (on behalf of the ML team)
_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-leave@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/