I'm glad that work on detecting and addressing harassment is moving forward.
At the same time, I'd appreciate a more precise understanding of how WMF is defining the word "harassment". There are legal definitions and dictionary definitions, but I don't think that there is One Definition to Rule Them All. I'm hoping that WMF will be careful to distinguish debate and the freedom to express opinions from harassment; we may disagree with minority or fringe views (even views that are offensive to some), but that doesn't necessarily mean that we should use policy and admin tools, rather than persuasion and other tools (such as content policies about verifiability and notability), to address them (and in some cases Wikipedia may not be a good place for these discussions at all). Other distinctions worth preserving include (1) the distinction between a personal attack and harassment (https://blog.wikimedia.org/2017/02/07/scaling-understanding-of-harassment/ appears to conflate the two, while English Wikipedia policy distinguishes between them), and (2) the distinction between a personal attack and an evidence-based critique.
Also note that definitions of what constitutes an attack may vary between languages and cultures; an expression that sounds insulting in one place, culture, or language may mean something very different, or relatively benign, in another. I had such an experience myself: I made a statement to someone that, from my perspective, was a statement of fact, and the other party took it as an insult. I don't apologize for what I said, since from my perspective it was valid, and the other party has not apologized for their reaction; but the point is that defining what constitutes a personal attack or harassment can be a very subjective business, and I'm not sure to what extent I would trust an AI to make that evaluation across a wide range of contexts. I get the impression that WMF intends to flag potentially problematic edits for admins to review, which I think could be a good thing, but I hope that great care is being invested in how the AI is being trained to recognize personal attacks and harassment, and I wouldn't necessarily want admins to be encouraged to substitute the opinion of an AI for their own.
I understand the desire to tone down some of the more heated discourse around Wikipedia for the sake of improving our user population statistics, and at the same time I'm hoping that we can continue to have very strong support for freedom of expression and differences of opinion. This is a difficult balancing act. I think that moving the needle a bit in the direction of more civility would be a good thing. At the same time, my impression is that enough edits are blatant personal attacks that we don't need to move the needle a lot; we could instead focus on more rapidly and thoroughly addressing cases where there is ample evidence that people's intentions were malicious.