Just a quick thought that I shared in IRC earlier.
AI isn't magical. It's pretty cool, but you're not going to have a
conversation with ORES
<https://meta.wikimedia.org/wiki/Objective_Revision_Evaluation_Service>.
It's not false that we are closer to strong "conversational" AI than ever
before. Still, in practical terms, we're pretty far away from not needing
to program anymore. I find that articles like this are more fantastical
than informative. I guess it is interesting to think about where we'll be
when we can have an abstract conversation with a computer system rather
than the rigid specifics of programming, but I'm with Brian -- this seems
to be a cycle. Though I'd say it's the media that booms and busts; the
research carries on relatively consistently, since AI researchers are
usually less interested in the hype.
In the ORES project, we're using the most simplistic "AIs" available --
classifiers. Still, these dumb AIs can help us do amazing things
(e.g. review all of RecentChanges 50x faster, or augment article
histories with information about the *type of change* made). IMO, it's
these things that dumb, non-conversational AIs can do that are very
powerful and a little scary
<http://www.nytimes.com/2016/05/19/opinion/the-real-bias-built-in-at-faceboo…>.
We're hardly taking advantage of that at all. I think that's where the
next big revolution with AI is taking place right now. It's going to
change a lot of things and infect many aspects of our life (and in many
ways it already has).
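To make that concrete, here's roughly what reviewing RecentChanges with
ORES looks like in Python (a sketch, not a reference client -- check
https://ores.wmflabs.org/v2/ for the actual response layout):

    # Sketch: pull a few revision IDs from English Wikipedia's
    # RecentChanges feed and ask ORES whether each edit looks damaging.
    import requests

    rc = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "list": "recentchanges",
            "rcprop": "ids",
            "rclimit": 5,
            "format": "json",
        },
    ).json()

    for change in rc["query"]["recentchanges"]:
        rev_id = change["revid"]
        # Print the raw score JSON rather than assuming its exact shape.
        score = requests.get(
            "https://ores.wmflabs.org/v2/scores/enwiki/damaging/%d" % rev_id
        ).json()
        print(rev_id, score)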
-Aaron
On Fri, May 20, 2016 at 2:43 PM, Purodha Blissenbach
<purodha(a)blissenbach.org> wrote:
> I see only an ad to support Wired.
> Purodha
>
>
> On 20.05.2016 20:11, Pine W wrote:
>
>> Seems like a good summary: http://www.wired.com/2016/05/the-end-of-code/
>>
>> Comments welcome, especially from Wikimedia AI experts who are working on
>> ORES.
>>
>> Pine
Hey,
Here is the weekly update for the Revision Scoring project for the week of
May 9th through May 15th.
*New developments:*
- We now have a dedicated help page on how to request support for new
languages. [1]
- We deployed a new version of revscoring and ORES. The biggest
improvement is speed; gains vary between wikis, but English Wikipedia
scoring is about 20% faster. [2]
- We are pre-generating lists of bad words for different languages. [3]
- Shinken and Icinga now report outages and recoveries in #wikimedia-ai,
our main work channel. [4]
*Maintenance and robustness:*
- Soon, your unlabeled edits in Wikilabels will be made available to
other labelers after 24 hours. [5]
- We improved logging of scoring errors in ORES. [6]
1. https://phabricator.wikimedia.org/T135179
2. https://phabricator.wikimedia.org/T135381
3. https://phabricator.wikimedia.org/T134629
4. https://phabricator.wikimedia.org/T134726
5. https://phabricator.wikimedia.org/T134619
6. https://phabricator.wikimedia.org/T135399
Hello,
Today, the Wikilabels database will get a kernel update, which means the
database will be down for about three minutes. During that window you can
still access the Wikilabels server [1] or any of its test servers, such as
labels-staging or labels-experiment, but you will be unable to make
changes in the database, get stats, or do anything else that requires
database access.
Thank you for your patience.
[1]: labels.wmflabs.org
Best
Labs had a DNS blip that caused ORES to be down for a few minutes (between
14:59 and 15:04 UTC) today. Everything seems to be back to normal now.
From #wikimedia-labs <irc://irc.freenode.net/wikimedia-labs>:
[10:32:25] <YuviPanda> halfak: temp dns blip
[10:32:36] <halfak> Gotcha. Thanks YuviPanda
[10:32:57] <halfak> was it big enough to warrant a write-up?
[10:33:13] <halfak> If not, I'll just post "temp DNS blib" to my ORES
users and call it good.
[10:33:39] <YuviPanda> halfak: probably not, since we're doing a bunch
of DNS stuff in the next few days to shore up DNS
[10:34:20] <halfak> kk
-Aaron
Hello,
TLDR: The vandalism detection model for Wikidata just got much more
accurate.
Longer version:
ORES is designed to handle different types of classification. For example,
one classification type under development is "wikiclass", which determines
the type of an edit: whether it adds content, fixes a mistake, etc.
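(If you want to see which classification types are live for a given wiki,
you can ask the API directly; a minimal sketch that just prints the raw
response, since the exact layout is documented at the API root:)

    # Sketch: list the models ORES currently serves for Wikidata.
    import requests

    info = requests.get("https://ores.wmflabs.org/v2/scores/wikidatawiki/").json()
    print(info)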
The most mature classification in ORES is edit quality: whether an edit is
vandalism or not. We usually have three models. The first is the
"reverted" model, whose training data is obtained automatically: we sample
around 20K edits (for Wikidata it was different) and consider an edit
vandalism if it was reverted within a certain time period after being made
(7 days for Wikidata).
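Roughly, that automatic labeling rule looks like this (the function and
its arguments are hypothetical, just to spell out the logic):

    from datetime import timedelta

    REVERT_WINDOW = timedelta(days=7)  # the window we used for Wikidata

    def auto_label(saved_at, reverted_at):
        """Treat an edit as vandalism for the "reverted" model's training
        data if it was reverted within the window after being saved.
        reverted_at is None for edits that were never reverted."""
        if reverted_at is None:
            return False
        return reverted_at - saved_at <= REVERT_WINDOW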
The "damaging" and "goodfaith" models, on the other hand, are more
accurate because they are built on human judgement: we sample about 20K
edits, pre-label the edits made by trusted users such as admins and bots
as not harmful to Wikidata/Wikipedia, and then ask users to label the rest
(for Wikidata, that was around 4K edits). Since most edits in Wikidata are
made by bots and trusted users, we altered this method a bit for Wikidata,
but the overall process is the same. Because the labels come from human
judgement, these models are more accurate and more useful for damage
detection. The ORES extension uses the "damaging" model and not the
"reverted" model, so having the "damaging" model online is a requirement
for the extension deployment.
People label each edit on two questions: whether it is damaging to
Wikidata, and whether it was made with good intentions. So we have three
cases: 1. an edit that is harmful to Wikidata but made with good
intentions (an honest/newbie mistake); 2. an edit that is harmful and made
with bad intentions (vandalism); 3. an edit made with good intentions and
productive (a "good" edit).
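In code form, the two answers combine like this (a hypothetical helper;
the remaining combination, not damaging but bad faith, is rare and lumped
in with good edits here):

    def classify(damaging, goodfaith):
        """Map the two human labels onto the three cases above."""
        if damaging and goodfaith:
            return "honest/newbie mistake"  # harmful, but well-intentioned
        if damaging:
            return "vandalism"              # harmful and ill-intentioned
        return "good edit"                  # not harmful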
The biggest reason to distinguish between honest mistakes and vandalism is
that anti-vandalism bots have been shown to reduce new-user retention in
wikis [1]. Future anti-vandalism bots should therefore not revert
good-faith mistakes, but report them for human review instead.
One of the good things about the Wikidata damage-detection labeling
process is that so many people were involved (we had 38 labelers for
Wikidata [2]). Another good thing is that the model's fitness is very high
in AI terms [3]. But since the numbers of damaging and non-damaging edits
are not the same, the scores it gives to edits are not intuitive. Let me
give you an example: in our damaging model, if an edit is scored less than
80% it's probably not vandalism. In fact, in a very large sample of human
edits we had for the reverted model, we couldn't find a bad edit with a
score lower than 93%; i.e. if an edit is scored 92% in the reverted model,
you can be pretty sure it's not vandalism. Please reach out to us if you
have any questions on using these scores, or any questions in general ;)
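To make those thresholds concrete, a consumer of the scores might do
something like this (a sketch using the rough cut-offs from the paragraph
above, not official guidance):

    def probably_not_vandalism(model, true_probability):
        """Interpret a model's "true" probability (0.0 to 1.0) using the
        rough thresholds described above."""
        if model == "damaging":
            return true_probability < 0.80  # probably not vandalism
        if model == "reverted":
            return true_probability < 0.93  # pretty sure not vandalism
        raise ValueError("unknown model: %s" % model)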
In terms of needed changes, the ScoredRevisions gadget is set to
automatically prefer the damaging model, and I just changed my bot in the
#wikidata-vandalism channel to use "damaging" instead of "reverted".
If you want to use these models, check out our docs. [4]
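For example, scoring a single Wikidata revision with the damaging model
looks roughly like this (the revision ID is made up, and the raw response
is printed as-is; see [4] for the exact layout):

    import requests

    rev_id = 123456789  # hypothetical revision ID
    resp = requests.get(
        "https://ores.wmflabs.org/v2/scores/wikidatawiki/damaging/%d" % rev_id
    ).json()
    print(resp)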
Sincerely,
Revision scoring team [5]
[1]: Halfaker, A.; Geiger, R. S.; Morgan, J. T.; Riedl, J. (28 December
2012). "The Rise and Decline of an Open Collaboration System: How
Wikipedia's Reaction to Popularity Is Causing Its Decline". *American
Behavioral Scientist* *57* (5): 664–688.
[2]: https://labels.wmflabs.org/campaigns/wikidatawiki/?campaigns=stats
[3]: https://ores.wmflabs.org/scores/wikidatawiki/?model_info
[4]: https://ores.wmflabs.org/v2/
[5]:
https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service#Team