Hello All,
I have created a new labelling campaign, "Newcomer Session Quality", which
observes the *sessions* (temporally related edits) of users on
their *registration day* [1]. The idea is to detect potentially productive
newcomers who may be struggling with onboarding and might otherwise get bitten.
I was wondering if there was a standard way to advertise for volunteers for
a new campaign on-wiki? Would it be OK to leave talk-page invitations to
users who have signed up for updates on other campaigns [2]? By the way,
right now the campaign is only on enwiki for testing, but I have 6 other
languages ready to go if enwiki labelling goes well.
[1] https://en.wikipedia.org/wiki/Wikipedia:Labels/Newcomer_session_quality
[2] https://en.wikipedia.org/wiki/Wikipedia:Labels/Edit_quality
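For anyone curious what a "session" means operationally here, a minimal sketch (my own illustration, not the campaign's actual definition or parameters) that groups a user's edit timestamps into sessions using an inter-edit time cutoff:

```python
from datetime import datetime, timedelta

def group_into_sessions(timestamps, cutoff=timedelta(hours=1)):
    """Group edit timestamps into sessions: a new session starts
    whenever the gap since the previous edit exceeds `cutoff`.
    (The one-hour cutoff is an illustrative assumption only.)"""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= cutoff:
            sessions[-1].append(ts)  # close enough: same session
        else:
            sessions.append([ts])    # gap too large: start a new session
    return sessions
```

With this cutoff, three edits at 10:00, 10:20, and 14:00 on one day would form two sessions.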
Make a great day,
Max Klein ‽ http://notconfusing.com/
The 2nd International Conference on Applications of Intelligent Systems,
APPIS 2019 <http://appis.webhosting.rug.nl/2019/>, will be held on 7-12
January 2019 in Las Palmas de Gran Canaria, Spain.
APPIS 2019 is organized by the University of Groningen and the
University of Las Palmas de Gran Canaria, and includes a Winter School
on Machine Learning (WISMAL 2019)
<http://appis.webhosting.rug.nl/2019/tutorials-appis-2019/>.
APPIS 2019 welcomes submission of *abstracts* (1-2 pages) and *full
papers* (4-6 pages) related, but not limited, to the following topics:
* Machine learning and representation learning
* Images, videos and time-series analysis
* Statistical and structural pattern recognition
* Data visualization and dimensionality reduction
* Robotics
* Intelligent systems in health and medicine
* Cyber computing and security
* Bio-informatics
* Data mining
* Cognitive discovery
* Algorithms for embedded and real-time systems
* Semantic technologies
* Intelligent buildings
* Intelligent sensors and sensor networks
* Augmented reality
* Adaptive systems
* Fuzzy systems
* Human-machine interaction
* Natural language processing
* Situation awareness systems
* Recommender systems
Papers must be submitted electronically in PDF format through the APPIS
2019 conference website (*deadline: October 26th*) and must conform to
the ACM template and style file. Proceedings will be published in the
ACM ICPS.
Each accepted paper must be presented by one of the authors and
accompanied by at least one full registration fee payment (250 Euro), to
guarantee publication in the proceedings.
In order to be published in the conference proceedings, *abstract
submissions* need to be extended to full papers before the conference
(*deadline: January 5, 2019*).
Find more information in the submission page
<http://appis.webhosting.rug.nl/2019/paper-submission/>.
We look forward to meeting you in Las Palmas de Gran Canaria!
Best regards,
Nicolai Petkov
Nicola Strisciuglio
Carlos Travieso-Gonzalez
Hey folks,
I wanted to ask a question of people on this list. Up until now, I have
been rejecting postings from non-list members that were just blind Calls
for Papers (CFPs) for various journals and conferences. It seems to me
that we all get plenty of spam like this and that we don't need to also
receive it via our mailing lists.
Should I continue to block this content?
FWIW, I don't see any problem with our list members sharing CFPs for
conferences/journals that they think are important. That should be allowed
through. I've just been rejecting the drive-by non-member CFP postings.
-Aaron
There is a new paper out about "Using the Tsetlin Machine to Learn
Human-Interpretable Rules for High-Accuracy Text Categorization with
Medical Applications" [1], or in our context, "…High-Accuracy Text
Categorization of unsourced statements".
Their results on text categorization are quite promising. I've been
wondering why they get such good results, and I suspect it either has to do
with implicit regularization (a kind of dropout), or with some other
effects that become important when you start comparing really good results.
One is that learning binary weights uses less information entropy than
learning weights with higher quantization (more bits), so with limited
training data more of the available information entropy goes into learning
the actual rules. Another possibility is that the learning algorithm finds
better minima (actually maxima) than the other algorithms, i.e. the
algorithm finds stable solutions, that is, the real minimum. A third
possibility is that the learning is faster because it does not use
backpropagation (and is thus more stable and converges faster).
The generated rules are much easier to handle in the user's own browser:
instead of relying on a central server, the text categorization
(classification) can be done client-side, which makes the interaction more
responsive.
In my opinion this is a neural network. The generated rules can be
reformulated as disjunctive normal forms, and then it is more obvious:
there are binary weights, weight multiplication done with AND operators,
and summation done by OR operators.
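To make that DNF reading concrete, here is a toy sketch (my own illustration, not code from the paper): each learned clause is an AND over binary-feature literals, and the classifier fires if any clause matches (OR):

```python
def eval_dnf(clauses, features):
    """Evaluate a disjunctive normal form over binary features.
    Each clause is a list of (feature_index, required_value) literals.
    A clause fires (AND) only if every literal matches; the formula
    fires (OR) if any clause fires."""
    def clause_fires(clause):
        return all(features[i] == value for i, value in clause)
    return any(clause_fires(c) for c in clauses)
```

For example, the clause [(0, 1), (1, 0)] reads "feature 0 present AND feature 1 absent".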
There is more background in the paper "The Tsetlin Machine - A Game
Theoretic Bandit Driven Approach to Optimal Pattern Recognition with
Propositional Logic" [2].
[1] https://arxiv.org/abs/1809.04547
[2] https://arxiv.org/abs/1804.01508
John Erling Blad
/jeblad
The 2nd International Conference on Applications of Intelligent Systems,
*APPIS 2019* <http://appis.webhosting.rug.nl/2019/> will be held on
*7-12 January 2019* in *Las Palmas de Gran Canaria*, Spain.
APPIS 2019 is organized by the University of Groningen and the
University of Las Palmas de Gran Canaria, and includes a *Winter School
on Machine Learning (WISMAL 2019)*.
APPIS 2019 welcomes contributions related (but not limited) to the
following topics:
Images, videos and time-series analysis
Machine learning and representation learning
Statistical and structural pattern recognition
Data visualization and dimensionality reduction
Robotics
Intelligent systems in health and medicine
Cyber computing and security
Bio-informatics
Data mining
Cognitive discovery
Algorithms for embedded and real-time systems
Semantic technologies
Intelligent buildings
Intelligent sensors and sensor networks
Augmented reality
Adaptive systems
Fuzzy systems
Human-machine interaction
Natural language processing
Situation awareness systems
Recommender systems
=============================================================
Conference proceedings will be published by *ACM International
Conference Proceedings Series.*
The registration fee includes conference materials, tutorials, the
conference dinner, a welcome reception, another social event (a guided
tour of the old town of Las Palmas or a visit to the only coffee
plantation in the EU) and daily coffee breaks. The prices are:
Early registration: *250 Euro*
Late registration: *325 Euro*
The cost for each additional paper is *100 Euro*.
=============================================================
*Winter School on Machine Learning - WISMAL 2019*
APPIS includes a short winter school consisting of several tutorials that
present different techniques of machine learning. Please find more
information at the WISMAL 2019 page
<http://appis.webhosting.rug.nl/2019/tutorials-appis-2019/>.
Participation in the winter school is free of charge for registered
APPIS 2019 participants. The number of participants in the winter
school is limited to 100, and early registration is encouraged.
=============================================================
Paper submission – *Oct 26, 2018*
Paper acceptance notification – *Dec 1, 2018*
Early registration – *Dec 8, 2018*
Camera ready – *Dec 15, 2018*
Conference – *7-9 Jan 2019*
Winter school on Machine Learning – *10-12 Jan 2019*
Please find more information on the conference website
<http://appis.webhosting.rug.nl/2019/>.
The conference co-chairs,
Nicolai Petkov
Nicola Strisciuglio
Carlos Travieso-Gonzalez
I don't have enough knowledge about neural nets to evaluate the email
below, but I'm forwarding it in case it's of interest to others on two
relevant lists.
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
---------- Forwarded message ---------
From: John Erling Blad <jeblad(a)gmail.com>
Date: Wed, Sep 26, 2018 at 6:23 PM
Subject: [Wikimedia-l] Captioning Wikidata items?
To: Wikimedia Mailing List <wikimedia-l(a)lists.wikimedia.org>
Just a weird idea.
It is very interesting how neural nets can caption images. It is done by
building a state-model of the image, which is fed into a kind of neural
net (an RNN), and that net (a black box) transforms the state-model into
running text. In some cases the neural net is steered; that is called
attention control, and it creates relationships between parts of the
image.
Swap out the image with an item, and a virtually identical setup can
generate captions for items. The caption for an item is what's called the
description in Wikidata; it is also the first sentence of the lead-in in
Wikipedia articles. It is possible to steer the attention, that is, to
tell the network which items should be used, and thus the later sentences
will be meaningful.
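A rough structural sketch of the pipeline being described (everything here is a stand-in of my own: the "encoder" hashes statements instead of running rdf2vec, the "decoder" uses random untrained weights, and the tiny vocabulary is invented purely for shape):

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["<eos>", "a", "is", "norwegian", "painter"]  # toy vocabulary

# Stand-in rdf2vec-style encoder: turn item statements into a state vector.
def encode_item(statements, dim=8):
    state = np.zeros(dim)
    for _ in statements:
        state += rng.standard_normal(dim)  # placeholder for learned embeddings
    return state

# Stand-in decoder weights: state -> next-token logits (untrained).
W = rng.standard_normal((8, len(VOCAB)))

def decode(state, max_len=5):
    """Greedy decoding loop: emit tokens until <eos> or max_len."""
    tokens = []
    for _ in range(max_len):
        logits = state @ W
        tok = VOCAB[int(np.argmax(logits))]
        if tok == "<eos>":
            break
        tokens.append(tok)
        state = np.tanh(state + 1.0)  # placeholder recurrent state update
    return tokens
```

With trained components in place of these stubs, the output of `decode` would be the item description.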
What that means is that we could create meaningful stub entries for the
article placeholder, that is the "AboutTopic" special page. We can't
automate this for very small projects, but somewhere between small and mid
sized languages it will start to make sense.
To make this work we need some very special knowledge, which we probably
don't have, like how to turn an item into a state-model by using the
highly specialized rdf2vec algorithm (hello Copenhagen) and how to verify
the stateful language model (hello Helsinki and Tromsø).
I wonder if the only real problems are what the community wants, and what
the acceptable error limit is.
John Erling Blad
/jeblad
_______________________________________________
Wikimedia-l mailing list, guidelines at:
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l(a)lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
<mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>
Hi, we'll have to deprecate both usages due to how the software is
configured.
If migrating the clients turns out to be a pain for some reason, we can
negotiate the deprecation date, of course…
-Adam
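For client maintainers, the rename discussed in this thread is mostly a string substitution; a tiny hypothetical helper (not part of any ORES client library) that rewrites old-style score URLs:

```python
def migrate_ores_url(url):
    """Rewrite a deprecated wp10 ORES score URL to use the new
    'articlequality' model name; other URLs pass through unchanged."""
    trimmed = url.rstrip("/")
    if trimmed.endswith("/wp10"):
        return trimmed[: -len("wp10")] + "articlequality"
    return url
```

Generic score requests (no model in the path) are left alone, since the API decides which models to run for those.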
On Mon, Aug 27, 2018 at 1:57 PM Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:
> Thanks for the update (and the switch to a less confusing score name)! Just
> to clarify, when you say "we might pull the plug after four weeks after
> this announcement," does that refer to wp10 just in the generic scores
> request (e.g. https://ores.wikimedia.org/v3/scores/enwiki/855137823) or
> also in the specific scores request (e.g.
> https://ores.wikimedia.org/v3/scores/enwiki/855137823/wp10)? In other
> words, will https://ores.wikimedia.org/v3/scores/enwiki/855137823/wp10
> still work a month from now, or does that also need to be migrated to
> "articlequality"?
>
> On Mon, Aug 27, 2018 at 1:50 PM Amir Sarabadani <
> amir.sarabadani(a)wikimedia.de> wrote:
>
> > Hello,
> > If you don't use ORES API, please ignore this email.
> >
> > If you are using wp10 models in your tool, gadget, or research, please
> note
> > that these models are now renamed to "articlequality" to better reflect
> > what they are (in comparison to "editquality"). articlequality models are
> > deployed on English, Russian, French, Persian, Turkish, and Basque
> > Wikipedia languages.
> >
> > So URLs like this:
> > https://ores.wikimedia.org/v3/scores/enwiki/855137823/wp10
> > Need to be changed to something like this:
> > https://ores.wikimedia.org/v3/scores/enwiki/855137823/articlequality
> >
> > Same goes with parsing the results.
> >
> > The "wp10" name still exists as an alias, and if you don't specify
> > models (meaning you want scores for all models) we respond with wp10 and
> > articlequality data duplicated [1], but we might pull the plug four
> > weeks after this announcement.
> >
> > [1]: For example see:
> > https://ores.wikimedia.org/v3/scores/enwiki/855137823
> >
> > For more information see: https://phabricator.wikimedia.org/T196240
> >
> > Best
> > --
> > Amir Sarabadani
> > Software Engineer
> >
> > Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
> > Tel. (030) 219 158 26-0
> > http://wikimedia.de
> >
> > Stellen Sie sich eine Welt vor, in der jeder Mensch an der Menge allen
> > Wissens frei teilhaben kann. Helfen Sie uns dabei!
> > http://spenden.wikimedia.de/
> >
> > Wikimedia Deutschland – Gesellschaft zur Förderung Freien Wissens e. V.
> > Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter
> > der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
> > Körperschaften I Berlin, Steuernummer 27/029/42207.
> > _______________________________________________
> > Wikitech-l mailing list
> > Wikitech-l(a)lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Good morning/afternoon/evening everyone,
If you are an editor of the French, Italian or English Wikipedia, and you
are curious about how to contribute to technologies for improving
verifiability of Wikipedia articles, please read on—we need your help!
In the context of the Knowledge integrity
<https://meta.wikimedia.org/wiki/Knowledge_Integrity> program, we (the
WMF Research
team <http://research.wikimedia.org>) are studying ways to flag unsourced
statements needing a citation
<https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statem…>
using machine learning, with the aim of identifying areas where adding high
quality citations is particularly urgent or important. Following the
success of the first labeling campaign
<https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statem…>,
we now need to collect additional, high-quality labeled data regarding
why sentences
need citations.
You are invited to participate in a second annotation task
<https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statem…>.
We used your input from the last experiment to generate a taxonomy of
reasons
<https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statem…>
why editors add citations. With this taxonomy now embedded in the
interface, the annotation experience will be much faster and more fun.
If you are interested in participating, please go to
http://labels.wmflabs.org/ui/enwiki/ (replace enwiki with itwiki or frwiki
if you speak Italian or French), log in, and from *'Labeling Unsourced
Statements II'* request one (or more) worksets. For each task in a
workset, the tool will show you an unsourced sentence in an article and ask
you to annotate it. You can then label the sentence as needing an inline
citation or not, and specify a reason for your choice from a drop-down
menu. If you can't respond, please select 'skip'. You can also sign up by
(optionally) adding your name on this page
<https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statem…>
to receive updates about future campaigns and results from this research.
If you have any question/comment on this project, please let us know by
contacting miriam(a)wikimedia.org or leaving a message on the talk page of
the project
<https://meta.wikimedia.org/wiki/Research_talk:Identification_of_Unsourced_S…>.
Thank you for your time!
Miriam, Jonathan, and Dario
--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
Forwarding.
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
---------- Forwarded message ---------
From: Sarah R <srodlund(a)wikimedia.org>
Date: Fri, Aug 10, 2018 at 10:46 PM
Subject: [Analytics] Wikimedia Research Showcase August 13 2018 at 11:30 AM
(PDT) 18:30 UTC
To: <wikimedia-l(a)lists.wikimedia.org>, <wiki-research-l(a)lists.wikimedia.org>,
<analytics(a)lists.wikimedia.org>
Hi Everyone,
The next Wikimedia Research Showcase will be live-streamed Monday,
August 13, 2018, at 11:30 AM (PDT) / 18:30 UTC.
YouTube stream: https://www.youtube.com/watch?v=OGPMS4YGDMk
As usual, you can join the conversation on IRC at #wikimedia-research. And,
you can watch our past research showcases here.
<https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#Upcoming_Showcase>
Hope to see you there!
This month's presentation is:
*Quicksilver: Training an ML system to generate draft Wikipedia articles
and Wikidata entries simultaneously*
John Bohannon and Vedant Dharnidharka, Primer
The automatic generation and updating of Wikipedia articles is usually
approached as a multi-document summarization task: Given a set of source
documents containing information about an entity, summarize the entity.
Purely sequence-to-sequence neural models can pull that off, but getting
enough data to train them is a challenge. Wikipedia articles and their
reference documents can be used for training, as was recently done
<https://arxiv.org/abs/1801.10198> by a team at Google AI. But how do you
find new source documents for new entities? And besides having humans read
all of the source documents, how do you fact-check the output? What is
needed is a self-updating knowledge base that learns jointly with a
summarization model, keeping track of data provenance. Lucky for us, the
world’s most comprehensive public encyclopedia is tightly coupled with
Wikidata, the world's most comprehensive public knowledge base. We have
built a system called Quicksilver that uses them both.
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics