[Wiki-research-l] Research on automatically created articles

9 Aug 2016

Hi all,

I found a paper at IJCAI 2016, which left me quite curious:
https://siddbanpsu.github.io/publications/ijcai16-banerjee.pdf

In short, they find red links, classify them, find the closest similar
articles, use the section titles from these articles to decide on sections,
search for content for the sections, paraphrase it, and write complete
Wikipedia articles.

Then they uploaded the articles to Wikipedia, and of the 50 uploaded
articles, only 3 got deleted. The rest stayed. I was rather excited when I
heard that - were the articles really that good?

Then I took a look at the articles and... well, judge for yourself. The
paper mentions only three of the 47 surviving articles:

https://en.wikipedia.org/wiki/Dick_Barbour

https://en.wikipedia.org/wiki/Atripliceae (here is the last version as
created by the bot before significant human clean-up:
https://en.wikipedia.org/w/index.php?title=Atripliceae&oldid=697456858 )

https://en.wikipedia.org/wiki/Talonid

I have contacted the first author, and he promised to send me a list of
all articles as soon as he can get it, which will be in a few weeks because
he is away from his university computer right now. He was able to produce
one more article, though:

https://en.wikipedia.org/wiki/Sonia_Bianchetti_Garbato

(Also, see history for the extent of human clean-up)

I am not writing to talk badly about the authors or about the reviewing
practice at IJCAI, or about the state of research in that area. Also, I
really do not want to discourage research in this area.

I have a few questions, though:

1) The fact that so many of these articles have survived for half a year
indicates that there are some problems with our review processes. Does
someone want to investigate why these articles survived in the given
state?

2) As far as I know, we don't have rules for this kind of experiment, but
maybe we should. In particular, I feel that BLPs should not be created by
an experimental approach like this one. Should we set up rules for this
kind of experiment?

3) Wikipedia contributors are participating in these experiments without
consent. I find that worrisome, and would like to hear what others think.

I have invited the first author to join this list.

I understand the motivation: had it been disclosed from the beginning that
these articles were created by bots, they would have been scrutinized
differently than articles written by humans. Therefore they remained quiet
about the fact (but are willing to reveal it now that the experiment is
over - they also explicitly have no intention of expanding the scope of
the experiment at this point).

Cheers,
Denny
