Hi developers,
This is K. Kaushik Reddy from India. I'm keen on data science and good with
Python. I would like to participate in GSoC'19 with your organisation. I
have gone through the ideas page for GSoC'19, but I am looking for a few
data science projects with Python as the code base this year.
In future, I'm planning to make research and development my career. I have
always wanted to know how you plan and execute a project in practice.
Secondly, could you please point me to relevant bugs that need to be
fixed, so that I can get involved in GSoC'19 with you?
Hoping to learn more and have fun with you all.
Best,
K. Kaushik Reddy.
IC2S2 2019 – 5th INTERNATIONAL CONFERENCE ON COMPUTATIONAL SOCIAL SCIENCE
July 17-20, 2019
University of Amsterdam, The Netherlands
Web: https://2019.ic2s2.org/
Twitter: @IC2S2 <https://twitter.com/IC2S2> / #IC2S2
Deadline for abstracts: 5 February 2019
This summer the fifth edition of IC2S2 is hosted by the University of
Amsterdam, The Netherlands. IC2S2 brings together researchers in
computational science, complexity, and social science, and provides a
platform for new work in the field of computational social science.
Following a day of tutorials on July 17, contributed abstracts are
presented orally in parallel thematic sessions or as posters at the
three-day conference, which takes place at the University of Amsterdam
in the Netherlands from July 18 to 20.
================================================
KEYNOTE SPEAKERS
================================================
Kenneth Benoit (London School of Economics and Political Science)
Jana Diesner (University of Illinois at Urbana Champaign)
Deen Freelon (University of North Carolina)
Mirta Galesic (Santa Fe Institute)
Cesar Hidalgo (MIT Media Lab)
Devra Moehler (Facebook Research)
Scott Page (University of Michigan)
Leto Peel (Université Catholique de Louvain)
Emma Spiro (University of Washington)
Lukas Vermeer (Booking.com)
Claudia Wagner (GESIS Leibniz Institute for the Social Sciences)
================================================
SPECIAL SESSION “5 YEARS OF IC2S2”
================================================
After an opening talk by Duncan Watts (Microsoft Research), several
renowned computational social scientists from the field will engage in a
panel discussion on the future of the field of computational social
science. Among confirmed panelists are:
Ulrik Brandes (ETH Zürich)
Sandra González-Bailón (University of Pennsylvania)
Helen Margetts (University of Oxford, The Alan Turing Institute)
Markus Strohmaier (RWTH Aachen University)
Duncan Watts (Microsoft Research)
===================================================
CALL FOR PAPERS (ABSTRACTS)
===================================================
IC2S2 solicits abstracts from researchers in the social sciences with a
clear component of computation, simulation, data analysis, or data
science. This includes, for example, sociology, psychology, communication
science, anthropology, media studies, political science, public health,
and economics. In addition, contributions from computer science, data
science, and computational science with real-world applications in the
social sciences or related fields are welcome. We emphatically welcome
abstracts that try to integrate both components. This is not limited to
empirical studies; more general theoretical contributions are also welcome.
Topics of interest include, but are not limited to, the following:
* Network analysis of social systems
* Large-scale social experiments
* Agent-based or other simulations of social phenomena
* Text analysis and natural language processing (NLP) of social phenomena
* Cultural patterns and dynamics
* Computational science studies (sociology of science)
* Social news curation and collaborative filtering
* Social media studies
* Theoretical discussions in computational social science
* Causal inference and computational methods for social science
* Ethics in computational social sciences
* Reproducibility in computational social science
* Large scale infrastructure in computational social science
* Novel digital data sources
* Computational analyses for addressing societal challenges
* Methods and analyses of observational social data
* Computational social science research in industry
Contributions to the conference should be submitted via EasyChair at:
https://easychair.org/conferences/?conf=ic2s2-2019
Please follow the extended abstract template guidelines for Word
(ic2s2-word-template.docx
<https://2019.ic2s2.org/wp-content/uploads/2018/12/ic2s2-word-template.docx>)
and LaTeX (ic2s2-latex-template.zip
<https://2019.ic2s2.org/wp-content/uploads/2018/12/ic2s2-latex-template.zip>)
for formatting instructions. Note that abstracts should be submitted as
a PDF file no larger than 20 MB. Submissions that exceed the 2-page limit
(including figures and references) will be automatically rejected.
The extended abstract should include a title and a list of 5 keywords,
but no authors’ names or affiliations. The abstract should outline the
main contribution, data and methods used, results, and the impact of the
work. Authors are encouraged to include one figure in their submission
(the figure counts towards the page limit).
Please do not include authors’ names and affiliations in the submitted
document, as peer review will be double blind. Each extended abstract
will be reviewed by multiple members of the Program Committee, composed
of experts in computational social science.
When submitting on EasyChair you will be asked to provide information
about the authors and their affiliations and to include a one-sentence
summary of the extended abstract (20-50 words). The summary will be used
for assigning reviewers. You can indicate a preference for an oral
presentation or a poster presentation, but your preference may not be
honored in the final decision.
Submissions will be non-archival, and the presented work can be already
published, in preparation for publication elsewhere, or ongoing
research. Submission implies willingness to present a talk or poster at
the conference.
The deadline for abstract submission is February 5, 2019.
====================================================
IMPORTANT DATES
====================================================
* February 5, 2019 – Regular abstract submission deadline
* March 6, 2019 – Acceptance notification
* April 30, 2019 – Early bird registration deadline
* June 30, 2019 – Registration deadline
* July 17-20, 2019 – Conference
=====================================================
SPONSORSHIP
=====================================================
Do you want to affiliate your organisation with IC2S2? Bespoke sponsorship
packages are available on request. Contact the local chairs Eelke
Heemskerk and Frank Takes at ic2s2(a)uva.nl
IC2S2 2019 is hosted by CORPNET <https://corpnet.uva.nl/>, University of
Amsterdam, and so far it is sponsored by Microsoft Research, Statistics
Netherlands, ODISSEI, University of Amsterdam, Leiden University,
Municipality of Amsterdam, Amsterdam Centre for Inequality Studies and
SAGE Publishing.
Please watch the video “IC2S2 comes to Amsterdam
<https://www.youtube.com/watch?v=TUWvAysyqPk&t=2s>”.
Hello fellow wiki investigators!
I have observed that, very often in wikis, users who are not in the bot
groups actually behave like bots. Since the MediaWiki API doesn't prevent
normal users from automating tasks, you might have a "normal" user who is
actually doing bot things. I would like to identify those users and
consider them as bots.
Is anyone aware of an already implemented model to classify whether a
user is a bot or not?
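To make the question concrete, here is a rough heuristic sketch, not an
existing model: flag accounts whose edits are numerous, closely spaced and
unusually regular in timing. The function name, the thresholds and the
input format (a list of ISO 8601 edit timestamps per user, such as could
be collected with the API's list=usercontribs) are all assumptions made
for illustration only.

```python
from datetime import datetime
from statistics import mean, pstdev


def looks_like_bot(timestamps, min_edits=50, max_mean_gap=30.0, max_cv=0.5):
    """Heuristic: many edits, short gaps, and very regular timing.

    timestamps: ISO 8601 strings such as "2019-01-25T12:00:00Z".
    All thresholds are illustrative assumptions, not validated values.
    """
    if len(timestamps) < min_edits:
        return False  # too little activity to judge
    times = sorted(
        datetime.strptime(t, "%Y-%m-%dT%H:%M:%SZ") for t in timestamps
    )
    # Seconds between consecutive edits.
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    avg = mean(gaps)
    if avg == 0:
        return True  # every edit falls in the same second
    # Coefficient of variation: low values mean machine-like regularity.
    cv = pstdev(gaps) / avg
    return avg <= max_mean_gap and cv <= max_cv
```

A real classifier would need labelled data and more features (edit
summaries, namespaces, session lengths); this only shows the kind of
signal I have in mind.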
Thanks and nice weekend!
--
Saludos,
Abel.
"Anonymous calling: The WikiScanner scandals and anonymity on the
Japanese Wikipedia" from yesterday's call for reviews was very good; I
like this quote:
> “If there is a user ID attached to a user, discussion tends to become a criticizing game. On the other hand, under an anonymous system, even though your opinion/information is criticized, you don’t know with whom to be upset. Also, with a user ID, those who participate in the site for a long time tend to have authority, and it becomes difficult for a user to disagree with them. Under a perfectly anonymous system, you can say ‘it’s boring’, if it is actually boring. All information is treated equally; only an accurate argument will work.” [Andrew Lih. The Wikipedia revolution: How a bunch of nobodies created the world’s greatest encyclopedia. New York: Hyperion, 2009, p. 145.]
Most of the research on the ways anonymity helps productivity went on
during the design of double-blind experiments, and I hope more of that
is done in the context of Wikipedia.
I am too busy this month to do any reviews, and will be disappointed
if someone else doesn't:
https://firstmonday.org/ojs/index.php/fm/article/view/9184/7694
Thanks!
Best regards,
Jim
Hi all,
As some of you know, we started a line of research back in 2016 to
understand Wikipedia readers better. We published the first taxonomy
of Wikipedia readers and we studied and characterized the reader types
in English Wikipedia [1]. During the past 1+ year, we focused on
learning about the potential differences of Wikipedia readers across
languages based on the taxonomy built in [1]. We've learned a lot, and
today we're sharing what we learnt with you.
Some pointers:
* Publication: https://arxiv.org/abs/1812.00474
* Data: https://figshare.com/articles/Why_the_World_Reads_Wikipedia/7579937/1
* (under continuous improvement) Research page on meta:
https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Reader_Be…
* Research showcase presentation:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#December_2018
* A series of presentations to WMF teams and community: Look for tasks
under https://phabricator.wikimedia.org/T201699 with title "Present
the results of WtWRW" for link to slides and more info when available.
* We will send out a blog post about it hopefully soon. A blog post
about the intermediate results is at
https://wikimediafoundation.org/2018/03/15/why-the-world-reads-wikipedia/
In a nutshell:
* We ran the taxonomy of Wikipedia readers in 14 languages, measured
the prevalence of Wikipedia use cases, and characterized Wikipedia
readers in these languages.
* While we observe similarities in the prevalence of the use cases and
in the way we can characterize readers, Wikipedia languages lend
themselves to different distributions of readership and
characteristics. In many cases, one-size-fits-all solutions may simply
not work for readers.
* Intrinsic learning remains the number one motivation for people to
come to Wikipedia in the majority of the languages, followed by media.
* In-depth reading and the reading of science-oriented topics are
strongly negatively correlated with the socio-economic status and
Human Development Index of the countries the readers in these languages
come from. Long articles that may seem just too long for the bulk of
our audience in the US, Japan, and the Netherlands are in high demand
in India, Bolivia, Argentina, Panamá, México, …
* ...
This research would not have been possible without the extensive
contributions of our formal collaborators: Florian Lemmerich (RWTH
Aachen University) and Bob West (EPFL). On the WMF end, I was fortunate
to work with Diego Saez on this project and, more recently, Isaac
Johnson, as well as all those in the Reading Web and Legal teams who
supported us throughout the process. I also want to underline the
amazing work of the volunteers in the languages in the study, who
helped us learn more about their languages. They helped with
communications within their communities and with the translation task,
which was not an easy one: they were asked to offer their time not only
to translate but also to meet with us in person, so that we could make
sure the intent of each question was translated the same way across the
languages. Usernames Strainu, Tgr, Amire80, Awossink, Antanana,
Lyzzy, Shangkuanlc, Whym, Kaganer, عباد_ديرانية, Satdeep_Gill, Racso,
Hasive: Thank you!
Next we are going to extend this study to include demographic
information. More information about it will come out in the next few
weeks. (And I will send out a separate email to wikimedia-l about this
topic and future research over the weekend. I need some time to
finalize the message and make it most useful for that audience. :)
Best,
Leila
[1] https://arxiv.org/abs/1702.05379
--
Leila Zia
Senior Research Scientist, Lead
Wikimedia Foundation
Hi everyone,
We’re preparing for the January 2019 research newsletter and looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201901 and add your name next to any paper you are interested in covering. Our target publication date is on January 31 UTC although actual publication might happen several days later. As usual, short notes and one-paragraph reviews are most welcome.
Highlights from this month:
• "Anonymous calling": The WikiScanner scandals and anonymity on the Japanese Wikipedia
• A Historical Perspective on Information Systems: A Tool and Methodology for Studying the Evolution of Social Representations on Wikipedia
• An inferior source? Quantitatively analyzing the production and revision of five technology-enhanced learning-related terms on Wikipedia
• Asthma Information Seeking via Wikipedia Between 2015 and 2018: Implications for Awareness Promotion
• Can deep learning techniques improve classification performance of vandalism detection in Wikipedia?
• Feature Analysis for Assessing the Quality of Wikipedia Articles through Supervised Classification
• Following the Fukushima Disaster on (and against) Wikipedia: A Methodological Note about STS Research and Online Platforms
• Hacking History: Redressing Gender Inequities on Wikipedia Through an Editathon
• Improving New Editor Retention on Wikipedia
• Let’s Talk About Refugees: Network Effects Drive Contributor Attention to Wikipedia Articles About Migration-Related Topics
• Pharmacy students can improve access to quality medicines information by editing Wikipedia articles
• The digital knowledge economy index: mapping content production
• Towards Compiling Textbooks from Wikipedia
• Trolls, bans and reverts: simulating Wikipedia
• Why the World Reads Wikipedia: Beyond English Speakers
• Wizard of Wikipedia: Knowledge-Powered Conversational agents
Masssly, Tilman Bayer and Dario Taraborelli
[1] http://meta.wikimedia.org/wiki/Research:Newsletter
Hello John,
In the domain of natural language generation, there has been some work
around generating text from Wikidata triples (mostly for Wikipedia):
In under-resourced languages, domain-independent:
https://link.springer.com/chapter/10.1007/978-3-319-93417-4_21
In under-resourced languages, domain-independent:
http://aclweb.org/anthology/N18-2101
In English, for the biography domain (Wikidata and DBpedia):
https://www.sciencedirect.com/science/article/pii/S1570826818300313
In English, for the biography domain:
https://arxiv.org/abs/1702.06235
Since we worked in this field, please reach out if you have questions
regarding those publications.
Best,
Lucie
On Wed, 23 Jan 2019 at 13:58, john cummings <mrjohncummings(a)gmail.com>
wrote:
> Hi all
>
> I'm putting together an overview of Wikimedia for a conference submission
> and their theme is AI. Does anyone have any examples of use of Wikimedia by
> AI projects?
>
> Thanks
>
> John
--
Lucie-Aimée Kaffee
Web and Internet Science Group
School of Electronics and Computer Science
University of Southampton
Hi all
I'm putting together an overview of Wikimedia for a conference submission
and their theme is AI. Does anyone have any examples of use of Wikimedia by
AI projects?
Thanks
John
(sorry for cross-posting on wiki-research and wikidata)
For an event, I am trying to find statistics about the gender gap. At
one point there was a nice website with some information. Now it seems
to be gone.
https://denelezh.dicare.org/gender-gap.php
redirects to https://wdcm.wmflabs.org/WDCM_BiasesDashboard/ but I get
"502 Bad Gateway"
For http://whgi.wmflabs.org/gender-by-date-of-birth.html I see statistics
on changes on a web page (though when I view this page it seems as if the
CSS is missing).
I have tried the Wikidata Query Service. I have a few results here:
https://www.wikidata.org/wiki/User:Fnielsen/Gender
The bad news is that more complex SPARQL queries time out, e.g. persons
across Wikipedias and genders. I can only do it for Wikisource and
Wikiquote.
For instance, I find the ratio of female biographies on the Danish
Wikipedia to be 17.2% compared to the total number of biographies. It
would be interesting to know how this number compares with other
Wikipedias. If I include Wikipedia as a parameter in the SPARQL query it
times out. (Writing a script could possibly solve it).
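A rough sketch of such a script, assuming that one small counting query
per Wikipedia (instead of a single query with the wiki as a parameter)
stays within the time limit; I have not verified that it does for the
large Wikipedias. The helper names are made up for illustration, and the
query assumes a biography is an item that is an instance of (P31) human
(Q5) with a sitelink to the given wiki.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"
FEMALE = "Q6581072"  # Wikidata item for "female"


def build_query(wiki_host, gender_qid=None):
    """Build a query counting humans with an article on one wiki."""
    # Optional restriction on sex or gender (P21).
    gender = f"?item wdt:P21 wd:{gender_qid} ." if gender_qid else ""
    return f"""
    SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {{
      ?item wdt:P31 wd:Q5 .
      {gender}
      ?article schema:about ?item ;
               schema:isPartOf <https://{wiki_host}/> .
    }}"""


def run_count(query):
    """Send one query to the endpoint and return the count (needs network)."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        ENDPOINT + "?" + params,
        headers={"User-Agent": "gender-ratio-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return int(data["results"]["bindings"][0]["count"]["value"])


def female_share(female, total):
    """Percentage of biographies that are about women."""
    return 100.0 * female / total if total else 0.0


def share_for_wiki(wiki_host):
    """Two small queries per wiki: all biographies, then female ones."""
    total = run_count(build_query(wiki_host))
    female = run_count(build_query(wiki_host, FEMALE))
    return female_share(female, total)
```

Calling share_for_wiki("da.wikipedia.org") and then the same function for
other hosts in a loop would give comparable numbers, one wiki at a time.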
For WHGI, I see some CSV files. For instance,
https://figshare.com/articles/Wikidata_Human_Gender_Indicators/3100903
It reports 9447 and 51774 for women and men, respectively. My SPARQL
query does not give these values...
Do we have updated statistics on contributor gender, particularly for
Danish Wikipedia? I know we have some papers on gender and Wikipedia,
see https://tools.wmflabs.org/scholia/topic/Q17002416
"Gender Markers in Wikipedia Usernames" displays 4.6% or 1.2% for
females, while "The Wikipedia Gender Gap Revisited: Characterizing
Survey Response Bias with Propensity Score Estimation" estimates up to
23%. The old UNU survey
https://web.archive.org/web/20110728182835/http://www.wikipediastudy.org/do…
seems not to do gender statistics per Wikipedia.
https://stats.wikimedia.org seems not to have any gender statistics.
--
Finn Årup Nielsen
http://people.compute.dtu.dk/faan/
Hi all,
We’re thrilled to announce the release of WikiConv—a multilingual corpus
reconstructing the complete conversational history of multiple Wikipedia
language editions
<https://github.com/conversationai/wikidetox/tree/master/wikiconv>.
The corpus, a collaboration between Jigsaw, Cornell and the Wikimedia
Foundation, includes over 100M individual conversation threads and 300M
conversational actions extracted from the English, Chinese, German, Greek,
and Russian Wikipedia talk pages.
WikiConv can be used to understand and model conversational turns in online
collaborative spaces, as we showed in an earlier study, predicting when
conversations go awry <https://arxiv.org/abs/1805.05345>.
The reconstruction methodology, as well as its possible applications, are
described in a paper by Hua et al. recently presented at EMNLP 2018
<https://arxiv.org/abs/1810.13181>. You can also watch a video presentation
of this work from the Wikimedia Research showcase
<https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#June_2018> in
June 2018.
The corpus is released under CC0 (CC BY-SA for individual comments). All
the underlying code is available in this GitHub repository
<https://github.com/conversationai/wikidetox/tree/master/wikiconv>.
If you have any questions about the dataset, feel free to contact us at
yiqing(a)cs.cornell.edu.
Best,
Yiqing