Hello,
If I remember correctly, there was some interesting research discussed
here about citations to articles cited on Wikipedia. Can someone help
me find that work?
It is hard to google for, because I keep turning up research on citing
Wikipedia itself, which is not what I'm after. The question is: "Are
articles that appear in Wikipedia citations preferentially cited when
compared with other comparable articles?"
This is related to the so-called "Matthew Effect" and along these
lines I found a recent article that says "controlling for field and
impact factor, the odds that an open access journal is referenced on
the English Wikipedia are 47% higher compared to closed access
journals." http://arxiv.org/abs/1506.07608
However, I'm interested in the "downstream" side, not the "upstream" side.
TIA.
Joe
--
RMS: "I am not on vacation, but I am at the end of a long time delay.
I am located somewhere on Earth, but as far as responding to email is
concerned, I appear to be well outside the solar system."
Hi everybody,
We’re preparing the August 2015 research newsletter and are looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201508 and add your name next to any paper you are interested in covering. We encourage reviews of individual papers/posters from the just-ended OpenSym 2015 conference. Our target publication date is Wednesday, August 26 UTC. As usual, short notes and one-paragraph reviews are most welcome.
Highlights from this month:
An Analysis of the Application of Wikipedia Corpus on the Lexical Learning in the Second Language Acquisition
Cultural Anthropology through the Lens of Wikipedia: Historical Leader Networks, Gender Bias, and News-based Sentiment
Wikipedia in Education: Acculturation and learning in virtual communities
Exploiting Wikipedia for Information Retrieval Tasks
Analysing Wiki Quality using Probabilistic Model Checking
An Exploratory Study of Free / Libre / Open Source Software Organizations
WikiSERM: Wikipedia Vandalism Detection Through Sequential Event Risk Measure
Enabling Complex Wikipedia Queries - Technical Report
Policies For The Production Of Contents In The Wikipedia, The Free Encyclopedia
Validity Claims Of Information In Face Of Authority Of The Argumentation In Wikipedia
What Works Well With Wikidata?
Management and the Future of Open Collaboration
GLAWI, a free XML-encoded Machine-Readable Dictionary built from the French Wiktionary
DBpedia Commons: Structured Multimedia Metadata from the Wikimedia Commons
Automatically Expanding the Synonym Set of SNOMED CT using Wikipedia
Content Volatility of Scientific Topics in Wikipedia: A Cautionary Tale
Automated News Suggestions for Populating Wikipedia Entity Pages
A Comparative Survey of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO
Networked Knowledge: Approaches to Analyzing Dynamic Networks of Knowledge in Wikis for Mass Collaboration
IWNLP: Inverse Wiktionary for Natural Language Processing
Detecting Vandalism on Wikipedia across Multiple Languages
If you have any questions about the format or process, feel free to get in touch off-list.
Masssly, Tilman Bayer and Dario Taraborelli
[1] http://meta.wikimedia.org/wiki/Research:Newsletter
== Apologies for Crossposting ==
Below you will find useful information and links about the program, our
satellite events and registration opportunities for
SEMANTiCS 2015 -- 11th International Conference on Semantic Systems
15-17 September 2015, Vienna / Austria
http://www.semantics.cc/ // #semantics2015 // #semanticsconf
*--- Conference Scope ---*
The annual SEMANTiCS conference is the meeting place for professionals
who make semantic computing work, understand its benefits, and know
its limitations. Every year, SEMANTiCS attracts information managers,
IT architects, software engineers, and researchers from organisations
ranging from NPOs and universities to public administrations and the
largest companies in the world.
*--- Conference Program ---*
The 2015 edition offers a rich program consisting of 5 keynotes, 24
scientific presentations, 30 industry talks, 38 posters and various
workshops and social events. For details please visit our program page:
http://www.semantics.cc/programme
*--- Keynote Speakers ---*
· Jeanne Holm -- Chief Knowledge Architect at NASA
· Peter Mika -- Director Semantic Lab at Yahoo
· Oscar Corcho -- Associate Professor for Artificial Intelligence,
Universidad Politécnica de Madrid
· Klaus Tochtermann -- Leibniz Information Center for Economics
· Sam Rehman -- CTO at EPAM Systems
*--- Workshops & Satellite Events ---*
*MeetUp: SMART DATA SOLUTIONS*
An outlook into the world of data-centric business, technologies and
innovations: data is everywhere these days, and efficient data management
is THE key factor for success in nearly all industries. McKinsey lists
data as a key factor of production alongside labor and capital in one of
its recent reports. Furthermore, data is produced in huge amounts by
sensors, social networks and mobile devices, and the amount of data
available worldwide grows exponentially…
Place: Haus der Ingenieure, Eschenbachgasse 9, 1010 Wien
Date: 15.09.2015, Entrance: 18:30 CET; Event: 19:30 - 22:30 CET
*2nd International Workshop on Geospatial Linked Data*
In recent years, Semantic Web technologies have strengthened their
position in the areas of data and knowledge management. Standards for
organizing and querying semantic information, such as RDF(S) and SPARQL,
are adopted by large academic communities, while corporate vendors adopt
semantic technologies to organize, expose, exchange and retrieve their
datasets as Linked Data.
Chairs: Alejandra Garcia-Rojas M. (Ontos AG), Robert Isele, René
Pietzsch, and Jens Lehmann (AKSW, University of Leipzig)
Date: 15th of September 2015, 09.00 to 13.00 CEST
*The SEMANTIC EXPERIENCE Coffee & Cocktail LOUNGE*
Sponsored Event: Enjoy semantic technology inspiration in a relaxed
atmosphere. Refresh yourself with Viennese coffee or a cocktail and get
in touch with semantic industry experts.
Date: 15th of September 2015, 15.45 to 18.00 CEST
Room: LC Club Room
*Linked Data in Industry 4.0*
The overall goal of the workshop is to identify challenges and
limitations from the manufacturing engineering industry in the scope of
the mentioned design principles, and bring them together with experts
and solution approaches from the linked data community in the scope of
Industry 4.0.
Chairs: Thomas Moser (FH St. Pölten), Stefan Hupe (IoT Austria)
Date: 15th of September 2015, 14.00 to 17.00 CEST
*European Data Economy Workshop - Focus Data Value Chain & Big and Open
Data*
This workshop gives an overview of the state of the art of Big and Open
Data initiatives in Europe, their impact on the European economy and
their benefits for European society. Representatives from the Big Data
Value Association, the annual European Data Forum and data-related
projects will participate during the first session of the workshop.
Furthermore, it presents the Austrian Big Data Study carried out in
2014 by AIT and IDC.
Chairs: Nelia Lasierra (STI Innsbruck), Martin Kaltenböck (Semantic Web
Company)
Date: 15th of September 2015, 09.00 to 13.00 CEST
*1st Workshop on Data Science: Methods, Technology and Applications
(DSci15)*
This workshop is meant as an opportunity to bring together researchers
and practitioners interested in data science to present their ideas and
discuss the most important scientific, technical and socio-economical
challenges of this emerging field.
Chairs: Bernhard Haslhofer (AIT - Austrian Institute of Technology),
Elena Simperl (Univ. Southampton), Rainer Stütz (AIT - Austrian
Institute of Technology), Ingo Feinerer (FH Wiener Neustadt)
Date: 15th of September 2015, 09.00 to 17.00 CEST
*Workshop on Linked Data Strategies - Commercialisation of Interlinked Data*
In this workshop, we will give several demos and concrete examples of
how Linked Data can be used by enterprises in various industries. The
workshop aims to give users and providers of Linked Data valuable
methods and best practices at hand, which help them to make profound
decisions in their Linked Data projects.
Chairs: Christian Dirschl (Wolters Kluwer), Andreas Blumauer (Semantic
Web Company), Tassilo Pellegrini (FH St. Pölten)
Date: 15th of September 2015, 14.00 to 15.30 CEST
*Hackathon on "The power of Linked Data in Agriculture and Food Safety"*
“Data+Need=Hack”, this is the idea of a hackathon that brings together
like-minded people to develop, in a short time frame, novel solutions to
problems around the theme “Agriculture and Food Safety”.
Chairs: Christian Blaschke (Semantic Web Company, Vienna), Stasinos
Konstantopoulos (Institute of Informatics & Telecommunications of the
NCSR Demokritos, Athens)
Date: 18th of September 2015, 10.00 to 16.00 CEST
*--- Registration ---*
To register, please go to:
http://www.semantics.cc/registration
We are looking forward to meeting you at SEMANTiCS 2015!
Hi all
I'm currently working as Wikimedian in Residence at UNESCO. Does anyone
know of any work done to compare the goals of Wikimedia with those of
other organisations who work in education? My specific interest is
comparisons of Wikimedia and the United Nations and its departments.
Thanks
John
I ran my first training session using the Visual Editor this morning and hit
what appeared to be a show-stopping bug. It appeared that the two new users
(thankfully I had only 2) could not create a citation. They found themselves
in an infinite loop of Save Page with CAPTCHA when they tried to create a
citation.
By the end of the session, I managed to refine the bug to a combination of
"new user", "new article" (although created by me, not the new users), and
citations involving a live URL, duly reported at
https://en.wikipedia.org/wiki/Wikipedia:VisualEditor/Feedback#New_users_unable_to_create_citation_with_a_live_external_link_in_it
Ironically it first happened on their newly created User Pages, where we were
practising our new Wikipedia skills before tackling "real articles". Then
on the "real articles" I had created earlier for them to use (a training
approach that has the benefit of not unleashing a horde of angry
watchlisters when they make some silly mistake, which occurs if you let new
people make their early edits on "popular articles"). (Spot the pattern,
both were new articles!).
Now if this had happened to a new user sitting at home, they would have been
stymied. Because I was there to hold their hands in a training setting, I
found a way around the problem by logging them in as me, and we continued the
training session on that basis (not an option for the user sitting at
home, typing in CAPTCHA responses until they got frustrated and
walked away).
So, Aaron, it may be that your research on the impact of the VE was affected
by this bug. I imagine that users affected would have eventually aborted the
edit as they were unable to save, unless by chance they realised
that the problem was caused by their citation and removed the
citation, saving just the text changes. It's hard to say what the
likelihood of a new user being affected is, as the problem seemed to relate
to the age of the article (I am autopatrolled so I don't think the new
articles would have any "might be dodgy" status flags on them, but I am not
familiar with how that side of things works).
Also, is this experiment (or one similar) currently running? It's just that
when we went into the Preferences of the two new user accounts to enable the
VE, one of them already had it enabled (yet I had seen both new user
accounts created in front of me a couple of minutes earlier), so there was
no possibility that this was anything other than a default setting for one
of the two users. I thought enabling the VE was normally strictly opt-in?
Kerry
To whom it may concern,
I know I already posted this mail to the list; however, there is an
update. The tool is online: http://david-strohmaier.com/QAE/index.php
A tutorial video is available here (the video does not cover all
features of the tool):
http://david-strohmaier.com/QAE/tutorialVideo.html
FYI: This version has only been tested with Google Chrome so far. If you
want to perform edits with the tool, please use your sandbox to try it out.
If you like the tool and you want to support our research, please do the
questionnaires below.
Thanks to all who have already done so!
My name is David Strohmaier and I'm working on my master's thesis. We
implemented a Quality Assisted Editor for Wikipedia that should help to
detect quality flaws quickly.
Right now we are in the middle of evaluating the tool.
If you could spare 15 - 25 minutes of your time, you would really
help us with the evaluation.
All you have to do is evaluate the quality of two Wikipedia articles.
How to do it?
1. Please read the Wikipedia:Featured article criteria before you start.
Each question in the surveys is tagged with a hint on which criteria it
is targeting.
Wikipedia:Featured article criteria:
https://en.wikipedia.org/wiki/Wikipedia:Featured_article_criteria
2. Open the first Wikipedia article (Doctor Phosphorus) and the
connected survey. It is important that you open the Doctor Phosphorus
article via the link in this mail, because a specific article revision
is chosen!
Doctor Phosphorus:
https://en.wikipedia.org/w/index.php?title=Doctor_Phosphorus&oldid=445748339
http://goo.gl/forms/ESgoyfd8AD
3. Please fill out the survey. The questions concerning the whole
article should be answered based on the three sections you have read.
4. Open the second Wikipedia article (Moon) and the connected survey.
It is important that you open the Moon article via the link in this
mail, because a specific article revision is chosen!
Moon:
https://en.wikipedia.org/w/index.php?title=Moon&oldid=670521118
http://goo.gl/forms/PnlbMtA1X3
5. Please fill out the survey. The questions concerning the whole
article should be answered based on the three sections you have read.
Thank you!
If you have any questions, please contact me: david.strohmaier(a)gmx.at
Best regards, David Strohmaier
Kerry Raymond wrote:
>...
> we could use big data to try to pro-actively find patterns of undesirable behaviour....
I agree, including with the specific examples given, most if not all
of which could be implemented within a general accuracy review system.
Please consider the three-level (actually four-level, if you count the
community) review system depicted at http://i.imgur.com/NhvyfXc.png
It can be gamed to secure an unfair advantage if the reviewer
reputation database (which includes reviewer identity) is open, but if
the identities, reputation measurements, and algorithms are kept
confidential, then it can be very robust against several kinds of
bias, vandalism, incompetence, and other quality deficits. That is in
part for the same reasons which make disclosing reviewer
identity less accurate.
I am trying to figure out how the reviewer reputation database can be
audited without disclosing reviewer identities. It's very difficult to
imagine releasing any information about the reviewers and their
actions, even in aggregate, without exposing information which could
potentially be used to gain an unfair advantage, but if the identities
are double-blinded and other related information is coded, it's still
possible to audit what seem to be some fairly important aspects of the
system's operation without any obvious vulnerabilities.
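One way to sketch the double-blinding step (a minimal illustration only, not the system described above; the record fields, key handling, and function names are all my own assumptions): replace reviewer identities with keyed-hash pseudonyms before releasing audit records. An auditor can then check per-reviewer consistency, because the same reviewer always maps to the same pseudonym within an export, without being able to recover who the reviewer is.

```python
import hashlib
import hmac
import secrets

def make_audit_export(records, key=None):
    """Pseudonymize reviewer identities in audit records.

    records: list of dicts with 'reviewer', 'article', 'verdict' keys
    (hypothetical schema). Returns copies with reviewer names replaced
    by stable HMAC pseudonyms; the key stays with the system operator.
    """
    if key is None:
        key = secrets.token_bytes(32)  # fresh secret per export
    export = []
    for rec in records:
        # Same reviewer + same key -> same pseudonym, so actions remain
        # linkable within the export without exposing the identity.
        pseudonym = hmac.new(key, rec["reviewer"].encode(),
                             hashlib.sha256).hexdigest()[:12]
        export.append({"reviewer": pseudonym,
                       "article": rec["article"],
                       "verdict": rec["verdict"]})
    return export

records = [
    {"reviewer": "alice", "article": "A", "verdict": "accurate"},
    {"reviewer": "alice", "article": "B", "verdict": "inaccurate"},
    {"reviewer": "bob",   "article": "A", "verdict": "accurate"},
]
out = make_audit_export(records)
assert out[0]["reviewer"] == out[1]["reviewer"]  # alice's actions stay linked
assert out[0]["reviewer"] != out[2]["reviewer"]  # but bob is distinct
```

Using a fresh key per export also prevents linking pseudonyms across exports, which limits how much an adversary could accumulate to game the system.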
Cost is a huge part of the system's operation, and there are so many
aspects of cost that need to be measured in many different ways. The
system should be open to volunteers as well as paid reviewers in order
to make sure it has throughput guarantees, but setting the payment
schedule is difficult in an international context where the cost of
living varies so widely. My initial set of paid reviewers will be
around 96% US-based, if I remember right.
If anyone wants to try the same thing simultaneously, please do.
Eliminating duplication of effort would be a nice problem to have down
the road.
Subject: Wiki-research-l Digest, Vol 120, Issue 16
Can someone who has access to that paper please share the method and results as fair use?
Here are a few key paragraphs from the methods and results section.
Method
"We test two “social conditions”: Anonymous vs. Identity disclosure (anonymous: participants do not know each other; identity: participants were first asked to introduce themselves before experiments and, during the experiments, participants were shown the name of their reviewer on the computer screen and vice versa). In addition to social condition, we introduce “repeated matching” (1 round: matching of counterpart and reviewer changes every round; 3 round: each participant will be matched with the same counterpart and the same reviewer for three consecutive rounds) as another dimension to compare the effect of short-term and long-term social pressure. Employing a 2 × 2 design, our experiments consist of four treatments, varying in two dimensions.
Parameters used are P=120, k=0.15. NE prediction is e*=10 under correct-reporting (Proposition 1) and e*=1 under over-reporting (Proposition 2). Expected profit is π*=45 and π*=59.85, respectively. Experiments were implemented using z-Tree software (Fischbacher 2007). Participants of experiments consist of 108 undergraduate students at a large public research university in the United States. For the Anonymous 1 round treatment, we conducted three experimental sessions. Each of the rest three treatments consisted of two sessions. Each session had 12 participants with 15 decision rounds. Subjects received a course credit and cash payment based on their results of the game."
Findings
"In Anonymous, correct-reporting is most frequently observed at 42% (1-round) and 59% (3-round). In Identity, over-reporting accounts for more than 64% and 72%, respectively. The findings in reporting behaviors are in line with our expectation as shown on Propositions 1 and 2. That is, when identity is revealed, reviewers feel social pressure to be nice toward players. Therefore, evaluation systems are considered less reliable because they fail to reflect the true efforts.
Another finding is that reporting behavior seems strengthened for a longer period of matching. Repeated matching seems to make reviewers form a stronger social pressure. For example, the higher proportion of over-reporting in Identity 3 round suggests an increased social pressure due to expectation of a long-term relationship between reviewers and players."
"Median efforts from four treatments are all lower than that of the standard economics prediction (assuming objective-reporting, e*=10). This suggests that players suspect reliability of evaluation systems. For Anonymous 1 round and 3 round, where correct-reporting is observed at a high rate, the median efforts are both 8. This high median effort implies that players in anonymous settings generally hold a stronger belief that evaluation systems are still reliable to some extent.
Surprisingly, for Identity 1 round, where scores are mostly inflated, the median effort is still 8. The average effort is at 8.2, even higher than both Anonymous treatments (7.3 and 7.6). This finding suggests a gap between players’ belief and reviewers’ behavior. While reviewers feel strong social pressure by revealing identity, players doubt that social pressure of reviewers will be enough to be nice. However, such doubt disappears when players recognize a long-term relationship. For Identity 3 round, the median and average efforts decrease into 5 and 5.6, respectively, the lowest level among all treatments."
Kevin Crowston | Distinguished Professor of Information Science | School of Information Studies
Syracuse University
348 Hinds Hall
Syracuse, New York 13244
t (315) 443.1676 | f (315) 443.5806 | e crowston@syr.edu
crowston.syr.edu<http://crowston.syr.edu/>
Thanks, Itzik. Forwarding to the Research list.
Pine
On Aug 13, 2015 2:32 AM, "Itzik - Wikimedia Israel" <itzik(a)wikimedia.org.il>
wrote:
> *Regards,Itzik Edri*
> Chairperson, Wikimedia Israel
> +972-(0)-54-5878078 | http://www.wikimedia.org.il
> Imagine a world in which every single human being can freely share in the
> sum of all knowledge. That's our commitment!
>
>
> ---------- Forwarded message ----------
> From: Itzik - Wikimedia Israel <itzik(a)wikimedia.org.il>
> Date: Thu, Aug 13, 2015 at 12:29 PM
> Subject: [PRESS] Wikipedia suddenly lost a massive amount of traffic from
> Google
> To: Communications Committee <wmfcc-l(a)lists.wikimedia.org>, "A mailing
> list for the Analytics Team at WMF and everybody who has an interest in
> Wikipedia and analytics." <analytics(a)lists.wikimedia.org>
>
>
>
> http://www.businessinsider.in/Wikipedia-suddenly-lost-a-massive-amount-of-t…
>
> From what I see in the ReportCard (http://reportcard.wmflabs.org/), there
> is a decrease, but far less than mentioned in this article
>
> ----
>
> Wikipedia suddenly lost a massive amount of traffic from Google
>
> Wikipedia co-founder Jimmy Wales speaks at a press conference ahead of the
> London Wikimania conference in 2014.
> The amount of traffic that Google sends to Wikipedia has declined by more
> than 250 million visits per month, according to SimilarWeb, the traffic
> measurement company.
> Roy Hinkis, SW's head of SEO, says the downgrade is a genuine mystery:
>
> What I saw shocked me. Wikipedia lost an insane amount of traffic in the
> past 3 months. And by insane I mean that the free encyclopedia site lost
> more than 250 million desktop visits in just 3 months!
>
> Here are the most recent numbers, in visits, per SimilarWeb:
>
> May: 2.7 billion
> July: 2.4 billion
> [chart: Wikipedia desktop visits by month, via SimilarWeb.com]
>
> Hinkis suspects that Google has changed its search algorithm to favour
> actual brands and company web sites over the Wikipedia entries that are
> about them:
>
> Wikipedia has long since been a huge competitor for brands in terms of
> website traffic. SEOs aren't crazy about it, because it takes a huge chunk
> of our traffic.
>
> However, what Wikipedia taketh from other sites' traffic, Wikipedia also
> giveth: The site is so massive that it also drives a fair amount of traffic
> onward to other sites. But the less traffic Wikipedia gets, the less it can
> give.
>
> Business Insider asked Google for comment but we have not heard back yet,
> so let's speculate.
>
> One of the major trends happening at Google is the company's preference
> for inserting its own content above the content of other non-Google web
> sites, even when those sites may be better resources than Google itself.
> Here is an example. If you're trying to remember who won the World Cup last
> year, you might get this Google result:
>
> [screenshot: Google search result with answer capsule, via Google]
>
> If you click on that down-arrow that Google provides for the "roster and
> overview," you get a capsule on the German team. That might be all you
> need, and Google believes this is so useful it might save you a click.
>
> The problem is that a few months ago that click might have gone to
> Wikipedia.
>
> Several information aggregation companies are suffering from this, most
> notably Yelp. Yelp has repeatedly and loudly complained that Google's own
> promo boxes on the top of search engine results pages are siphoning traffic
> from better-quality sites that normally would have come top of the results.
>
> The issue is at the heart of the EU's investigation into whether Google is
> abusing its monopoly in Europe to favour its own properties and distort the
> market for search traffic on competing sites.
>
> We're not saying that's what is happening to Wikipedia. But it's certainly
> an unfortunate coincidence.
>
> _______________________________________________
> Analytics mailing list
> Analytics(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>
Attached is the announcement and table of contents of a new issue of the open-access journal "Human Computation". No Wikipedia-themed papers this time, but some articles may still be of interest to people on this list.
Gruß,
Fabian
Human Computation – A Transdisciplinary Journal
--------------------------------------------------------------------
Volume 2 • Issue 1 • August 2015
Dear Reader,
Recent gatherings have inspired researchers to address field development challenges in human computation head-on via cogent analysis and carefully conceived recommendations. Some of these key community findings, both technical and ethical, are presented herein. We interpret these activities as a positive signal of community growth and field maturation.
Collectively yours,
Pietro Michelucci & Elena Simperl, Co-Editors-in-Chief
(Read the full Letter from the Editors here: http://goo.gl/i4hRuC)
This issue features:
# Opinion · Crowdsourcing and the Semantic Web: A Research Manifesto #
A roadmap guides the evolution of new research that is emerging at the exciting intersection of crowdsourcing and the Semantic Web.
· Cristina Sarasua et al.
→ read now: http://goo.gl/HrKY8Y
# Research · Privacy in Participatory Research: Advancing Policy to support Human Computation #
How do privacy-related policies align with actual practices in participatory systems? This study surveys the policies of 30 participatory research projects and finds that many host incomplete policies or inaccurately describe their practices.
· Anne Bowser-Livermore & Andrea Wiggins ·
→ read now: http://goo.gl/7CT92Q
# Research · Home is Where the Lab is: A Comparison of Online and Lab Data From a Time-sensitive Study of Interruption #
Is it possible to engage the crowd online in long-duration, time-sensitive tasks that require continuous concentration?
· Sandy J. J. Gould et al. ·
→ read now: http://goo.gl/TAefPW
# Research · Crowdsourcing the mapping problem for design space exploration of custom reconfigurable architecture designs #
Coarse grained reconfigurable architectures (CGRAs) hold great promise for the portable/wearable electronics domain. This study demonstrates one way that the crowd can provide reliable mappings, outperforming a custom Simulated Annealing algorithm.
· Anil Kumar Sistla et al. ·
→ read now: http://goo.gl/9SYmBu
# Research · Crowds of Crowds: Performance based Modeling and Optimization over Multiple Crowdsourcing Platforms #
How can statistical simulations of crowds be used to optimize crowd-worker performance over multiple platforms? A platform recommendation tool can help.
· Sakyajit Bhattacharya et al. ·
→ read now: http://goo.gl/B3T05o
If you would like to keep up to date on our activities and calls, you can follow us on https://twitter.com/HCjnl, or join us on https://www.facebook.com/hcjournal and https://www.linkedin.com/groups/Human-Computation-interdisciplinary-journal… . If you would like to receive quarterly new issue digests to your email, you can subscribe at https://tinyletter.com/HCjnl