Wiki-research-l June 2012

wiki-research-l@lists.wikimedia.org

16 participants
16 discussions

by song＠cs.umn.edu

Pursuant to prior discussions about the need for a research policy on Wikipedia, WikiProject Research is drafting a policy regarding the recruitment of Wikipedia users to participate in studies. At this time, we have a proposed policy, and an accompanying group that would facilitate recruitment of subjects in much the same way that the Bot Approvals Group approves bots. The policy proposal can be found at: http://en.wikipedia.org/wiki/Wikipedia:Research The Subject Recruitment Approvals Group mentioned in the proposal is being described at: http://en.wikipedia.org/wiki/Wikipedia:Subject_Recruitment_Approvals_Group Before we move forward with seeking approval from the Wikipedia community, we would like additional input about the proposal, and would welcome additional help improving it. Also, please consider participating in WikiProject Research at: http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Research -- Bryan Song GroupLens Research University of Minnesota

9 months, 2 weeks

Statistics on wiki page editing/creation by country?

by alina ostling

Hi! I am doing a PhD on online civic participation (e-participation). As part of my research, I carried out a user survey in which I asked how many people had ever edited or created a page on a wiki. Now I would like to compare the results with the overall rate of wiki editing/creation at the country level. I've found some country-level statistics on Wikipedia Statistics (e.g. 3,000 editors of Wikipedia articles in Italy), but data for the UK and France are not available since Wikipedia provides statistics by language, not by country. I'm thus looking for statistics on the UK and France (but am also interested in alternative ways of measuring wiki editing/creation in Sweden and Italy). I would be grateful for any tips! Sunny regards, Alina -- Alina ÖSTLING PhD Candidate European University Institute www.eui.eu

9 years, 7 months

WikiPapers has now over 1,000 publications

by emijrp

Hi all; WikiPapers recently reached the 1,000 publications milestone.[1] It looks like the publication rate peaked in 2009 and has plateaued over the last 3 years. I keep adding more data... but with little help. Don't you like editing wikis? ; ) Regards, emijrp [1] http://wikipapers.referata.com/wiki/List_of_publications -- Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com Pre-doctoral student at the University of Cádiz (Spain) Projects: AVBOT <http://code.google.com/p/avbot/> | StatMediaWiki <http://statmediawiki.forja.rediris.es> | WikiEvidens <http://code.google.com/p/wikievidens/> | WikiPapers <http://wikipapers.referata.com> | WikiTeam <http://code.google.com/p/wikiteam/> Personal website: https://sites.google.com/site/emijrp/

10 years, 6 months

wikitweets: view tweets that reference wikipedia in realtime

by Ed Summers

This is more on the experimental side of "research" but I just finished a prototype realtime visualization of tweets that reference Wikipedia: http://wikitweets.herokuapp.com/ wikitweets is a NodeJS [1] application that listens to the Twitter Streaming API [2] for tweets that contain Wikipedia URLs, and then looks up the relevant Wikipedia article using the API to ultimately stream the information to the browser using SocketIO [3]. The most amazing thing for me is seeing the application run comfortably (so far) as a single process on Heroku with no attached database needed. If you are curious the code is on GitHub [4]. The key to wikitweets working at all is that Twitter allows you to search and filter the stream using the original (unshortened) URL. So for example a Tweet with the text: Question of the Day: What’s the greatest seafaring movie ever? Some suggestions: http://bit.ly/IqsE1e (But anything on water'll work) #QOD [5] is discoverable with a search query like: Question of the Day wikipedia.org [6] Note that "wikipedia.org" doesn't exist in the text of the original tweet at all, since the URL has been shortened by bit.ly -- but it is still searchable because Twitter appears to be unshortening and indexing URLs. Anyhow, I thought I'd share here since this also relied heavily on the various language Wikipedia APIs. //Ed [1] http://nodejs.org [2] https://dev.twitter.com/docs/streaming-api/methods [3] http://socket.io [4] https://github.com/edsu/wikitweets [5] https://twitter.com/#!/EWeitzman/status/195520487357558784 [6] https://twitter.com/#!/search/realtime/Question%20of%20the%20Day%20wikipedi…
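One step in the pipeline described above, mapping an already-unshortened URL to a language edition and article title, can be sketched in a few lines. This is only an illustration in Python, not code from the wikitweets repository (which is a NodeJS application); the function name `wikipedia_article` is hypothetical, and the sketch assumes article links of the form `<lang>.wikipedia.org/wiki/<Title>`.

```python
from urllib.parse import urlparse, unquote


def wikipedia_article(url):
    """Return (language, title) for a Wikipedia article URL, else None.

    Assumes the URL has already been unshortened and has the form
    http(s)://<lang>.wikipedia.org/wiki/<Title>.
    """
    parts = urlparse(url)
    labels = (parts.hostname or "").lower().split(".")
    # Require at least <lang>.wikipedia.org (mobile hosts like en.m.wikipedia.org also pass).
    if len(labels) < 3 or labels[-2:] != ["wikipedia", "org"]:
        return None
    prefix = "/wiki/"
    if not parts.path.startswith(prefix):
        return None
    # Decode percent-escapes and turn underscores back into spaces.
    title = unquote(parts.path[len(prefix):]).replace("_", " ")
    if not title:
        return None
    return labels[0], title
```

The returned pair could then be used to query the matching language edition's API for article details, as the message describes.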

11 years, 7 months

Is use of diacritics common?

by Piotr Konieczny

I wonder if anybody has studied the use of diacritics on Wikipedia? It is my experience that they are commonly used, but some editors are challenging that (http://en.wikipedia.org/w/index.php?title=Wikipedia:Naming_conventions_(use…). I wonder if we have any data on that, or if somebody could create and run some form of query on the database (as I don't code, this is unfortunately beyond my capabilities). -- Piotr Konieczny PhD Candidate Dept of Sociology Uni of Pittsburgh http://pittsburgh.academia.edu/PiotrKonieczny/ http://en.wikipedia.org/wiki/User:Piotrus
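A rough way to approach the question above, given a list of article titles (for example from a title dump), is to count how many titles contain a letter that carries a combining mark. The following is a minimal Python sketch under that assumption; the function names are hypothetical. Note that letters such as "ł" or "ø" have no Unicode decomposition and are therefore not counted by this check.

```python
import unicodedata


def has_diacritic(title):
    """True if any character decomposes into a base letter plus a combining mark."""
    decomposed = unicodedata.normalize("NFD", title)
    # Category "Mn" (nonspacing mark) covers accents, umlauts, cedillas, etc.
    return any(unicodedata.category(ch) == "Mn" for ch in decomposed)


def share_with_diacritics(titles):
    """Fraction of titles containing at least one diacritic (0.0 for an empty list)."""
    titles = list(titles)
    if not titles:
        return 0.0
    return sum(has_diacritic(t) for t in titles) / len(titles)
```

For instance, `share_with_diacritics(["Cádiz", "Berlin", "Flöck", "Paris"])` counts two of the four titles.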

11 years, 9 months

More accurate revert detection in Wikipedia, alternative to MD5 identical revision method

by Floeck, Fabian (AIFB)

For those of you who are interested in reverts: I just presented our paper on accurate revert detection at the ACM Hypertext and Social Media conference 2012. It shows a significant accuracy (and coverage) gain compared to the widely used method of finding identical revisions (via MD5 hash values) to detect reverts: our method detects edit pairs that are significantly more likely to be actual reverts according to editors' perception of a revert and the Wikipedia definition. 35% of the reverts found by the MD5 method in our sample are not assessed to be reverts by more than 80% of our survey participants (accuracy 0%). The new method finds different reverts for these 35% plus 12% more, which show a 70% accuracy. Find the PDF slides, paper and results here: http://people.aifb.kit.edu/ffl/reverts/ I'll be happy to answer any questions. In more detail: The MD5 hash method employed by many researchers to identify reverts (like some other methods, such as using edit comments) is acknowledged to produce some inaccuracies as far as the Wikipedia definition of a revert ("reverses the actions of any editors", "undoing the actions"..) is concerned. The extent of these inaccuracies is usually judged to be not too large, as naturally, most reverting edits are carried out immediately after the edit to be reverted, being an "identity revert" (Wikipedia definition: "..normally results in the page being restored to a version that existed previously"). Still, there has not been a user evaluation assessing how well the detected reverts conform with the Wikipedia definition and what users actually perceive as a revert. We developed and evaluated an alternative method to the MD5 identity revert and show a significant increase in accuracy (and coverage).
34% of the reverts detected by the MD5 hash method in our sample actually fail to be acknowledged as full reverts by more than 80% of users in our study, while our new method performs much better, finding different reverts for these 34% wrongly detected reverts plus 12% more reverts, showing an accuracy of 70% for these newly found edit pairs actually being reverts according to the users. The increased accuracy performance between the reverts detected only by the MD5 and only by our new method is highly significant, while reverts detected by both methods also perform significantly better than those only detected by the MD5 method. Trade-off: Although this method is much slower than the MD5 method (as it is using DIFFs between revisions) it reflects much better what users (and the Wikipedia community as a whole) see as a revert. It thereby is a valid alternative if you are interested in the antagonistic relationships between users on a more detailed and accurate level. There is quite some potential to make it even faster by combining the two methods, decreasing the number of DIFFs to be performed, let's see if we can come around doing that :) The scripts and results listed in the paper can be found at http://people.aifb.kit.edu/ffl/reverts/ Best, Fabian -- Karlsruhe Institute of Technology (KIT) Institute of Applied Informatics and Formal Description Methods Dipl.-Medwiss. Fabian Flöck Research Associate Building 11.40, Room 222 KIT-Campus South D-76128 Karlsruhe Phone: +49 721 608 4 6584 Skype: f.floeck_work E-Mail: fabian.floeck(a)kit.edu<mailto:fabian.floeck@kit.edu> WWW: http://www.aifb.kit.edu/web/Fabian_Flöck KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association
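For readers unfamiliar with the baseline being compared against, the MD5 "identity revert" idea can be sketched as follows. This is a minimal Python illustration of the hash-matching baseline only, not the DIFF-based method from the paper and not the authors' scripts; the function name and the `(rev_id, text)` input format are assumptions made for the example.

```python
import hashlib


def identity_reverts(revisions):
    """Detect identity reverts in a chronological list of (rev_id, text) pairs.

    A revision counts as a revert if its MD5 hash equals that of an earlier,
    non-adjacent revision; every revision in between is treated as reverted.
    Returns a list of (reverting_id, restored_id, [reverted_ids]).
    """
    last_seen = {}  # MD5 digest -> index of the latest revision with that content
    reverts = []
    for i, (rev_id, text) in enumerate(revisions):
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        j = last_seen.get(digest)
        # Skip j == i - 1: identical consecutive revisions undo nothing.
        if j is not None and j < i - 1:
            reverted = [r for r, _ in revisions[j + 1:i]]
            reverts.append((rev_id, revisions[j][0], reverted))
        last_seen[digest] = i
    return reverts
```

Because this baseline only fires when the full page text matches an earlier version exactly, it cannot see partial reverts, which is the kind of gap the message above discusses.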

11 years, 9 months

Wikipedia Historical Attributes Data Set online available for download

by Angelika Adam

Hi all, Guillermo Garrido (NLP Group, UNED, Spain) and Enrique Alfonseca (Google Research Zurich), partners of ours in the RENDER project [1], have extracted a data set that contains all attribute-value pairs of infoboxes from English Wikipedia articles since 2003. This 5.5 GB data set, called Wikipedia Historical Attributes Data (WHAD), is freely available on the download page of the RENDER toolkit [2]. More detailed information about the data set can be found on Enrique Alfonseca's website [3]. Enrique will attend the Wikipedia Academy 2012 [4] and is going to present his work during Paper Session III: Analyzing Wikipedia Article Data [5] on Saturday. A short preview of this paper was published in the current Research:Newsletter [6]. Best regards from Berlin, Angelika [1] http://meta.wikimedia.org/wiki/RENDER [2] http://toolserver.org/~RENDER/toolkit/downloads/ [3] http://alfonseca.org/eng/research/whad.html [4] http://wikipedia-academy.de/ [5] http://wikipedia-academy.de/2012/wiki/Schedule#Paper_Session_III:_Analysing… [6] https://meta.wikimedia.org/wiki/Research:Newsletter/2012-06-25 -- Angelika Adam Project Manager Wikimedia Deutschland e.V. | Obentrautstraße 72 | 10963 Berlin Tel. (030) 219 158 260 http://www.wikimedia.de/ Imagine a world in which every person has free access to the sum of all human knowledge. Help us make it happen! Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit organization by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.

11 years, 9 months

The Wikimedia Research Newsletter 2(6) is out

by Tilman Bayer

The new Wikimedia Research Newsletter is out: https://meta.wikimedia.org/wiki/Research:Newsletter/2012-06-25 In this issue: 1 Dynamics of edit wars 2 Who deletes Wikipedia 3 Evaluating and predicting interlingual links in Wikipedia 4 "Wikipedia Academy" preview 5 Special issue of "Digithum" on Wikipedia research 6 Briefly ••• 26 publications were covered in this issue ••• Thanks to Piotr Konieczny, Evan Rosen and Daniel Mietchen for their contributions There's more: * Follow us on https://twitter.com/#!/WikiResearch or https://identi.ca/wikiresearch * Receive this newsletter by mail: https://lists.wikimedia.org/mailman/listinfo/research-newsletter * Subscribe to the RSS feed: https://blog.wikimedia.org/c/research-2/wikimedia-research-newsletter/feed/ * Download the full 45-page PDF of Volume 1 (2011) and a dataset of all references covered in it: http://blog.wikimedia.org/?p=10655 Tilman Bayer and Dario Taraborelli -- Tilman Bayer Senior Operations Analyst (Movement Communications) Wikimedia Foundation IRC (Freenode): HaeB

11 years, 10 months

Dynamics of Conflicts in Wikipedia

by Taha Yasseri

Dear Wikipedia researchers! Our manuscript on "Dynamics of Conflicts in Wikipedia" has now been published by PLoS ONE and is available at: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0038869 I would be delighted to receive your comments and remarks. Best, Taha Dr. Taha Yasseri. --------------------------------------------- www.phy.bme.hu/~yasseri Department of Theoretical Physics Institute of Physics Budapest University of Technology and Economics Budafoki út 8. H-1111 Budapest, Hungary tel: +36 1 463 4110 fax: +36 1 463 3567 --------------------------------------------- -- Taha.

11 years, 10 months

Re: [Wiki-research-l] Upcoming hackathon for experts AND newbies: Washington, DC, USA July 10-11

by Sumana Harihareswara

On 06/19/2012 03:41 AM, Sumana Harihareswara wrote: > This is a reminder that you're invited to the pre-Wikimania hackathon, > 10-11 July in Washington, DC, USA: > > https://wikimania2012.wikimedia.org/wiki/Hackathon > > In order to come, you have to register for the Wikimania conference: > > https://wikimania2012.wikimedia.org/wiki/Registration > > (Unfortunately, the period for requesting scholarships is now over.) > > At the hackathon, we'll have trainings and projects for novices, and we > welcome creators of all Wikimedia technologies -- MediaWiki, gadgets, > bots, mobile apps, you name it -- to hack on stuff together and teach > each other. > > Hope to see you! Actually, you don't have to register for Wikimania to come to the hackathon. The registration fee is only required for the main conference days; everyone is welcome to come to the hackathon days and unconference for free. So tell your DC friends to sign up at https://wikimania2012.wikimedia.org/wiki/Hackathon and come! -- Sumana Harihareswara Engineering Community Manager Wikimedia Foundation

11 years, 10 months
