Pursuant to prior discussions about the need for a research
policy on Wikipedia, WikiProject Research is drafting a
policy regarding the recruitment of Wikipedia users to
participate in studies.
At this time, we have a proposed policy and an accompanying
group that would facilitate the recruitment of subjects in much
the same way that the Bot Approvals Group approves bots.
The policy proposal can be found at:
http://en.wikipedia.org/wiki/Wikipedia:Research
The Subject Recruitment Approvals Group mentioned in the proposal
is being described at:
http://en.wikipedia.org/wiki/Wikipedia:Subject_Recruitment_Approvals_Group
Before we move forward with seeking approval from the Wikipedia
community, we would like additional input about the proposal,
and would welcome additional help improving it.
Also, please consider participating in WikiProject Research at:
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Research
--
Bryan Song
GroupLens Research
University of Minnesota
Hi all,
For all Hive users on stat1002/1004: you may have seen a deprecation
warning when launching the Hive client saying that it is being replaced
with Beeline. The Beeline shell has always been available to use, but it
required supplying a database connection string every time, which was
pretty annoying. We now have a wrapper script
<https://github.com/wikimedia/operations-puppet/blob/production/modules/role…>
set up to make this easier. The old Hive CLI will continue to exist, but we
encourage moving over to Beeline. You can use it by logging into the
stat1002/1004 boxes as usual, and launching `beeline`.
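Illustratively, the difference looks like this (the JDBC connection string below is a made-up placeholder, not the real cluster address):

```
# Previously, Beeline needed an explicit connection string every time,
# something along these lines (placeholder host and port):
beeline -u "jdbc:hive2://hive-server.example.net:10000/default" -n "$USER"

# With the wrapper script in place, it is simply:
beeline
```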
There is some documentation on this here:
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Beeline.
If you run into any issues using this interface, please ping us on the
Analytics list or #wikimedia-analytics or file a bug on Phabricator
<http://phabricator.wikimedia.org/tag/analytics>.
(If you are wondering "stat1004, whaaat?" - there should be an announcement
coming up about it soon!)
Best,
--Madhu :)
We’re glad to announce the release of an aggregate clickstream dataset extracted from English Wikipedia
http://dx.doi.org/10.6084/m9.figshare.1305770
This dataset contains counts of (referer, article) pairs aggregated from the HTTP request logs of English Wikipedia. This snapshot captures 22 million (referer, article) pairs from a total of 4 billion requests collected during the month of January 2015.
This data can be used for various purposes:
• determining the most frequent links people click on for a given article
• determining the most common links people followed to an article
• determining how much of the total traffic to an article clicked on a link in that article
• generating a Markov chain over English Wikipedia
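The first two uses can be sketched in a few lines, assuming the data is available as (referer, article, count) triples; the column layout and the toy values below are illustrative, not the published file format:

```python
# Toy (referer, article, count) triples in the spirit of the dataset;
# the values are made up for illustration.
rows = [
    ("London", "United_Kingdom", 120),
    ("London", "River_Thames", 45),
    ("other-google", "London", 900),
    ("Paris", "France", 80),
]

def top_links_from(article, rows, k=3):
    """Most frequently clicked links on a given article."""
    counts = [(cnt, target) for ref, target, cnt in rows if ref == article]
    return [t for _, t in sorted(counts, reverse=True)[:k]]

def top_referers_to(article, rows, k=3):
    """Most common referers people followed to reach an article."""
    counts = [(cnt, ref) for ref, target, cnt in rows if target == article]
    return [r for _, r in sorted(counts, reverse=True)[:k]]

print(top_links_from("London", rows))   # ['United_Kingdom', 'River_Thames']
print(top_referers_to("London", rows))  # ['other-google']
```

The same triples, normalized per referer, give the transition probabilities of the Markov chain mentioned in the last bullet.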
We created a page on Meta for feedback and discussion about this release: https://meta.wikimedia.org/wiki/Research_talk:Wikipedia_clickstream
Ellery and Dario
Dear Colleagues,
We are very pleased to announce two distinguished Keynotes for the 8th
International Conference on Social Media & Society (July 28-30, 2017,
Toronto, Canada):
* Lee Rainie - Director, Pew Research Center's Internet & American Life
Project <http://www.pewinternet.org/>, USA
* Ronald Deibert - Professor of Political Science, and Director of the
Citizen Lab <http://www.citizenlab.org/> at the Munk School of Global
Affairs, University of Toronto, Canada.
SUBMIT TODAY
We would also like to invite scholarly and original submissions that broadly
relate to the 2017 conference theme on "Social Media for Social Good or
Evil." We welcome both quantitative and qualitative work which crosses
interdisciplinary boundaries and expands our understanding of the current
and future trends in social media research. See the call for proposals at
https://socialmediaandsociety.org/2016/cfp-2017-international-conference-social-media-society/
DEADLINES
* Workshops/ Technical Tutorials - Due December 5, 2016
* Full and Work-in-progress (WIP) Papers - Due January 16, 2017
* Posters - Due March 6, 2017
PUBLISHING OPPORTUNITIES
Full and WIP (short) papers presented will be published in the conference
proceedings by ACM International Conference Proceeding Series (ICPS)
<http://dl.acm.org/citation.cfm?id=2930971&CFID=847001369&CFTOKEN=15617273&preflayout=flat#prox>
and will be available in the ACM Digital Library. All
conference presenters will be invited to submit their extended conference
papers to a special issue of the Social Media + Society journal
(http://sms.sagepub.com/) published by SAGE.
2017 #SMSociety Organizing Committee:
* Anatoliy Gruzd, Ryerson University, Canada - Conference Chair
* Jenna Jacobson, University of Toronto, Canada - Conference Chair
* Philip Mai, Ryerson University, Canada - Conference Chair
* Hazel Kwon, Arizona State University, USA - Poster Chair
* Bernie Hogan, Oxford Internet Institute, UK - WIP Chair
* Jeff Hemsley, Syracuse University, USA - WIP Chair
Advisory Board:
* William H. Dutton, Michigan State University, USA
* Zizi Papacharissi, University of Illinois at Chicago, USA
* Barry Wellman, INSNA Founder, The Netlab Network, Canada
Programme Committee:
* Visit: http://socialmediaandsociety.org/about/
If you have any questions, please contact us via email at
ask(a)socialmediaandsociety.org or on
Twitter at @SocMediaConf <https://twitter.com/SocMediaConf>
Please comment on whether to approve the "Conflict of interest" section
of the draft Code of conduct for technical spaces. This section did not
reach consensus earlier, but changes were made seeking to
address the previous concerns.
You can find the section at
https://www.mediawiki.org/w/index.php?title=Code_of_Conduct/Draft&oldid=229…
You can comment at
https://www.mediawiki.org/wiki/Talk:Code_of_Conduct/Draft#Finalize_new_vers…
.
A position and brief comment is fine.
You can also send private feedback to conduct-discussion(a)wikimedia.org .
I really appreciate your participation.
Thanks again,
Matt Flaschen
+research-l because this is more of a research than an analytics question.
Hi Jane,
What do you mean by acronyms in deletion queues here? Are you talking about
policy links used to justify !votes in deletion discussions, or acronyms
used in deletion comments of AfD'd articles? Or something else entirely?
If #1, this paper
<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.835.9466&rep=rep1&…>
examines the use of a single policy (IAR) in AfD's over time.
If #2, I did a similar (quick and dirty) analysis with AfC recently, here:
https://quarry.wmflabs.org/query/13341
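For a rough sense of what such a count can look like, here is a minimal sketch that tallies policy-style shortcuts (e.g. WP:N) in deletion comments; the comment strings and the regex are illustrative assumptions, not drawn from either analysis above:

```python
import re
from collections import Counter

# Made-up deletion comments, for illustration only.
comments = [
    "Delete per WP:GNG and WP:N",
    "Fails WP:N, no reliable sources found",
    "Speedy per WP:CSD#A7",
]

# Match policy shortcuts of the form WP:XYZ (letters, digits, '#').
SHORTCUT = re.compile(r"WP:[A-Z0-9#]+")

freq = Counter(m for c in comments for m in SHORTCUT.findall(c))
print(freq.most_common())  # WP:N appears twice, the rest once
```

A real analysis would pull the comment field from the logging table (as the Quarry query does) rather than from hard-coded strings.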
Others may be aware of additional resources or analyses.
Best,
Jonathan
On Tue, Nov 22, 2016 at 7:10 AM, Jane Darnell <jane023(a)gmail.com> wrote:
> Hi all,
> Has anyone tried to find the frequency of acronyms used in AfD queues? Any
> information about the deletion queue in language is welcome, thanks.
>
> This came up during a discussion about "enyclopedia worthiness" and how to
> explain this concept to newbies.
> Jane
>
> _______________________________________________
> Analytics mailing list
> Analytics(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>
--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
We're having some major technical issues with Hangout on Air, if you're
joining us for the showcase please use this link, we'll be starting
shortly: https://www.youtube.com/watch?v=xIaMuWA84bY
Dario
Dear all,
The Wikimedia Foundation datasets collection on the Internet Archive
[1] has now surpassed 1 million items (and about 50,000 full database
dumps)! This marks a major milestone in our archiving efforts of
Wikimedia's vast amount of data and ensures that the vital content
submitted by volunteers across the movement is preserved. All this
would not have been possible without the help of many people,
including Nemo, Ariel and Emijrp (thanks!).
We started archiving towards the end of 2011 and reached a milestone
of half a million items back in June 2015. [2] We have since moved on
from archiving just the main database dumps to saving research-worthy
data such as the pageviews data and even attempting to keep a copy of
Wikimedia Commons. Today, we are working on making the items on the
Internet Archive more accessible for researchers by working on an
interface for searching old dumps.
Despite this feat, we are in constant need of more help. If you are a
researcher, a programmer or someone with a computer, we need your help
in many tasks! Have a look at WikiTeam's project [3] or Emijrp's
Wikipedia Archive page [4] for more information. If you regularly work
on the Wikimedia database dumps, please provide your input in the
Dumps-Rewrite project [5] and the API interface [6].
As before, here's to the next million!
[1]: https://archive.org/details/wikimediadownloads
[2]: https://groups.google.com/forum/#!msg/wikiteam-discuss/Vj3oonpYphg/h9HE6r3v…
[3]: https://github.com/WikiTeam/wikiteam
[4]: https://en.wikipedia.org/wiki/User:Emijrp/Wikipedia_Archive
[5]: https://phabricator.wikimedia.org/tag/dumps-rewrite/
[6]: https://phabricator.wikimedia.org/T147177
--
Hydriz Scholz