Pursuant to prior discussions about the need for a research
policy on Wikipedia, WikiProject Research is drafting a
policy regarding the recruitment of Wikipedia users to
participate in studies.
At this time, we have a proposed policy and an accompanying
group that would facilitate the recruitment of subjects in much
the same way that the Bot Approvals Group approves bots.
The policy proposal can be found at:
http://en.wikipedia.org/wiki/Wikipedia:Research
The Subject Recruitment Approvals Group mentioned in the proposal
is being described at:
http://en.wikipedia.org/wiki/Wikipedia:Subject_Recruitment_Approvals_Group
Before we move forward with seeking approval from the Wikipedia
community, we would like additional input about the proposal,
and would welcome additional help improving it.
Also, please consider participating in WikiProject Research at:
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Research
--
Bryan Song
GroupLens Research
University of Minnesota
Hi all,
For all Hive users on stat1002/1004: you may have seen a deprecation
warning when launching the Hive client saying that it is being replaced
with Beeline. The Beeline shell has always been available to use, but it
required supplying a database connection string every time, which was
pretty annoying. We now have a wrapper script
<https://github.com/wikimedia/operations-puppet/blob/production/modules/role…>
set up to make this easier. The old Hive CLI will continue to exist, but we
encourage moving over to Beeline. You can use it by logging into the
stat1002/1004 boxes as usual, and launching `beeline`.
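Illustratively, the difference looks like this (the JDBC connection string below is a made-up placeholder, not the real cluster address):

```
# Previously, Beeline needed an explicit connection string every time,
# something along these lines (placeholder host and port):
beeline -u "jdbc:hive2://hive-server.example.net:10000/default" -n "$USER"

# With the wrapper script in place, it is simply:
beeline
```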
There is some documentation on this here:
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Beeline.
If you run into any issues using this interface, please ping us on the
Analytics list or #wikimedia-analytics or file a bug on Phabricator
<http://phabricator.wikimedia.org/tag/analytics>.
(If you are wondering "stat1004, whaaat?" - there should be an announcement
coming up about it soon!)
Best,
--Madhu :)
We’re glad to announce the release of an aggregate clickstream dataset extracted from English Wikipedia
http://dx.doi.org/10.6084/m9.figshare.1305770
This dataset contains counts of (referer, article) pairs aggregated from the HTTP request logs of English Wikipedia. This snapshot captures 22 million (referer, article) pairs from a total of 4 billion requests collected during the month of January 2015.
This data can be used for various purposes:
• determining the most frequent links people click on for a given article
• determining the most common links people followed to an article
• determining how much of the total traffic to an article clicked on a link in that article
• generating a Markov chain over English Wikipedia
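The first two uses can be sketched in a few lines, assuming the data is available as (referer, article, count) triples; the column layout and the toy values below are illustrative, not the published file format:

```python
# Toy (referer, article, count) triples in the spirit of the dataset;
# the values are made up for illustration.
rows = [
    ("London", "United_Kingdom", 120),
    ("London", "River_Thames", 45),
    ("other-google", "London", 900),
    ("Paris", "France", 80),
]

def top_links_from(article, rows, k=3):
    """Most frequently clicked links on a given article."""
    counts = [(cnt, target) for ref, target, cnt in rows if ref == article]
    return [t for _, t in sorted(counts, reverse=True)[:k]]

def top_referers_to(article, rows, k=3):
    """Most common referers people followed to reach an article."""
    counts = [(cnt, ref) for ref, target, cnt in rows if target == article]
    return [r for _, r in sorted(counts, reverse=True)[:k]]

print(top_links_from("London", rows))   # ['United_Kingdom', 'River_Thames']
print(top_referers_to("London", rows))  # ['other-google']
```

The same triples, normalized per referer, give the transition probabilities of the Markov chain mentioned in the last bullet.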
We created a page on Meta for feedback and discussion about this release: https://meta.wikimedia.org/wiki/Research_talk:Wikipedia_clickstream
Ellery and Dario
Dear Colleagues,
We are very pleased to announce two distinguished Keynotes for the 8th
International Conference on Social Media & Society (July 28-30, 2017,
Toronto, Canada):
* Lee Rainie - Director, Pew Research Center's Internet & American Life
Project <http://www.pewinternet.org/>, USA
* Ronald Deibert - Professor of Political Science, and Director of the
Citizen Lab <http://www.citizenlab.org/> at the Munk School of Global
Affairs, University of Toronto, Canada.
SUBMIT TODAY
We would also like to invite scholarly and original submissions that broadly
relate to the 2017 conference theme on "Social Media for Social Good or
Evil." We welcome both quantitative and qualitative work which crosses
interdisciplinary boundaries and expands our understanding of the current
and future trends in social media research. See the call for proposals at
https://socialmediaandsociety.org/2016/cfp-2017-international-conference-social-media-society/
DEADLINES
* Workshops/ Technical Tutorials - Due December 5, 2016
* Full and Work-in-progress (WIP) Papers - Due January 16, 2017
* Posters - Due March 6, 2017
PUBLISHING OPPORTUNITIES
Full and WIP (short) papers presented will be published in the conference
proceedings by ACM International Conference Proceeding Series (ICPS)
<http://dl.acm.org/citation.cfm?id=2930971&CFID=847001369&CFTOKEN=15617273&preflayout=flat#prox>
and will be available in the ACM Digital Library. All
conference presenters will be invited to submit their extended conference
papers to a special issue of the Social Media + Society journal
(http://sms.sagepub.com/) published by SAGE.
2017 #SMSociety Organizing Committee:
* Anatoliy Gruzd, Ryerson University, Canada - Conference Chair
* Jenna Jacobson, University of Toronto, Canada - Conference Chair
* Philip Mai, Ryerson University, Canada - Conference Chair
* Hazel Kwon, Arizona State University, USA - Poster Chair
* Bernie Hogan, Oxford Internet Institute, UK - WIP Chair
* Jeff Hemsley, Syracuse University, USA - WIP Chair
Advisory Board:
* William H. Dutton, Michigan State University, USA
* Zizi Papacharissi, University of Illinois at Chicago, USA
* Barry Wellman, INSNA Founder, The Netlab Network, Canada
Programme Committee:
* Visit: http://socialmediaandsociety.org/about/
If you have any questions, please contact us via email at
ask(a)socialmediaandsociety.org or on
Twitter at @SocMediaConf <https://twitter.com/SocMediaConf>
Please comment on whether to approve the "Conflict of interest" section
of the draft Code of conduct for technical spaces. This section did not
reach consensus earlier, but changes were made seeking to
address the previous concerns.
You can find the section at
https://www.mediawiki.org/w/index.php?title=Code_of_Conduct/Draft&oldid=229…
You can comment at
https://www.mediawiki.org/wiki/Talk:Code_of_Conduct/Draft#Finalize_new_vers…
.
A position and brief comment is fine.
You can also send private feedback to conduct-discussion(a)wikimedia.org .
I really appreciate your participation.
Thanks again,
Matt Flaschen
+research-l because this is more of a research than an analytics question.
Hi Jane,
What do you mean by acronyms in deletion queues here? Are you talking about
policy links used to justify !votes in deletion discussions, or acronyms
used in deletion comments of AfD'd articles? Or something else entirely?
If #1, this paper
<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.835.9466&rep=rep1&…>
examines the use of a single policy (IAR) in AfD's over time.
If #2, I did a similar (quick and dirty) analysis with AfC recently, here:
https://quarry.wmflabs.org/query/13341
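For a rough sense of what such a count can look like, here is a minimal sketch that tallies policy-style shortcuts (e.g. WP:N) in deletion comments; the comment strings and the regex are illustrative assumptions, not drawn from either analysis above:

```python
import re
from collections import Counter

# Made-up deletion comments, for illustration only.
comments = [
    "Delete per WP:GNG and WP:N",
    "Fails WP:N, no reliable sources found",
    "Speedy per WP:CSD#A7",
]

# Match policy shortcuts of the form WP:XYZ (letters, digits, '#').
SHORTCUT = re.compile(r"WP:[A-Z0-9#]+")

freq = Counter(m for c in comments for m in SHORTCUT.findall(c))
print(freq.most_common())  # WP:N appears twice, the rest once
```

A real analysis would pull the comment field from the logging table (as the Quarry query does) rather than from hard-coded strings.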
Others may be aware of additional resources or analyses.
Best,
Jonathan
On Tue, Nov 22, 2016 at 7:10 AM, Jane Darnell <jane023(a)gmail.com> wrote:
> Hi all,
> Has anyone tried to find the frequency of acronyms used in AfD queues? Any
> information about the deletion queue in language is welcome, thanks.
>
> This came up during a discussion about "enyclopedia worthiness" and how to
> explain this concept to newbies.
> Jane
>
> _______________________________________________
> Analytics mailing list
> Analytics(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>
--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
We're having some major technical issues with Hangout on Air, if you're
joining us for the showcase please use this link, we'll be starting
shortly: https://www.youtube.com/watch?v=xIaMuWA84bY
Dario
Dear all,
The Wikimedia Foundation datasets collection on the Internet Archive
[1] has now surpassed 1 million items (and about 50,000 full database
dumps)! This marks a major milestone in our archiving efforts of
Wikimedia's vast amount of data and ensures that the vital content
submitted by volunteers across the movement is preserved. All this
would not have been possible without the help of many people,
including Nemo, Ariel and Emijrp (thanks!).
We started archiving towards the end of 2011 and reached a milestone
of half a million items back in June 2015. [2] We have since moved on
from archiving just the main database dumps to saving research-worthy
data such as the pageviews data and even attempting to keep a copy of
Wikimedia Commons. Today, we are working on making the items on the
Internet Archive more accessible for researchers by working on an
interface for searching old dumps.
Despite this feat, we are in constant need of more help. If you are a
researcher, a programmer or someone with a computer, we need your help
in many tasks! Have a look at WikiTeam's project [3] or Emijrp's
Wikipedia Archive page [4] for more information. If you regularly work
on the Wikimedia database dumps, please provide your input in the
Dumps-Rewrite project [5] and the API interface [6].
As before, here's to the next million!
[1]: https://archive.org/details/wikimediadownloads
[2]: https://groups.google.com/forum/#!msg/wikiteam-discuss/Vj3oonpYphg/h9HE6r3v…
[3]: https://github.com/WikiTeam/wikiteam
[4]: https://en.wikipedia.org/wiki/User:Emijrp/Wikipedia_Archive
[5]: https://phabricator.wikimedia.org/tag/dumps-rewrite/
[6]: https://phabricator.wikimedia.org/T147177
--
Hydriz Scholz