To all Hive users on stat1002/1004: you may have seen a deprecation
warning when launching the hive client, saying that it is being replaced
by Beeline. The Beeline shell has always been available, but it
required supplying a database connection string every time, which was
pretty annoying. We now have a wrapper
set up to make this easier. The old Hive CLI will continue to exist, but we
encourage moving over to Beeline. You can use it by logging into the
stat1002/1004 boxes as usual and launching `beeline`.
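For anyone curious what the wrapper saves you from typing, here is a minimal sketch in Python of the kind of invocation Beeline otherwise needs. It is an illustration only, not the actual wrapper: the function name `build_beeline_command` and the host `hive-server.example.org` are made up, and the port and database are assumptions (10000 is the usual HiveServer2 default). The only Beeline feature relied on is the `-u` option, which takes a JDBC URL.

```python
def build_beeline_command(host, port=10000, database="default"):
    """Build the argument list for launching Beeline with an explicit JDBC URL.

    The host is a placeholder supplied by the caller; port 10000 and the
    "default" database are assumptions, not values from the announcement.
    """
    jdbc_url = f"jdbc:hive2://{host}:{port}/{database}"
    # Beeline's -u option takes the JDBC connection string.
    return ["beeline", "-u", jdbc_url]


if __name__ == "__main__":
    # Hypothetical host name, used only to show the shape of the command.
    print(" ".join(build_beeline_command("hive-server.example.org")))
```

With the wrapper in place, none of this should be needed on the stat boxes; plain `beeline` is enough.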
There is some documentation on this here:
If you run into any issues using this interface, please ping us on the
Analytics list or #wikimedia-analytics, or file a bug on Phabricator.
(If you are wondering "stat1004, whaaat?", there should be an announcement
coming up about it soon!)
Thanks for your detailed email. Agree on all the comments.
Some earlier comments might have been harsh, but I understand that there is
a valid reason behind them, as well as the dedication of the many people who
have helped Wikipedia reach where it is today.
We should have been more diligent in finding out policies and rules
(including IRB) before entering content on Wikipedia. We promise not to
repeat anything of this sort in the future, and I am also trying to
summarize all that has been discussed here to spare other researchers in
this area such unpleasant experiences.
WikiConference North America 2016
7-10 October 2016, San Diego, CA, USA
SUBMISSIONS DEADLINE: August 31, 11:59pm Samoa Time!
WikiConference North America (formerly WikiConference USA) is the third
annual conference on the North American continent devoted to Wikipedia and
other Wikimedia projects. The weekend will feature both academic and casual
presentations on Wikimedia-related outreach activities, workshops to
improve the skills of grassroots organizers, and discussions on the past,
present, and future of the Wikimedia projects. The conference features
offerings about community outreach, online activity, partnerships with
institutions of knowledge, and technology.
Keynote speakers are scheduled to include Katherine Maher, Executive
Director of the Wikimedia Foundation, and Merrilee Proffitt, Senior Program
Officer of OCLC Research. The last day of the conference will feature
programming coinciding with Indigenous Peoples' Day.
Registration for the conference is now open. You can register at
Scholarships partially covering costs of travel and attendance are
available for active contributors to Wikimedia projects. Apply by August
23rd for scholarships at https://wikiconference.org/wiki/2016/Scholarships.
This is a volunteer-run conference and volunteers are needed for any number
of tasks. If you are attending, please consider volunteering for at
We seek presentations addressing topics related to Wikipedia or open access
and culture. Presentations may be from any discipline regarding any
relevant topic. Please submit a description of your proposed presentation
using our online submission process at
https://wikiconference.org/wiki/Submissions. If you are interested in
participating in the
peer-reviewed academic track, see our call for academic submissions at
- Sydney Poore (User:FloNight) and Rosie Stephenson-Goodknight
(User:Rosiestep), conference organizers
Thinking big here: popular internationalized computer games can have 10+
million unit sales. Some of the most popular online games have millions of
monthly active users. I'm wondering if the research community, including
Design Research, can envision a way for Wikimedia to scale up from 80,000
active monthly users to 8,000,000 active monthly users.
What would we need in order to stimulate and nourish this kind of growth?
What can we learn from popular internationalized games about design that
could benefit Wikimedia on a large scale?
Hi research community,
I have been struggling to find a way to present the subject of Wikipedia's
quality to newbies for the LearnWiki video series.
On the one hand, I have heard of studies that compare Wikipedia favorably
to Britannica, and studies showing that medical students and licensed
professionals consult Wikipedia. On the other hand, we have lots of stub
articles, and in the August WMF Metrics and Activities meeting we heard
that some users are skeptical of the quality of an encyclopedia that anyone
can edit.
I am thinking that the quality of an article, as well as the quality of Wikipedia
as a whole (in varied language editions), could be measured in terms of
completeness, verifiability, and neutrality, assuming that measures of
those three dimensions are possible.
For completeness, I found
https://en.wikipedia.org/wiki/Wikipedia:Completeness, which has fascinating
if somewhat unhelpful descriptions of encyclodynamics and encyclostatics.
Another resource, which may be a bit more helpful, is
I believe that there have been studies about vandalism reversions, and the
accuracy or inaccuracy of Wikipedia science articles. I am wondering if
there have been any more holistic studies of Wikipedia completeness,
verifiability, and neutrality, either using sampling methods or using
I agree with pretty much all that Bob says here, except one important
point: This is probably correct for Wikipedia in English, and maybe a few
other very big languages.
A rarely remembered fact: most people don't know English.
In other languages there's much work to do in writing articles on math,
history, geography, medicine and what not (and dictionaries and textbooks
and public domain works), but many of the people who could potentially do it
fall into two categories:
1. People who know English and can read the English Wikipedia article and
don't notice that an article in their language is missing.
2. People who don't know English and can neither translate from the English
Wikipedia nor other English-only sources.
The upcoming Task List feature in Content Translation (
https://phabricator.wikimedia.org/T96147 ) will try to address this by
giving people a more convenient way to see the gaps and fill them, although
it will be only a technical tool, which cannot solve everything by itself.
As Bob notes, targeted outreach to experts will be needed as well.
On 28 Aug 2016 at 22:27, "Bob Kosovsky" <bobkosovsky(a)nypl.org> wrote:
I've been active with Wikipedia since 2006. My impression (which
corresponds with data) is that 2008 was the year with the highest number of
editors on English Wikipedia. While it may sound good on paper, in some
ways it was a mess because of the frequency of vandalism. Nowadays I know
there are more automated techniques for detecting vandalism, but if you
want to increase the number of users just to make the stats look good,
you're going to get more dubious data into the encyclopedia as well as
frustration from editors who dislike spending their time on so much
maintenance (although I'm sure there are some editors who would jump at the
chance to make corrections all day).
I suspected from the outset of Wikipedia's creation that the project would
mirror the well-known "life cycle of email lists" as I've always believed
Wikipedia is a "social encyclopedia." I feel this well-known meme
accurately reflects Wikipedia's evolution, so I repeat it here as a tool
from which to learn:
*1. Initial enthusiasm* (people introduce themselves, and gush a lot about
how wonderful it is to find kindred souls).
*2. Evangelism* (people moan about how few folks are posting to the list,
and brainstorm recruitment strategies).
*3. Growth* (more and more people join, more and more lengthy threads
develop, occasional off-topic threads pop up).
*4. Community* (lots of threads, some more relevant than others; lots of
information and advice is exchanged; experts help other experts as well as
less experienced colleagues; friendships develop; people tease each other;
newcomers are welcomed with generosity and patience; everyone -- newbie and
expert alike -- feels comfortable asking questions, suggesting answers, and
*5. Discomfort with diversity* (the number of messages increases
dramatically; not every thread is fascinating to every reader; people start
complaining about the signal-to-noise ratio; person 1 threatens to quit if
*other* people don't limit discussion to person 1's pet topic; person 2
agrees with person 1; person 3 tells 1 & 2 to lighten up; more bandwidth is
wasted complaining about off-topic threads than is used for the threads
themselves; everyone gets annoyed).
*6a. Smug complacency and stagnation* (the purists flame everyone who asks
an 'old' question or responds with humor to a serious post; newbies are
rebuffed; traffic drops to a doze-producing level of a few minor issues;
all interesting discussions happen by private email and are limited to a
few participants; the purists spend lots of time self-righteously
congratulating each other on keeping off-topic threads off the list).
*6b. Maturity* (a few people quit in a huff; the rest of the participants
stay near stage 4, with stage 5 popping up briefly every few weeks; many
people wear out their second or third 'delete' key, but the list lives
contentedly ever after).
I feel Wikipedia is at stage 6 (both a and b). Unless there's a significant
change in functionality and design, the days of 2008 will never return, and
we should stop bothering to think it's possible to replicate them (because
their existence was due to the novelty of the project).
Instead, I think Wikimedia projects should cultivate those individuals with
specialized knowledge. A lot of these people are in specialized
communities (for example educators, medical professionals,
researchers/scholars, devoted amateurs). These are communities which
formerly looked down on Wikipedia but now are reconsidering their formerly
negative opinions of the encyclopedia. I feel the as-yet small successes in
the medical and GLAM communities (I am sure there are others) show great
promise. Being part of the GLAM community, I know there are outreach
efforts underway to others within that community. Being part of WM NYC, I
know there are a lot of librarians involved in chapter activities--and most
of those activities take place in libraries or museums (often museum
Until this year, the WMF showed no real interest in continuous engagement
and dialogue with the community that edits the projects. I totally agree
with the person who said WMF needs to have a marketing department. This is
especially true for the kinds of research which marketers report on and which
are typical of any organization, for-profit or non-profit. That would be a
first step: understanding the variety of its users and editors, from which it
can then create action items to determine how to increase the number of
users by going after specific market segments. This would not eliminate
the "anyone can edit" ethos, but could be a more effective means of
increasing users than appealing to a broad public.
Bob Kosovsky, Ph.D. -- Curator, Rare Books and Manuscripts,
Music Division, The New York Public Library for the Performing Arts
blog: http://www.nypl.org/blog/author/44 Twitter: @kos2
Listowner: OPERA-L ; SMT-ANNOUNCE ; SoundForge-users
- My opinions do not necessarily represent those of my institutions -
*Inspiring Lifelong Learning* | *Advancing Knowledge* | *Strengthening Our
Wiki-research-l mailing list
I think that the following call might be of interest to some members of the
list. Please feel free to disseminate it (thanks!):
<http://swellrt.org/contest>
If you like coding, free software, and helping to build a decentralized Internet,
the SwellRT project invites you to participate in its
Free Software Contest
3000€, 2000€ and 1000€ prizes will be awarded to the best 3 projects
which use or improve SwellRT technology.
Find more info at:
<http://swellrt.org/>
SwellRT is a real-time decentralized storage platform
enabling real-time collaboration for Web applications.
with transparent conflict resolution (eventual consistency).
Changes are distributed in real-time to any user or app accessing the
SwellRT also provides out-of-the-box collaborative rich-text editing
for Web applications through an extensible text editor Web component and
SwellRT can be deployed as a decentralized network,
so shared objects can be stored and synced in different federated servers
in real time.
Samer | @sh3v3k <http://twitter.com/sh3v3k> | http://samer.hassan.name