Hi everybody,
as part of https://phabricator.wikimedia.org/T205846 we are going to ask
all stat1005 users to move to stat1007 during the next two weeks. The
deadline is November 14th, when ssh access to stat1005 will be
removed.
Background: stat1005 has a GPU (more details in
https://phabricator.wikimedia.org/T148843) that has been sitting there for
almost two years, and it would be great to try to make it work in the
coming months. This effort will require a lot of tests, reboots, etc. that
could disrupt your ongoing work, so we prefer to move everybody to another
identical machine beforehand.
Please reach out to me or to the Analytics team in T205846 or on IRC
(#wikimedia-analytics on Freenode) if you have any questions, doubts, or
blockers. Of course, we are not going to enforce the deadline if anybody
raises concerns or blockers. It would be great to move everybody by
Nov 14th, but we certainly don't want to disrupt any important ongoing
work.
I am going to update the Wikitech documentation about stat1005 and stat1007
as soon as possible. For the moment, keep in mind that stat1007 will
completely take over everything that stat1005 currently does.
I have already copied all the stat1005 directories over to stat1007, and
I'll periodically sync them over the coming days. If you can't find
something important on stat1007, please add a note in T205846.
Thanks a lot and sorry for the trouble,
Luca (on behalf of the Analytics team)
Hi all,
as part of a larger project, we are running a small think-aloud study to
better understand how editors use current interface features and tools to
identify "suspicious" edits (either suspected vandalism or red flags for
bias, that kind of thing).
Just posting for comment at this point before we submit our IRB
documentation. Also happy to hear about existing research we may have
missed!
https://meta.wikimedia.org/wiki/Research:Understanding_Content_Review_Pract…
Andrea
--
:: Andrea Forte
:: Associate Professor
:: College of Computing and Informatics, Drexel University
:: http://www.andreaforte.net
Hi everyone,
We’re preparing for the October 2018 research newsletter and looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201810 and add your name next to any paper you are interested in covering. Our target publication date is October 28 UTC, although actual publication might happen several days later. As usual, short notes and one-paragraph reviews are most welcome.
Highlights from this month:
- Deliberation and Resolution on Wikipedia: A Case Study of Requests for Comments
- Indigenizing Wikipedia: Student Accountability to Native American Authors on the World’s Largest Encyclopedia
- Population preferences through Wikipedia edits
- Schema Inference on Wikidata
- Studying the Effect of Network Position on Efficiency: A Case of Affiliation Network Featured Article Promotion
- Volunteer Retention, Burnout and Dropout in Online Voluntary Organizations: Stress, Conflict and Retirement of Wikipedians
- 'Welcome' Changes? Descriptive and Injunctive Norms in a Wikipedia Sub-Community
- Wikidata: A New Paradigm of Human-Bot Collaboration?
- World Influence of Infectious Diseases from Wikipedia Network Analysis
Masssly, Tilman Bayer and Dario Taraborelli
[1] http://meta.wikimedia.org/wiki/Research:Newsletter
Hi all. Does anybody know of any studies about the mental health of
participants in collective, horizontal collaboration environments?
Thank you,
Juliana
--
www.domusaurea.org
I would like to share, for discussion, some knowledge representation ideas with respect to a URL-addressable predicate calculus.
In the following examples, we can use the prefix “mw” for “https://machine.wikipedia.org/” as per xmlns:mw="https://machine.wikipedia.org/".
mw:P1
→ https://machine.wikipedia.org/P1
mw:P1(arg0, arg1, arg2)
→ https://machine.wikipedia.org/P1?A0=arg0&A1=arg1&A2=arg2
mw:P2
→ https://machine.wikipedia.org/P2
mw:P2<t0, t1, t2>
→ https://machine.wikipedia.org/P2?T0=t0&T1=t1&T2=t2
mw:P2<t0, t1, t2>(arg0, arg1, arg2)
→ https://machine.wikipedia.org/P2?T0=t0&T1=t1&T2=t2&A0=arg0&A1=arg1&A2=arg2
Some points:
1. There is a mapping between each predicate calculus expression and a URL.
2. Navigating to a mapped-to URL triggers processing on servers, e.g. by PHP scripts, which generates outputs.
3. The outputs vary per the content types requested via HTTP request headers.
4. The outputs may also vary per the languages requested via HTTP request headers.
5. Navigating to https://machine.wikipedia.org/P1 generates a definition for a predicate.
6. Navigating to https://machine.wikipedia.org/P2?T0=t0&T1=t1&T2=t2 generates a definition for a predicate after assigning values to the parameters T0, T1, T2. That is, a definition of a predicate is generated by a script, e.g. a PHP script, which may vary its output based on the values for T0, T1, T2.
7. The possible values for T0, T1, T2, A0, A1, A2 may be drawn from the same set. T0, T1, T2 need not be constrained to be types from a type system.
8. The values for T0, T1, T2, A0, A1, A2, that is t0, t1, t2, arg0, arg1, arg2, could also each resolve to URLs.
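
The mapping in point 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names predicate_url and parse_predicate_url are hypothetical, the T0..Tn and A0..An parameter naming is taken from the examples above, and percent-encoding of values is an assumption added so that values which are themselves URLs (point 8) survive the round trip.

```python
from urllib.parse import urlencode, urlsplit, parse_qsl

# The "mw" prefix used in the examples above.
BASE = "https://machine.wikipedia.org/"


def predicate_url(name, type_params=(), args=()):
    """Map mw:name<type_params>(args) to a URL (hypothetical helper)."""
    # T0, T1, ... carry the angle-bracket parameters; A0, A1, ... the arguments.
    query = [(f"T{i}", t) for i, t in enumerate(type_params)]
    query += [(f"A{i}", a) for i, a in enumerate(args)]
    url = BASE + name
    # urlencode percent-encodes values, so a value may itself be a URL.
    return url + "?" + urlencode(query) if query else url


def parse_predicate_url(url):
    """Inverse mapping: recover (name, type_params, args) from a URL."""
    parts = urlsplit(url)
    name = parts.path.lstrip("/")
    pairs = parse_qsl(parts.query)  # decodes values and preserves order
    type_params = tuple(v for k, v in pairs if k.startswith("T"))
    args = tuple(v for k, v in pairs if k.startswith("A"))
    return name, type_params, args


if __name__ == "__main__":
    print(predicate_url("P1"))
    print(predicate_url("P1", args=("arg0", "arg1", "arg2")))
    print(predicate_url("P2", ("t0", "t1", "t2"), ("arg0", "arg1", "arg2")))
```

The sketch covers only the expression-to-URL mapping. The server-side behaviour in points 2 to 6, including variation by content type and language, is not modelled here.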
Best regards,
Adam Sobieski
http://www.phoster.com/contents/
Hello everyone,
The next Research Showcase will be live-streamed this Wednesday, October
17, 2018, at 11:30 AM PDT (18:30 UTC).
YouTube stream: https://www.youtube.com/watch?v=UJrJLWuNvXo
As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
This month's presentation:
*"Welcome" Changes? Descriptive and Injunctive Norms in a Wikipedia
Sub-Community*
*By Jonathan T. Morgan, Wikimedia Foundation and Anna Filippova, GitHub*
Open online communities rely on social norms for behavior regulation, group
cohesion, and sustainability. Research on the role of social norms online
has mainly focused on one source of influence at a time, making it
difficult to separate different normative influences and understand their
interactions. In this study, we use the Focus Theory to examine
interactions between several sources of normative influence in a Wikipedia
sub-community: local descriptive norms, local injunctive norms, and norms
imported from similar sub-communities. We find that exposure to injunctive
norms has a stronger effect than descriptive norms, that the likelihood of
performing a behavior is higher when both injunctive and descriptive norms
are congruent, and that conflicting social norms may negatively impact
pro-normative behavior. We contextualize these findings through member
interviews, and discuss their implications for both future research on
normative influence in online groups and the design of systems that support
open collaboration.
*The pipeline of online participation inequalities: The case of Wikipedia
Editing*
*By Aaron Shaw, Northwestern University and Eszter Hargittai, University of
Zurich*
Participatory platforms like the Wikimedia projects have unique potential
to facilitate more equitable knowledge production. However, digital
inequalities such as the Wikipedia gender gap undermine this democratizing
potential. In this talk, I present new research in which Eszter Hargittai
and I conceptualize a "pipeline" of online participation and model distinct
levels of awareness and behaviors necessary to become a contributor to the
participatory web. We test the theory in the case of Wikipedia editing,
using new survey data from a diverse, national sample of adult internet
users in the U.S.
The results show that Wikipedia participation consistently reflects
inequalities of education and internet experiences and skills. We find that
the gender gap only emerges later in the pipeline whereas gaps along racial
and socioeconomic lines explain variations earlier in the pipeline. Our
findings underscore the multidimensionality of digital inequalities and
suggest new pathways toward closing knowledge gaps by highlighting the
importance of education and Internet skills.
We conclude that future research and interventions to overcome digital
participation gaps should not focus exclusively on gender or class
differences in content creation, but expand to address multiple aspects of
digital inequality across pipelines of participation. In particular, when
it comes to overcoming gender gaps in the case of Wikipedia, our results
suggest that continued emphasis on recruiting female editors should include
efforts to disseminate the knowledge that Wikipedia can be edited. Our
findings support broader efforts to overcome knowledge- and skill-based
barriers to entry among potential contributors to the open web.
--
Janna Layton
Administrative Assistant - Audiences & Technology
Wikimedia Foundation
1 Montgomery St. Suite 1600
San Francisco, CA 94104
Forwarding a request for input.
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
---------- Forwarded message ---------
From: Joe Sutherland <jsutherland(a)wikimedia.org>
Date: Fri, Oct 5, 2018 at 9:29 PM
Subject: [Analytics] Community health metrics kit: Input needed!
To: A mailing list for the Analytics Team at WMF and everybody who has an
interest in Wikipedia and analytics. <analytics(a)lists.wikimedia.org>
Hello everyone - apologies for cross-posting! *TL;DR*: We would like your
feedback on our Metrics Kit project. Please have a look and comment on
Meta-Wiki:
https://meta.wikimedia.org/wiki/Community_health_initiative/Metrics_kit
The Wikimedia Foundation's Trust and Safety team, in collaboration with the
Community Health Initiative, is working on a Metrics Kit designed to
measure the relative "health"[1] of various communities that make up the
Wikimedia movement:
https://meta.wikimedia.org/wiki/Community_health_initiative/Metrics_kit
The ultimate outcome will be a public suite of statistics and data looking
at various aspects of Wikimedia project communities. It could be used both
by community members to make decisions on their community direction and by
Wikimedia Foundation staff to point anti-harassment tool development in the
right direction.
We have a set of metrics we are thinking about including in the kit,
ranging from the ratio of active users to active administrators and
administrator confidence levels to off-wiki factors such as freedom to
participate. It's ambitious, and our methods of collecting such data will
vary.
Right now, we'd like to know:
* Which metrics make sense to collect? Which don't? What are we missing?
* Where would such a tool ideally be hosted? Where would you normally look
for statistics like these?
* We are aware of the overlap in scope between this and Wikistats <
https://stats.wikimedia.org/v2/#/all-projects> — how might these tools
coexist?
Your opinions will help to guide this project going forward. We'll be
reaching out at different stages of this project, so if you're interested
in direct messaging going forward, please feel free to indicate your
interest by signing up on the consultation page.
Looking forward to reading your thoughts.
best,
Joe
P.S.: Please feel free to CC me in conversations that might happen on this
list!
[1] What do we mean by "health"? There is no standard definition of what
makes a Wikimedia community "healthy", but there are many indicators that
highlight where a wiki is doing well, and where it could improve. This
project aims to provide a variety of useful data points that will inform
community decisions that will benefit from objective data.
--
*Joe Sutherland* (he/him or they/them)
Trust and Safety Specialist
Wikimedia Foundation
joesutherland.rocks
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
Forwarding some good news.
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
---------- Forwarded message ---------
From: Markus Kroetzsch <markus.kroetzsch(a)tu-dresden.de>
Date: Sat, Oct 13, 2018 at 12:30 AM
Subject: [Wikidata] Passing on praise/ISWC trip report
To: Discussion list for the Wikidata project. <wikidata(a)lists.wikimedia.org>
Dear all,
I am happy to report that we have just won the Best Paper Award of the
In-Use track of this year's International Semantic Web Conference
(ISWC), for our description of the SPARQL/RDF technology use on Wikidata
[1]. I keep telling people here that the general awesomeness of Wikidata
is the work of many, and in particular of this great community of editors.
Overall, this year's ISWC here in Monterey, CA has surprised Denny and me
with the huge uptake that Wikidata now gets in industry and academia
alike, which is a huge breakthrough over last year. An amazing array of
people are doing great work based on this data, and again I would like
to pass on all the thank you's I have heard over this week to all of you
working hard to make this happen. Users range from individual students
to major tech companies, and I hope there will be many contributions
flowing back to us through these stakeholders. We also have seen an
increasing amount of research being done using Wikidata for evaluation
and testing, again both in published works and in conversations with
people in small and big organisations.
Let me also congratulate Fariz Darari, who is not a stranger to this
list either, on receiving the Best Dissertation Award of the Semantic
Web Science Association for the research that is also behind some of the
tools he has been creating for Wikidata.
I gave another talk related to Wikidata's ontological modeling, which I
hope did not represent the situation all too wrongly [2] ;-). There will
be videos of this and the best paper presentation, and most other ISWC
talks on VideoLectures in the not-too-distant future.
Finally, since our best paper is about the use of BlazeGraph as a
platform for queries, let me also mention that we have had a number of
productive meetings here to discuss the future of this great software
(you may know that there were some organisational changes to the team
developing this so far). There will be opportunity to contribute to this
open source project, either as a developer or in other ways, in the
future. Stay tuned for more information on this.
So, thanks again to everyone working towards the great success of this
project -- amazing work!
Greetings from Asilomar
Markus
[1] Stanislav Malyshev, Markus Krötzsch, Larry González, Julius Gonsior,
Adrian Bielefeldt: "Getting the Most out of Wikidata: Semantic
Technology Usage in Wikipedia’s Knowledge Graph"
Talk slides and paper online:
https://iccl.inf.tu-dresden.de/web/Inproceedings3044/en
[2] Markus Krötzsch
Ontological Modelling in Wikidata
Invited keynote at the 9th Workshop on Ontology Design and Patterns (WOP'18)
Slides: https://iccl.inf.tu-dresden.de/web/Misc3058/en
--
Prof. Dr. Markus Kroetzsch
Knowledge-Based Systems Group
Center for Advancing Electronics Dresden (cfaed)
Faculty of Computer Science
TU Dresden
+49 351 463 38486
https://kbs.inf.tu-dresden.de/
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
Hi everybody,
the Analytics team is going to move the Oozie and Hive daemons from the
analytics1003 host to an-coord1001 (new host, hardware refresh) on Tuesday
Oct 9th at 10 AM CEST. This will require downtime for Oozie and Hive, so
some jobs might fail or not work at all during the maintenance. We have
allocated two hours for this procedure but it should require less time.
Tracking task: T205509
As always, please follow up with me or anybody in the analytics team for
clarifications and/or comments (via Phabricator or IRC Freenode
#wikimedia-analytics).
Thanks for your patience!
Luca (on behalf of the Analytics team)
Hi everyone,
I'm excited to share that our annual survey about Wikimedia communities is
now published!
This survey included 170 questions and reached over 4,000 community
members across four audiences: Contributors, Affiliate Organizers, Program
Organizers, and Volunteer Developers. This survey helps us hear about the
experiences of Wikimedians from across the movement so that teams are able
to use community feedback in their planning and their work. This survey
also helps us learn about long-term changes in communities, such as
community health or demographics.
The report is available on meta:
https://meta.wikimedia.org/wiki/Community_Engagement_Insights/2018_Report
For this survey, we worked with 11 teams to develop the questions. Once the
results were analyzed, we spent time with each team to help them understand
their results. Most teams have already identified how they will use the
results to help improve their work to support you.
The report could be useful for your work in the Wikimedia movement as well!
What are you learning from the data? Take some time to read the report and
share your feedback on the talk pages. We have also published a blog post
that you can read.[1]
We are hosting a livestream presentation[2] on September 20 at 1600 UTC.
Hope to see you there!
Feel free to email me directly with any questions.
All the best,
Edward
[1]
https://wikimediafoundation.org/2018/09/13/what-we-learned-surveying-4000-c…
[2] https://www.youtube.com/watch?v=qGQtWFP9Cjc
--
Edward Galvez
Evaluation Strategist, Surveys
Learning & Evaluation
Community Engagement
Wikimedia Foundation