[Wikimedia-l] [Wikimedia Announcements] Wikimedia engineering report, November 2013

Guillaume Paumier gpaumier at wikimedia.org
Mon Dec 9 15:39:56 UTC 2013


The report covering Wikimedia engineering activities in November 2013 is
now available.

Wiki version:
Blog version:

We're also proposing a shorter, simpler and translatable version of this
report that does not assume specialized technical knowledge:

Below is the HTML text of the report.

As always, feedback is appreciated on the usefulness of the report and its
summary, and on how to improve them.


Major news in November includes:

   - Beta Features<https://blog.wikimedia.org/2013/11/07/introducing-beta-features/>,
   a new way for users to try out new features on Wikipedia and other
   Wikimedia sites before they are released for everyone;
   - The launch of our search for a VP of Engineering;
   - A retrospective by the Mobile engineering team on best practices for
   working as a distributed team;
   - The activation of OAuth on Wikimedia wikis,
   which allows users to authorize third-party applications to take actions on
   their behalf without sharing their password;
   - A presentation of the "Wikidata concept

   - A retrospective on the ability to add musical scores to
   pages on Wikimedia sites.

*Note: We're also providing a shorter, simpler and translatable version of
this report
that does not assume specialized technical knowledge.*
Personnel

Work with us <https://wikimediafoundation.org/wiki/Work_with_us>

Are you looking to work for Wikimedia? We have a lot of hiring coming up,
and we really love talking to active community members about these roles.

   - VP of Engineering <http://hire.jobvite.com/Jobvite/Job.aspx?j=ods8Xfwu>
   - Software Engineer -
   - Software Engineer - Core
   - Software Engineer - VisualEditor
   - Software Engineer - Language
   - Software Engineer <http://hire.jobvite.com/Jobvite/Job.aspx?j=o09WXfwM>
   - Senior Software Engineer - Team
   - QA Automation
   - Software Engineer Data Analytics (Back
   - Dev-Ops Engineer -
   - Graphic Design Interns -


   - Jeff Hall joined the Platform engineering group as a member of the QA
   team (announcement<http://lists.wikimedia.org/pipermail/qa/2013-November/000686.html>).

   - Aaron Arcos joined the Platform engineering group as a volunteer
   developer working with the Multimedia team.

   - Dario Taraborelli was promoted to the position of Senior Research
   Scientist, Research and Data team lead.

   - Aaron Halfaker was promoted to the position of Research Scientist.

   - Moiz Syed joined the User Experience team as User Experience Designer.

Technical Operations

*Wikimedia Labs <https://www.mediawiki.org/wiki/Wikimedia_Labs>*
A new dynamic proxy system has been deployed on Labs; it allows the admin
of any project to arrange for public web access and a dedicated DNS
hostname for a project instance without requesting an IP address. Labs
staff and volunteers will now be reclaiming quite a few IPs as existing
projects migrate to a dynamic proxy setup. The WMF has hired a short-term
contractor, Mike Hoover, to assist with the migration of Labs
infrastructure from Tampa to our new datacenter in Ashburn. Mike has spent
a lot of time exploring the existing infrastructure and running test
setups; soon he will start to configure the new OpenStack nodes in
production. Andrew Bogott has been working on cleaning up stale and unused
resources. He's working on some automatic documentation that will help
users track the status of their projects and instances with an eye towards
predicting the impact of the coming migration. Labs suffered two outages:
a brief, self-inflicted network failure, and a longer outage during which
one of the virtualization hosts failed. Both outages were swiftly resolved,
but recovery lagged because some tools and services failed to come back
properly afterwards, owing to poor distribution of virtual servers (among
other things, both the grid master and the shadow [backup] master were on
the same server). Preparation for the move to the Ashburn data
center is well in progress, with the new storage server being physically
configured this week as well as the new hardware servers for user databases
(including the new PostgreSQL instance intended for OpenStreetMap).

Features Engineering <https://www.mediawiki.org/wiki/Wikimedia_Features_engineering>
retention: Editing tools

*VisualEditor <https://www.mediawiki.org/wiki/VisualEditor>*
In November, the VisualEditor team continued to improve the stability and
performance of the system, and add new features. The deployed version of
the code was updated three times (including
1.23-wmf4 <https://www.mediawiki.org/wiki/MediaWiki_1.23/wmf4#VisualEditor> and
1.23-wmf5 <https://www.mediawiki.org/wiki/MediaWiki_1.23/wmf5#VisualEditor>).
Most of the team's focus was on fixing bugs, and on some major
infrastructure changes, splitting out the
OOJS-UI <https://www.mediawiki.org/wiki/Ooui> libraries from VisualEditor
to make them available to other teams. Much of the team travelled to the Open
Source Language Summit in
Pune, India to learn more about how to improve VisualEditor for a
variety of languages, scripts, users and systems. Two new members of the QA
team <https://www.mediawiki.org/wiki/Quality_Assurance/Browser_testing> joined
in to help improve VisualEditor – Jeff
Hall <https://www.mediawiki.org/wiki/User:JHall_%28WMF%29> and Rummana
Yasmeen <https://www.mediawiki.org/wiki/User:RYasmeen_%28WMF%29>, and
thanks to them, the automated browser tests have expanded in breadth and
depth of coverage. Work continued on major new features like full rich
copy-and-paste from external sources, a dialog for quickly adding citation
templated references, and a tool to insert characters not available on
users' keyboards. The editor was made available by default on just over 100
additional Wikipedias as part of the continuing roll-out. VisualEditor was
also enabled for opt-in testing on Swedish Wiktionary and Wikimedia
Sweden's wiki, the first time it has been available on a non-Wikipedia
production wiki.

*Parsoid <https://www.mediawiki.org/wiki/Parsoid>*
November saw the deployment of major changes to the DOM spec in
coordination with the VisualEditor team.
Link types are now marked up by semantics rather than syntax, interwiki
links are detected automatically, categories are marked as page properties and
During the deployment, we found that the newer libraries used by the web
service front-end were buggy. We reverted the library upgrade and
contributed fixes upstream. This incident prompted us to work on tests for
the HTTP web service to catch issues like this in continuous integration.

After these issues were sorted out, we continued with continuous
improvement and fixes. Editing support for magic words and categories was
improved, several dirty diff issues were fixed and the API was refined for
page-independent wt2html and html2wt conversion. See our deployment
page <https://www.mediawiki.org/wiki/Parsoid/Deployments> for details.

Cassandra load testing <https://www.mediawiki.org/wiki/User:GWicke/Notes/Storage/Cassandra_testing> for
the Rashomon storage service continued and uncovered several issues
that were reported back upstream. With Cassandra 2.0.3 the 2.0 branch is
now stabilizing in time to make deployment in December feasible. Cassandra
is now stable at extremely high write loads of around 900 revisions per
second, which is more than 10 times the load we experience in production.

Core Features

*Notifications <https://www.mediawiki.org/wiki/Echo_%28Notifications%29>*
In November, we deployed Notifications on the German and Italian Wikipedias,
completing our worldwide release of this tool. Fabrice Florin, Denis
Barthel, Jan Eissfeldt, Erica Litrenta and Keegan Peterzell managed the
community outreach for these final releases, while Benny Situ oversaw the
technical deployments. Community response to Notifications has been
generally favorable on all wikis. While feature development has now ended
for this project, we expect new notifications and features to be developed
by other teams in coming months. To learn more, visit our project page,
read the help page <https://www.mediawiki.org/wiki/Help:Notifications> and
join the discussion on the talk page.

*Flow <https://www.mediawiki.org/wiki/Flow_Portal/Project_information>*
This month, the Flow team finished out the feature set for our minimum
viable product <https://en.wikipedia.org/wiki/Minimum_viable_product>. We
added watchlist integration, the ability to see board, topic, and post
histories, and did a first round of community feedback and testing with our
product to date. We also prepared for release to production wikis in
December by working on Operations and Security needs.

*Growth <https://www.mediawiki.org/wiki/Growth>*
In November, the Growth <https://www.mediawiki.org/wiki/Growth> team
primarily worked on refactoring the
GettingStarted <https://www.mediawiki.org/wiki/Extension:GettingStarted> extensions,
including development of an API for the latter. This public API
will be used by the Growth team, the Mobile team and others to deliver
editing tasks to users across a variety of Wikipedia interfaces.

The team also spent significant time on the research and design
preparations for its anonymous editor
article creation
This included participating in a community Request for Comment
about a potential Draft
namespace <https://www.mediawiki.org/wiki/Draft_namespace> for articles,
requirements gathering, and working on a Draft namespace
Matthew Flaschen and Pau Giner attended the Wikimedia Diversity
with Jared Zimmerman and Vibha Bamba) on how diversity related to
the team's engineering and product work.

*Wikipedia Education Program*
This month, we improved a feature that was built in October (allowing
instructors to assign articles to student editors), completed a new feature
(allowing instructors to add users as students) and started another one
(displaying information about student editors' courses on
Special:Contributions). We fixed some bugs, and kept up with changes in
MediaWiki core. We also continued preliminary work, started last
month, towards renewing the UX and broadening the extension's scope.

Mobile <https://www.mediawiki.org/wiki/Wikimedia_Mobile_engineering>

*Wikipedia Zero <https://www.mediawiki.org/wiki/Wikipedia_Zero>*
During the last month, the team monitored the rollout of Wikipedia Zero via
text (USSD/SMS) in partnership with Airtel Kenya and Praekelt for the first
pilot of the program. Additionally, Yuri Astrakhan promoted the program
abroad. The team also prepared code and configuration for approval,
finalized IP addresses for zero-rating and deployed bugfixes for the
Wikipedia app for Firefox OS. We added support for simpler JSON in
configuration files, enhanced performance and redirect features and
constrained ZeroRatedMobileAccess extension loading to guard against
repeats of last month's configuration bug.

*Mobile web projects <https://www.mediawiki.org/wiki/Mobile_web_projects>*
The Onboarding A/B test resulted in an Edit Guider, which is now available.
The overlay UI overhaul currently in beta is planned to become available on
the main site. User profiles are also being tested in beta.

Language Engineering <https://www.mediawiki.org/wiki/Wikimedia_Language_engineering>
highlights for this past month include a very successful Open Source
Language Summit <https://www.mediawiki.org/wiki/Language_portal/Pune_LanguageSummit_November_2013/Event_Report> in
Pune, India co-organized with Red Hat. More than 60 developers joined
to collaborate and work together on improving language support for
Wikipedia on the web and mobile. Work sprints on integration of input
methods in VisualEditor, Indic Fontbook specification, mobile input methods
and content translation were held. The team also fixed several issues
related to performance and saving preferences for the Universal
Language Selector (ULS), and deployed the fixes. Other completed tasks
include creating a class for interlanguage links so that the Autonym font
is used only for autonym items. The team also collated documentation about
the initial inclusion requests for each web font served through ULS; this
is also documented in the font.ini file of each font in the repository.

Platform Engineering <https://www.mediawiki.org/wiki/Wikimedia_Platform_Engineering>

*DevOps Sprint 2013 <https://www.mediawiki.org/wiki/DevOps_Sprint_2013>*
The DevOps sprint participants focused their efforts on monitoring-related
work, specifically getting Logstash into production and
puppetizing/migrating Graphite (both still in progress). Cache-related
fixes were made to keep users from seeing outdated versions of pages when
using non-canonical URL forms. A fix was also made to the Commons upload
process so that all articles that use the affected file page are updated,
as users would expect.

*Search <https://www.mediawiki.org/wiki/Search>*
Before November 18, we were spinning up an aggressive plan to add many new
wikis to CirrusSearch. On November 18, we had multiple incidents that
caused us to roll all wikis using CirrusSearch back to Lucene; we've spent
the rest of November implementing fixes for all issues discovered on the
18th. That is now done and we plan to switch all wikis that used to have
CirrusSearch back to running it as a secondary search engine on December 2.
We'll attempt to restart our aggressive plan as soon as we're comfortable
with it again.

*Site performance and architecture*
We ran a controlled experiment to test the impact of module storage on
performance. We expect to publish our findings within a week. We puppetized
Graphite and MediaWiki's profiling log aggregator and migrated them to our
Ashburn data center. Finally, we started working on a replacement profiling
log aggregator that will process and visualize profiling data from both
client-side and server-side code.

*Auth systems <https://www.mediawiki.org/wiki/Auth_systems>*
Our preliminary version of OAuth is now live on all Wikimedia wikis. Since
the rollout, five OAuth consumers have been accepted. We're hopeful many
more consumers will be proposed.
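
The report does not describe the protocol details. As a hedged illustration of how an application can act on a user's behalf without ever seeing a password, here is a minimal sketch of request signing in the style of OAuth 1.0a (RFC 5849, HMAC-SHA1). That this flavour applies here is an assumption, and every key, token, secret and URL below is hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode a value, leaving only RFC 3986 unreserved characters."""
    return quote(str(value), safe="-._~")


def signature_base_string(method, base_url, params):
    """Build the string to sign: METHOD & encoded URL & encoded sorted params."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), pct(base_url), pct(normalized)])


def sign(method, base_url, params, consumer_secret, token_secret=""):
    """Return the base64-encoded HMAC-SHA1 signature for a request."""
    key = f"{pct(consumer_secret)}&{pct(token_secret)}".encode("ascii")
    text = signature_base_string(method, base_url, params).encode("ascii")
    digest = hmac.new(key, text, hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical request: the application proves it holds the secrets issued
# to it, while the user's password never appears anywhere in the exchange.
example_params = {
    "oauth_consumer_key": "hypothetical-consumer-key",
    "oauth_token": "hypothetical-access-token",
    "oauth_nonce": "abc123",
    "oauth_timestamp": "1386603596",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_version": "1.0",
}
example_signature = sign(
    "GET",
    "https://example.org/w/index.php",
    example_params,
    "hypothetical-consumer-secret",
    "hypothetical-token-secret",
)
```

The server recomputes the same signature from the secrets it issued; a mismatch means the request is rejected, and the user can revoke the token without changing their password.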

*Wikimania Scholarships app*
Work is progressing towards a planned launch of the application on
2013-12-19. The source code has been imported into an internal git
repository and is now being managed via gerrit. A bugzilla component has
been created under the Wikimedia product to track defects and feature
requests. Several changesets are in review to complete the basic
functionality of the application and prepare for an internal security review.

*Security auditing and response*
We released a security update to MediaWiki to fix a number of issues in
core and extensions. Security reviews of Limn, GWTools and Flow extensions
are in progress.

*Admin tools development*
This activity is still officially on hold. However, progress on the global
rename user tool continued, as well as implementing global

Quality assurance

*Quality Assurance <https://www.mediawiki.org/wiki/Quality_Assurance>*
November saw significant improvements to the QA
mediawiki.org contributed by both staff and volunteers. Participants
in the Google
Code-in <https://www.mediawiki.org/wiki/Google_Code-in> program made even
more contributions, to both documentation and browser test code. The QA
team welcomed new staff members Rummana Yasmeen and Jeff Hall, who made
immediate contributions to the VisualEditor project and to the browser test

*Beta cluster <https://www.mediawiki.org/wiki/Beta_cluster>*
In November, the Beta cluster saw greatly improved support for testing
Parsoid, the parsing engine behind VisualEditor. The Beta cluster also
continues to provide a real-world simulation for the Flow project in
advance of Flow's limited release scheduled for December. Beta continues to
be the main test environment for MobileFrontend, CirrusSearch, and many
other Wikimedia software projects.

*Browser testing*
In November, we added significant browser test coverage for the Flow
project, and the addition of Jeff Hall to WMF staff brought a focus to
testing VisualEditor. Browser tests now reside in ten different
repositories across WMF projects. November saw increased browser test
coverage for the Language, VisualEditor, and Flow projects, among others.
The diversity of browser tests in project repositories has been a force
behind great improvements in infrastructure, with code shared among the
projects now residing in the repository at mediawiki/selenium.

Engineering Community

In November, the Engineering community team held their second monthly
as well as their quarterly
the July–September period.

*Bug management <https://www.mediawiki.org/wiki/Bug_management>*
Andre Klapper and Quim Gil prepared and organized Wikimedia's participation
in Google Code-In <https://www.mediawiki.org/wiki/Google_Code-in>. This
includes supporting mentors and students by writing documentation and
importing tasks. On the code side, Andre cleaned up Wikimedia Bugzilla's custom
CSS by removing 16 CSS files <https://bugzilla.wikimedia.org/show_bug.cgi?id=54823> with 6
stay, prepared and tested patches for upgrading Wikimedia Bugzilla
<https://bugzilla.wikimedia.org/show_bug.cgi?id=49597#c5> from version
4.2 to 4.4, updated the Greasemonkey
stock answers to ping assignees), and sync'ed the "WeeklyReport"
Bugzilla extension code <https://gerrit.wikimedia.org/r/#/c/96479/> with
upstream. WMF's Operations team installed new SSL
bugzilla.wikimedia.org. The "shellpolicy" keyword in Bugzilla was
"community-consensus-needed" and the "wikidata" keyword was
removed <https://bugzilla.wikimedia.org/show_bug.cgi?id=56417>.
Furthermore, Andre created a draft for a Bugzilla

*Mentorship programs <https://www.mediawiki.org/wiki/Mentorship_programs>*
We successfully started Wikimedia's first participation in Google Code-in.
Six candidates were selected as new interns at the FOSS Outreach Program
for Women - Round

   - Anu G Enchackal <https://www.mediawiki.org/wiki/User:Inchikutty> -
   UploadWizard: OSM Map Embedding
   <https://www.mediawiki.org/wiki/User:Inchikutty/UploadWizard_OSM_Map_Embedding>
   (mentored by Gergő Tisza <https://www.mediawiki.org/wiki/User:Tgr>)
   - Diwanshi Pandey <https://www.mediawiki.org/wiki/User:Diwanshipandey> -
   Complete the MediaWiki development course at
   Astrakhan <https://www.mediawiki.org/wiki/User:Yurik>)
   - Brena Monteiro <https://www.mediawiki.org/wiki/User:Monteirobrena> -
   mediawiki.org homepage redesign
   Walls <https://www.mediawiki.org/wiki/User:Heatherawalls> and Quim
   - Be Birchall <https://www.mediawiki.org/wiki/User:5xbe> - Clean up
   Parsoid round-trip testing UI, including using a templating
   Ordinas <https://www.mediawiki.org/wiki/User:Marcoil> and Subramanya
   Sastry <https://www.mediawiki.org/wiki/User:Ssastry>)
   - Maria Pacana <https://www.mediawiki.org/wiki/User:Mariapacana> - Clean
   up tracing/debugging/logging inside
   Sastry <https://www.mediawiki.org/wiki/User:Ssastry> and Arlo
   - Niharika Kohli <https://www.mediawiki.org/wiki/User:Niharika> - Compact
   interlanguage links as a beta
   Ghoshal and Pau Giner)

We also confirmed the participation of Wikimedia in the Facebook Open
Academy <https://www.mediawiki.org/wiki/Facebook_Open_Academy> program.

*Technical communications*
In November, Guillaume Paumier <https://www.mediawiki.org/wiki/User:Guillom>'s
primary focus was on preparing for the Google
Code-in <https://www.mediawiki.org/wiki/Google_Code-in> program, and
mentoring students once the program started. In 2 weeks, 18
on writing discovery
reports <https://www.mediawiki.org/wiki/Category:Discovery_reports> (candid
essays from the perspective of newcomers to the Wikimedia technical
community); among them, seven completed their task successfully. Guillaume
also assembled and published the weekly technical
newsletter <https://meta.wikimedia.org/wiki/Tech/News> and provided
ongoing communications
the engineering staff.

*Volunteer coordination and outreach*
Erik Moeller's talk "The Wikipedia stack" was accepted for the main track
session at FOSDEM <https://www.mediawiki.org/wiki/Events/FOSDEM>. The call
for proposals for the Wikis devroom at
FOSDEM <https://www.mediawiki.org/wiki/Events/FOSDEM> was extended until
December 15. Wikimedia applied for a stand. A Request
for Proposals for a technical writer
was also sent. Lastly, we helped establish a routine around Architecture
meetings <https://www.mediawiki.org/wiki/Architecture_meetings>.

*Multimedia <https://www.mediawiki.org/wiki/Multimedia>*
In November, Mark Holmquist and Gergő Tisza developed a second beta version
of the Media Viewer<https://www.mediawiki.org/wiki/Multimedia/About_Media_Viewer>,
based on new designs by Pau Giner. For a more immersive experience, this next
larger images, as
shown in the demo <https://www.mediawiki.org/wiki/Lightbox_demo>.

We also released Beta
Features <https://www.mediawiki.org/wiki/About_Beta_Features> on all
Wikimedia wikis, where it is already used by thousands of users.
This experimental program invites users to try out new features before they
are released widely, then give feedback to developers. To use Beta
Features, click on the small 'Beta' link next to your 'Preferences' on your
site, or test the latest version on

Fabrice Florin managed product development, led the creation of the Multimedia
Vision 2016 <https://www.mediawiki.org/wiki/File:Multimedia_Vision_2016.pdf> (with
Pau Giner), hosted roundtable
discussions <https://meta.wikimedia.org/wiki/Roundtables/Roundtable_4> and
updated the team's multimedia
based on community and team feedback.

Bryan Davis, Aaron Schulz and Chris Steipp reviewed new code for the
upcoming GLAM Toolset <https://commons.wikimedia.org/wiki/Commons:GLAMToolset_project#Goal_1:_GLAM_Upload_System> for
batch uploads by museum curators. We also welcomed Aaron Arcos as a
volunteer software engineer, who is joining our multimedia
team <https://www.mediawiki.org/wiki/Multimedia> full-time through
Spring 2014.

To discuss these features and keep up with our work, we invite you to join
the multimedia mailing
We are also recruiting for a senior software
on our team.

Analytics <https://www.mediawiki.org/wiki/Analytics>

*Kraken <https://www.mediawiki.org/wiki/Analytics/Kraken>*
We continued to make progress on event delivery via Kafka. We identified
and tested solutions for issues encountered with event delivery from the
Amsterdam data center. We also tested solutions to fix Ganglia logging

*Wikimetrics <https://www.mediawiki.org/wiki/Analytics/Wikimetrics>*
We concluded Phase 1 of Wikimetrics, by implementing asynchronous cohort
validation, editor survivor and threshold metrics.

*Data Quality <https://www.mediawiki.org/wiki/Analytics/Data_Quality>*
We identified issues with over-counting page views, and deployed a fix in
November. Data from July onward were restated.

*Research and Data*
This month, we started work on metrics
one of the team's quarterly goals. We published a number of supportive
analyses of new user
activation <https://meta.wikimedia.org/wiki/Research:New_editor> and
retention <https://meta.wikimedia.org/wiki/Research:Surviving_user> as well
as "active editors" <https://meta.wikimedia.org/wiki/Research:Refining_the_definition_of_monthly_active_editors> to
assess issues and potential benefits of new definitions. The outcome of
this analysis will inform design decisions for new dashboards focused on
editor engagement.

In collaboration with the Platform team, we ran an A/B test to determine
performance gains of localStorage. The results indicate
that the use of localStorage significantly improves the site's
performance for the end user: module storage is faster, and readers whose
pages load slower tend to browse less. Mobile browsers don't seem to benefit
substantially from caching.
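
As a rough illustration of the module storage idea tested above, the sketch below caches module source by name and version so that repeat loads skip the network. It is written in Python with a plain dict standing in for the browser's localStorage; the class name, key format and fetch callback are hypothetical and are not MediaWiki's actual implementation.

```python
import json


class ModuleStore:
    """Version-aware cache in front of a (slow) module fetch."""

    def __init__(self, storage, fetch):
        self.storage = storage        # dict-like stand-in for localStorage
        self.fetch = fetch            # callable(name, version) -> source text
        self.network_requests = 0     # counts cache misses, for illustration

    def load(self, name, version):
        key = f"module:{name}"
        raw = self.storage.get(key)
        if raw is not None:
            entry = json.loads(raw)
            # Serve from storage only if the cached copy is the wanted version.
            if entry["version"] == version:
                return entry["source"]
        # Cache miss or stale version: fetch, then store for the next page view.
        self.network_requests += 1
        source = self.fetch(name, version)
        self.storage[key] = json.dumps({"version": version, "source": source})
        return source


def fake_fetch(name, version):
    """Hypothetical stand-in for a network request to the module server."""
    return f"/* source of {name} at version {version} */"


store = ModuleStore({}, fake_fetch)
first = store.load("example.module", "v1")    # miss: goes to the network
second = store.load("example.module", "v1")   # hit: served from storage
third = store.load("example.module", "v2")    # new version: fetched again
```

Keying the stored entry by version is what keeps the cache safe across deployments: a new release changes the version, so stale source is never served.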

We published the results of a test designed to explore if displaying a
short tutorial could improve the first-edit completion rate of
newly-registered users on mobile devices. The results support
the hypothesis, indicating that edit guiders are a good onboarding
strategy for new mobile users.

We ran an analysis of anonymous editor
background research for new onboarding strategies designed by the
team and found that editors who edit as an IP right before registering an
account are our most productive

On November 9, 2013 we hosted the inaugural Labs2 Wiki Research
it was the first in a series of global events meant to "facilitate problem
solving, discovery and innovation with the use of open data and open-source
tools" (read the full
Highlights from the event are available in the latest issue of the Research
We are planning to host a new hackathon in Spring 2014 and we are actively
seeking <wrh at wikimedia.org> volunteers to host local and virtual meetups.

Kiwix <http://www.kiwix.org>

*The Kiwix project is funded and executed by Wikimedia CH.*
We have released two new versions of Kiwix for
month (1.5 & 1.6), providing many new features; most of them were
developed by young new developers as part of the Google Code-in
We have also released a new and unique tool to easily create ZIM files
yourself <https://sourceforge.net/p/kiwix/mailman/message/31653271/> from
data on your hard drive; the tool is stable and can now be used. Work
continues around tools based on Parsoid output, especially as we need to
rewrite the ZIM-related code for the Mediawiki offline
currently under heavy re-engineering.

*The Wikidata project is funded and executed by Wikimedia Deutschland.*
Wikidata developers held an office hour to give a status update and answer
questions (read the
In addition, they worked on ranks, ordering of statements and the
quantities datatype. The quantities datatype is needed, for example, to
enter the number of inhabitants of a country in Wikidata. It is available
for testing now on http://test.wikidata.org. Ranks will allow for certain
statements to be marked as preferred or deprecated. This is for example
useful to indicate a previous mayor of a city, or the number of inhabitants
of a country in 1900. Magnus Manske wrote a
gadget <http://magnusmanske.de/wordpress/?p=108> that allows you to
additionally show Wikidata search results when doing a
search on Wikipedia. He also extended the Reasonator tool to work
for cities <https://tools.wmflabs.org/reasonator/?q=Q1040>. Until now, it
only supported people and species.

Future

The engineering management team
continues to update the *Deployments
<https://wikitech.wikimedia.org/wiki/Deployments>* page weekly, providing
up-to-date information on the upcoming deployments to Wikimedia sites, as
well as the *annual goals*
listing ongoing and future Wikimedia engineering efforts.

Guillaume Paumier
Technical Communications Manager — Wikimedia Foundation
