The report covering Wikimedia engineering activities in April 2014 is now available.
We're also proposing a shorter, simpler and translatable version of this
report that does not assume specialized technical knowledge:
Below is the HTML text of the report.
As always, feedback is appreciated on the usefulness of the report and its
summary, and on how to improve them.
Major news in April includes:
- the change of
MediaWiki localization files from PHP to JSON, and the associated
of the LocalisationUpdate
- the move of Wikimedia
a new data center;
- the “Heartbleed” security
how the Wikimedia Foundation’s team responded to it;
- an explanation of how the Mobile team uses
plan their development sprints;
- a project report on a grant to create “gadgets” for
Engineering metrics in April:
- 158 unique committers contributed patchsets of code to MediaWiki.
- The total number of unresolved
from around 1315 to about 1305.
- About 30 shell requests
Personnel
*Work with us <https://wikimediafoundation.org/wiki/Work_with_us>*
Are you looking to work for Wikimedia? We have a lot of hiring coming up,
and we really love talking to active community members about these roles.
- VP of
- Software Engineer – VisualEditor
- Software Engineer – Fundraising
- Software Engineer –
- Software Engineer
- Research Analyst –
- Product Manager – Language
- Operations Security
- Data Center Engineer
- Giuseppe Lavagetto joined the Operations team as Operations Engineer.
- Aaron Schulz is now Senior Performance Engineer.
- Dmitry Brant joined the Mobile App Team as Software Developer.
- Danny Horn joined the Product Development team as Product Manager.
*Datacenter RFP <https://wikimediafoundation.org/wiki/RFP/2013_Datacenter>*
The Wikimedia Foundation has chosen a winning RFP bid, a contract has been
executed and implementation is underway. A public announcement is being
prepared in the upcoming week.
Labs metrics in April:
- Number of projects: 153
- Number of instances: 345
- Amount of RAM in use (in MBs): 1,454,592
- Amount of allocated storage (in GBs): 16,515
- Number of virtual CPUs in use: 716
- Number of users: 3,064
The migration of Labs and Tool Labs to
the Ashburn data center is complete, and most of the hardware in Tampa has
been shut down and packed up for shipping to the new (to be announced) data
center. Post-migration, many projects which had public IPs are now relying
on the internal Labs web proxy instead. That has caused a few unexpected
bugs in project web access, but provides several benefits including HTTPS
access and increased user data privacy.
*Tampa data center*
During the last month, our data center footprint in Tampa has been reduced
from 24 racks to just 6. A copy of all essential data
remains present in the Tampa facility until we’ve finished setting up the
relevant services in our upcoming new data center.
Editor retention: Editing tools
In April, the VisualEditor team worked to improve the stability of the
editor, adding some new features and improving usability so that users can
create and edit pages more swiftly and intuitively with VisualEditor than
before. Template editing was overhauled to make adding parameters less
busy, showing only a few parameters at first rather than all possible ones,
which can number in the dozens or more, especially in the case of some
often-used templates like those for citations or infoboxes. Setting the
size of images was tweaked to give a more natural set of controls based on
feedback from users. The page settings dialog received a number of minor
tweaks, completing the set of options that can be modified inside
VisualEditor. VisualEditor’s edit tab is now more consistent with the
rest of the MediaWiki interface in a number of noticeable if minor ways,
such as on pages to do with the Education Program, on file pages which are
hosted on Commons rather than on the local wiki, or on very narrow screens.
User testing was carried out on the forthcoming citation dialog and some
final simplifications were made, such as adding suggested as well as
required parameters, ahead of its pending introduction. Finally, a careful
audit of all Wikimedia wikis led to fixing broken local community-written
code, to ensure that VisualEditor runs on all of them. The deployed version
of the code was updated four times in the regular release cycle
In April, the Parsoid team continued to fix bugs and tweak code. Two areas
in particular received a lot of attention: template encapsulation and link
handling. We ironed out a whole bunch of edge case handling in template
encapsulation code and its interaction with fostered content from tables
(caused by misnested tags in tables). We also fixed many unhandled
scenarios and edge cases parsing and serializing links. In addition to bug
fixes, we also improved the performance of the parsing pipeline; some pages
like *Barack Obama <https://en.wikipedia.org/wiki/Barack_Obama>* should now
parse 30% faster than before. We continued migrating our debugging and
tracing code to use our new logger. April also saw additional progress
providing support for visual editing of transclusion parameters; this
should land on master soon.
This month, the Flow team focused on back-end changes to improve moderation
templating to make Flow more responsive and easier to add new features
onto. On the user-facing features side, we released the ability to close
and summarize topics. This will allow users to manage active discussions
and end ones that have come to a resolution. Flow is now the default
discussion experience for many Beta Features discussions on mediawiki.org
and the team is accepting requests to enable Flow on more pages on that
wiki for the purpose of testing complex multi-user discussion interactions.
In April, Growth switched gears to focus on a new experimental area: anonymous
The team prepared its first two experimental interface changes, aimed at asking
anonymous editors to register
to be launched in early May). The team also will be conducting basic
research into the role anonymous editors play in Wikipedia − more at
and Research:Anonymous editor
*Wikipedia Education Program*
This month we deployed several bug fixes, including disabling the
malfunctioning and little-used *student profiles* feature and setting a
sensible default end date for new courses. Thanks to volunteer Tony Thomas,
the extension-related preferences were moved into the Appearance tab of
Preferences. Progress toward several other improvements was made in April:
Sage Ross began implementing an API to generate lists of enrolled student
editors from one or more courses, and the Facebook Open Academy students
continued their work on new notification features, and also embarked on
need-finding research for an improved course activity feed.
*Wikimedia Apps <https://www.mediawiki.org/wiki/Wikimedia_Apps>*
The Mobile App team continued moving toward the first market release of the
rebooted Wikipedia App for Android and iOS. The team focused on bug fixes,
editing refinements, and UX polish. Several issues related to keyboard,
navigation bar, edit summary, and abuse filter were fixed. The app now uses
the newly created Wikifont, which reduced the size of the app and the number
of graphical assets. Articles should now look even closer to their mobile
web counterparts. Product management switched from Kenan Wang to Maryana
Pinchuk due to Kenan’s departure, and the team welcomed Dmitry Brant as an
Android software developer.
*Mobile web projects <https://www.mediawiki.org/wiki/Mobile_web_projects>*
This month, the mobile web team released history and contributions pages,
as well as an updated watchlist view, for all users. We also promoted two
new features geared toward “humanizing” Wikipedia for readers and new
editors: a prominent “last modified” banner that indicates when articles
haven’t been edited in a while and may need some attention, and a user
profile feature to provide a mobile-friendly snapshot of users’
contributions and activity. For tablets, we updated typography and layout
and worked on adding the ability to add and modify links via VisualEditor
in beta, in preparation for redirecting tablets to the mobile site later
*Wikipedia Zero <https://www.mediawiki.org/wiki/Wikipedia_Zero>*
Presentation slides from the monthly metrics meeting
During the last month, the team continued setup tasks on the Partners
portal, JSON configuration store, and graceful image quality reduction. The
team also updated Android and iOS Wikipedia app reboot visual flourishes
for Wikipedia Zero, analyzed anomalous access patterns as well as
proxy-oriented configuration and tech documentation to close gaps, and
fixed bugs that caused unnecessary charge warnings in the “Read in another
language” language picker and on direct upload.wikimedia.org image
hyperlinks on File: pages.
The team also removed some legacy ETL code from the ZeroRatedMobileAccess
Yuri did outreach abroad and continued analytics work on SMS/USSD pilot
data. The team also generated two custom pageview analyses for an operator
to distinguish traffic by high level device access characteristics as part
of ongoing discussions. The team also explored legacy Android Wikipedia app
Additionally, the team cut Android Wikipedia app alpha builds, worked on
User-Agent string and URL format updates for the forthcoming iOS Wikipedia
app to ensure pageview logging, and performed app code review.
The team discussed MCC-MNC logging with the community as a way to address
mobile IP drift, and it appears to be acceptable to proceed. The team will,
however, reduce the date granularity of log lines to the day (e.g.,
YYYYMMDD) with a patch to MediaWiki core.
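To illustrate what day-level granularity means for a log line, here is a minimal sketch in Python. It is not the actual MediaWiki core patch (which would be PHP and is not shown in this report); the function name and the sample timestamp are hypothetical, and normalizing to UTC is an assumption made for the example.

```python
from datetime import datetime, timezone


def to_day_granularity(timestamp: datetime) -> str:
    """Reduce a timestamp to day granularity, formatted as YYYYMMDD."""
    # Assumption: normalize to UTC so the same instant always maps to the same day.
    return timestamp.astimezone(timezone.utc).strftime("%Y%m%d")


# Hypothetical log timestamp; the time-of-day part is discarded.
example = datetime(2014, 4, 29, 17, 42, 5, tzinfo=timezone.utc)
print(to_day_granularity(example))  # prints 20140429
```

With this reduction, two requests made on the same UTC day produce the same date value, so the log no longer reveals when during the day a request happened.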
Routine pre- and post-launch configuration changes were made to support
operator zero-rating, and in-depth technical assistance was provided to
operators and the partner management team to help add zero-rating and
The team emailed further about full-text search in reboots of Wikipedia
apps, and may resume investigation of it later.
The team also examined requirements for portal and general partners
engineering human resources.
*Wikipedia Zero (partnerships)*
IPKO in Kosovo launched Wikipedia Zero, bringing us to a total of 28
partners in 26 countries. We delivered 68 million free page views in April.
Adele Vrana visited South Africa to meet with MTN (current Wikipedia Zero
partner), prospective partners, members of Wikimedia South Africa (WMZA)
and the Sinenjongo High School. This trip was part of a broader strategy
to promote Wikipedia in our partners’ corporate social responsibility (CSR)
and education initiatives, increasing awareness and impact locally. We are
identifying new collaboration opportunities with MTN and local
organizations, including the Wikimedia chapter in South Africa and other
mission-aligned nonprofits. Additionally, we will continue to support the
local initiative created by Sinenjongo High School teachers and students.
*Language tools <https://www.mediawiki.org/wiki/Language_tools>*
The team prepared the migration of the translation memory infrastructure
from Solr to ElasticSearch.
The jquery.webfonts library was adapted for the Typography refresh. An
input method for the Batak script was added to jquery.ime, and bugs were
fixed in the InScript input method for Hindi, Odia and Gujarati.
*Language Engineering Communications and Outreach*
The team prepared targeted mini-surveys for readers of Wikipedia in
poorly-supported languages, and held IRC office
*Content translation <https://www.mediawiki.org/wiki/Content_translation>*
ContentTranslation was the team’s main effort this month. Source text
segmentation was further improved and stabilized. Other developed features include:
- A beta feature that shows a red interlanguage link when the article is
not translated to the user’s language;
- Basic handling of templates and images;
- Basic publishing of the translation as a formatted article;
- Testing infrastructure for the server.
Work on the Zend plugin compatibility layer is feature-complete. The team
is now working on proper packaging of HHVM and toward making HHVM the
default PHP implementation on the Beta cluster.
*Release & QA*
Presentation slides of the Release engineering and QA quarterly review. See
The tool that deploys code in production (“scap”) is now used to
deploy/update code on the Beta cluster,
removing another difference between the Beta cluster and production and
giving us an environment in which to safely test changes to our deployment
system(s). We converted more scap code to Python (*scap-rebuild-cdbs* and
*mw-update-l10n*), and moved a ton of Jenkins jobs from the Cloudbees
Jenkins to our self-hosted instance; we’re on target to end the use of
Cloudbees’ Jenkins in the next two weeks. We also made significant progress
on the two open positions (Release Engineer and QA Automation).
We deployed Cirrus as a Beta Feature on all wikis that didn’t yet have it.
We’re working on deploying a change to how snippets are generated that
should be faster and better. We’re also starting to work with Elasticsearch
plugins for improved analysis of some languages as well as backup.
*Auth systems <https://www.mediawiki.org/wiki/Auth_systems>*
We did initial work on Authn/z requirements for RFC architecture, and an
initial review of Requests for
We also investigated the use of MediaWiki’s OAuth for Phabricator, and
worked on a proof of concept.
*Wikimania Scholarships app*
Several small bug fixes and feature requests were worked on by volunteers
applying for GSoC <https://www.mediawiki.org/wiki/GSoC> projects. No
operational issues were reported.
*Deployment tooling <https://www.mediawiki.org/wiki/Deployment_tooling>*
We started investigating changes that may be needed to support the use of
HHVM in production.
*Security auditing and response*
We helped with the operational response to the Heartbleed vulnerability.
Significant work was done on identifying and testing static analysis tools
to integrate into the release workflow. We finished reviewing varnishkafka
for Analytics, and Compact Personal Bar for UX. MediaWiki releases 1.21.9
and 1.22.6 fixed one security issue.
*Quality Assurance <https://www.mediawiki.org/wiki/Quality_Assurance>*
This month saw the QA team working closely with the MobileFrontend team to
extend and refactor their test suite. We also made great progress in
running many of the browser test suites on headless Firefox instances in
builds controlled by WMF Jenkins. Work on the WMF Jenkins browser test
builds will continue in order to take advantage of the power and
flexibility we have there.
The QA team released a number of new browser test features, including the
ability to create test data in the target wiki at runtime. This feature was
immediately put into use by the MobileFrontend team in their browser test
suite. A complete list of shared
to any browser tests in any extension repository is available.
In April, the multimedia team released Media Viewer
14 pilot sites, in preparation for a wider
month: overall response has been favorable so far, and a growing majority
finding this new multimedia browser useful. Gilles Dubuc, Mark Holmquist,
Gergő Tisza and Aaron Arcos developed final features for this release, as
described on this release’s
based on designs by Pau Giner. We also developed a set of metrics
track global activity, image load and network performance, as well as local
metrics dashboards <https://www.mediawiki.org/wiki/Multimedia/Metrics> for
selected sites: first results show a decline in image load
and suggest that Media Viewer loads faster than file description
We invite you to test the latest
Media Viewer (see these testing
and share your
Fabrice Florin led product planning and management, hosting a planning
our next development cycle (leading to a wall of
for the next six weeks, we plan to divide our time between Media Viewer
(e.g. serious bugs, basic zoom feature), Technical Debt (e.g. image
scalers) and Upload Wizard. Keegan Peterzell and Fabrice announced the gradual
release of Media
dozens of wiki sites, starting new discussions in collaboration with our
community partners, as well as launching surveys in multiple
get reader feedback about this tool. For more updates about our multimedia
work, we invite you to join the multimedia mailing
*Bug management <https://www.mediawiki.org/wiki/Bug_management>*
Daniel Zahn and Andre Klapper upgraded Wikimedia Bugzilla to
the latest version 4.4.4. Valhallasw replaced the brittle wikibugs IRC
notification bot with pywikibugs <https://github.com/valhallasw/pywikibugs>.
A bugday <https://www.mediawiki.org/wiki/Bug_management/Triage/20140429> took
place, updating about 50 MediaWiki General/Unknown tickets. Bugzilla’s
“Tools” product was renamed to
decrease confusion with tools on Tool Labs. Numerous old forgotten
“Backport_WMF?” flags on bug reports, older PATCH_TO_REVIEW tickets with
all patches merged, and a lot of older WikiEditor tickets were cleaned up.
In general, work mostly concentrated on handling the Phabricator
*Project management tools review*
On April 14, the Request for
finalized and a related
The RfC was announced on
lists, asking everybody for feedback on the RfC itself, discussion on its
talk page, actively trying out proposed “Phabricator” in a testing
and creating tickets in the test instance under the “Wikimedia Phabricator”
project for functionality that is missing.
*Mentorship programs <https://www.mediawiki.org/wiki/Mentorship_programs>*
Sixteen Google Summer of
and seven FOSS Outreach Program for
will be busy in the next months working on Wikimedia projects. We have 23
participants in total, two more than a year ago, even though our quality
criteria were stricter this time.
In addition to ongoing communications
the engineering staff, Guillaume
and archived many team documentation
the Engineering Community Team, like planning pages, reports and meeting
notes. He also set up subscription templates on
the English Wikipedia <https://en.wikipedia.org/wiki/Template:Latest_tech_news>
display the latest version of the technical
users who prefer not to get it delivered on their talk page.
*Volunteer coordination and outreach*
We restarted the Wikimedia Tech Talks with a light process for scheduling
and we helped organize *A preliminary look at Parsoid
and *Unit testing for MediaWiki
The Wikimedia Hackathon in
ready to roll on May 9–11, and we co-hosted an info session with Wikimedia
the main organizers of the event.
*Architecture and Requests for comment process*
We held several Request for Comment review meetings in IRC:
- on scoped language converter & square bounding
- to get quick next actions & validity checks on multiple
- on reducing image quality for
- on associated
- on MediaWiki libraries and third-party
Also, we worked on improving security, architecture, and performance
guidelines for developers. We aim to have MediaWiki performance guidelines
ready to approve at the Zürich hackathon in mid-May.
New features have been
Wikimetrics: Scheduled Reports & Public Reports.
Data from text Varnishes is now being consumed through varnishkafka ->
kafka -> camus into hdfs. Kafka now processes Bits, Images and Text data.
*The Kiwix project is funded and executed by Wikimedia CH.*
We have finally released a first experimental ZIM file of TED talks. We
have packed the 250 talks about business into a single 7 GB library. This
includes not only the videos, but also short speaker bios and thousands of
subtitles in more than 50 languages. We will soon fix the last critical
issues and release other TED ZIM files with talks about entertainment,
science, etc. We have also migrated our download server to a better one.
Besides providing a better storage system and more bandwidth, it has 9 TB of
disk space. This was a mandatory step in industrializing our ZIM generation
process, and was necessary to allow us to generate more ZIM files more
often. We are also working with an e-reader manufacturer to have Kiwix
installed and available on its devices so that
(eBooks for Mali <https://en.wikipedia.org/wiki/Mali>) can ship e-readers
with not only thousands of free eBooks, but also the complete Wikipedia and
Wikisource in French.
Presentation slides on Wikidata from the monthly metrics and activities
*The Wikidata project is funded and executed by Wikimedia Deutschland.*
The Wikidata team got simple query functionality ready for a first demo at
the WMF Metrics and Activities
The entity suggester, which a team of students is working on, also received
finishing touches and should be ready for release soon. Once it is live, it will
suggest missing information on an item so it is easier to see what should
be added. We also welcomed 2
part of the Outreach Program for Women to help with documentation, social
media outreach and mobile app concepts. Wikiquote now manages its language
links via Wikidata. Additionally, it is now possible to automatically add
other sister projects in the sidebar of an article using Wikidata. A Wikidata
released, as well as a
lets you use Wikidata’s labels for translation anywhere on the web.
Future
The engineering management team continues to update the *Deployments
<https://wikitech.wikimedia.org/wiki/Deployments>* page weekly, providing
up-to-date information on the upcoming deployments to Wikimedia sites, as
well as the *annual goals
listing ongoing and future Wikimedia engineering efforts.
*This article was written collaboratively by Wikimedia engineers and
managers. See revision history
associated status pages. A wiki version
Technical Communications Manager — Wikimedia Foundation