[Wikimedia-l] [Wikimedia Announcements] Wikimedia engineering April 2013 report
gpaumier at wikimedia.org
Thu May 2 13:59:31 UTC 2013
The report covering Wikimedia engineering activities in April 2013 is now
We're also proposing a shorter, simpler and translatable version of this
report that does not assume specialized technical knowledge:
Below is the full HTML text of the report.
As always, feedback is appreciated on the usefulness of the report and its
summary, and on how to improve them.
Major news in April includes:
- the start of recruitment for a multimedia engineering
- the deployment of a better translation
- the release of Kiwix of
an app to download and view Wikimedia content offline;
- the milestone of 500 million monthly unique
- improvements to the main page of
- the migration of Wikidata, and the English and German Wikipedias,
- the second phase of
whose structured data can now be displayed in Wikipedia articles;
- the deployment of VisualEditor's alpha
14 more language versions of Wikipedia;
- a proposed replacement for the login and account creation
- the launch of the Language Mavens
an advocacy and advisory body in the domain of language engineering;
- the ramp-up of technical mentorship
- the launch of an official Wikimedia Commons app for iOS and
Personnel
Work with us <https://wikimediafoundation.org/wiki/Work_with_us>
Are you looking to work for Wikimedia? We have a lot of hiring coming up,
and we really love talking to active community members about these roles.
- Director of Analytics <http://hire.jobvite.com/Jobvite/Job.aspx?j=oJriXfw9>
- Software Engineer -
- Software Engineer -
- Software Engineer - Language
- Software Engineer -
- Software Engineer - Multimedia
- Software Engineer - Multimedia User
- Software Engineer -
- Product Manager -
- UX Designer <http://hire.jobvite.com/Jobvite/Job.aspx?j=onImXfw8>
- Dev-Ops Engineer - SRE <http://hire.jobvite.com/Jobvite/Job.aspx?j=ocLCWfwf>
- MySQL Database
- Director of Technical
- Monte Hurd joined the Mobile engineering group as Software Engineer in
the Apps team
- Brandon Black joined the Operations team as Dev/Ops Engineer (
- Erik Bernhardson joined the Features team as Features Engineer (
- Nischay Nahata joined the Features team as Features Contractor (
Several large wikis were migrated to
with positive results. A new class of Redis servers was deployed in
support of the migration of our asynchronous job queuing infrastructure
from MySQL to Redis, enabling us to better meet the demands of Wikidata and
Echo. New file uploads are now being written to Ceph in Eqiad, in addition
to Swift in pmtpa, in support of a potential migration. The current plan is
to open up the Eqiad Ceph cluster for 'reads' the second week of May.
Currently 'reads' are served by the Tampa Swift cluster. With the core
cluster migrated to Eqiad, we are now working on the miscellaneous server
cluster <https://wikitech.wikimedia.org/wiki/Tampa_cluster>. As part of the
cleanup, we retired servers as well.
*Data Dumps <https://www.mediawiki.org/wiki/WMF_Projects/Data_Dumps>*
import of partial or full content into a new wiki have been released.
step-by-step walkthrough of their use has been added to the
users of the dumps on Meta. True
incremental dumps are now a GSOC
several students have applied for this project. The
logging table XML dump on Wikidata was taking days to run, due in part to
the high volume of edits there, much more than even the English Wikipedia.
Most of those edits wind up being recorded as autopatrol in the log, making
it already about half the size of the logging table for the English
Wikipedia. Breaking up the database query into smaller batches works around
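The batching approach mentioned above could look roughly like the sketch below. It is only an illustration and not the actual dump code: the table layout, the `log_id` key and the batch size are assumptions, and SQLite stands in for the production database.

```python
import sqlite3


def iter_log_rows(conn, batch_size=1000):
    """Yield every row of a (hypothetical) logging table in key order,
    fetching at most batch_size rows per query instead of one huge query."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT log_id, log_type FROM logging "
            "WHERE log_id > ? ORDER BY log_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        yield from rows
        last_id = rows[-1][0]  # the next batch resumes after the last key seen


def make_demo_db(n_rows):
    """Build an in-memory table with n_rows fake log entries (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logging (log_id INTEGER PRIMARY KEY, log_type TEXT)")
    conn.executemany(
        "INSERT INTO logging (log_id, log_type) VALUES (?, ?)",
        [(i, "patrol") for i in range(1, n_rows + 1)],
    )
    conn.commit()
    return conn
```

Each query only touches a bounded key range, so no single statement has to scan or hold the whole table at once.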
*Wikimedia Labs <https://www.mediawiki.org/wiki/Wikimedia_Labs>*
Work on tool labs is progressing nicely. 32 bots/tools have been added to
the tools project. Most of the functionality of Toolserver should now be
available in tool labs. Database replication is still being worked on, but
is progressing well. The pre-labs replication databases are being
replicated to, and the Redactatron application has been finished, allowing
us to mark tables as ok to replicate. Our current roadmap is for database
replication to be accessible by the time of the Amsterdam Hackathon.
Instance creation performance greatly improved this month after we replaced the
generic Ubuntu cloud images with our own custom images that pre-install
and pre-configure most of what an initial puppet run would handle. Work on
single-instance MediaWiki continued this month, making the initial
MediaWiki installation more robust and handling a number of legal issues
OpenStackManager interface. Currently changes are in for reboot and get
console output actions for managing instances. A more reasonable project
filter change using jQuery Chosen has been added as well. Work on replacing
glusterfs is mostly done. Two projects have been switched to use the new
NFS server and the rest will be switched next month. Work has begun on
upgrading OpenStack from the Essex to the Folsom release. Our testing
environment has been upgraded and production tests are currently ongoing.
During the OpenStack summit, work was done to push the Moniker DNS
application into OpenStack incubation to be added as a supported OpenStack
project. Ryan Lane gave a
the OpenStack summit about the state of OpenStack's user committee,
along with Tim Bell of CERN and JC Martin of eBay. Work on the user
committee aims to make OpenStack easier to use and upgrade, which
should increase the frequency of updates in Labs.
Features
retention: Editing tools
In April, the team continued their work on the major new features that will
be added in the coming months. Our objective is for VisualEditor to be the
default editor for all Wikipedia users, capable of letting them edit the
majority of content without needing to use the wikitext editor, in July
2013. This means we have been focussed on four substantial areas of work:
adding support for references, templates, categories and media items.
During this time the main area of our work was editing around images, which
is now designed and partially implemented in our experimental code, and
around categories, which is almost complete and nearly ready for
deployment. The deployed alpha version of VisualEditor was updated thrice (
adding speed improvements, user interface improvements and work on the
back-end to better support the new features, and fixing a number of bugs.
We were also able to deploy VisualEditor to fourteen more Wikipedias as
an opt-in alpha <https://blog.wikimedia.org/2013/04/25/visualeditor-alpha-in-15-languages/> (and,
later, to the Vietnamese Wikipedia too), which has let the community give us
feedback on what works and what is broken, and identify language- and
locale-specific issues that we are now fixing.
In April, the Parsoid team successfully deployed the cumulative work done
over the last four months. This includes support for non-English wiki
configurations, a rewritten serialization subsystem based on server-side
DOM diffs, category link and basic template parameter editing support and a
long list of fixes and improvements.
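As a rough illustration of why a diff-based serializer is useful, the toy sketch below reuses the original wikitext for nodes that did not change and re-serializes only the edited ones. It is not Parsoid's code or data model: the flat node list, the `id`, `type`, `text` and `source` fields and the serialization rules are all assumptions made for the example.

```python
def serialize_node(node):
    """Fallback serializer for a new or changed node (toy rules only)."""
    if node["type"] == "heading":
        return "== " + node["text"] + " =="
    return node["text"]


def selective_serialize(original_nodes, edited_nodes):
    """Keep the original source for unchanged nodes; re-serialize the rest."""
    original_by_id = {node["id"]: node for node in original_nodes}
    output = []
    for node in edited_nodes:
        old = original_by_id.get(node["id"])
        unchanged = (
            old is not None
            and old["type"] == node["type"]
            and old["text"] == node["text"]
        )
        if unchanged:
            output.append(old["source"])  # keep the author's exact wikitext
        else:
            output.append(serialize_node(node))
    return "\n".join(output)
```

In this sketch a heading originally written as `==History==` is emitted unchanged, even though the fallback serializer would have produced `== History ==`, so an edit elsewhere on the page does not introduce unrelated differences.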
Several other features for the July release are on track. The specification
for extensions containing
fleshed out and are currently being implemented. Similarly, our specs
for images and thumbnails <https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec#Images> were
vastly improved so that we will soon support full editing for all
We also improved our code quality and testing infrastructure.
In preparation for the July release, we did more benchmarking and capacity
planning. A caching strategy that avoids overwhelming the API with
developed, hardware to run Parsoid was ordered and work on the
Editor engagement features
In April, we deployed Notifications on the English Wikipedia and
mediawiki.org. This first release aims to inform users about new activity
that affects them on Wikipedia, such as talk page messages, page reviews,
mentions, edit reverts or thanks. Ryan Kaldari developed a new feature that
lets users mark all notifications as read, and updated the fly-out and
archive page, based on designs from Vibha Bamba. Benny Situ completed the
bundling feature and developed some of the first metrics dashboards, in
collaboration with Dario Taraborelli. Luke Welling continued to develop
HTML email notifications and a notifications mailbox. Fabrice Florin
managed the product development and release of this notification system,
and coordinated its socialization on the English Wikipedia with Oliver
Keyes. We're also grateful to Steven Walling and Matt Flaschen from our E3
team for developing the *Welcome* and *Getting started* notifications. To
learn more, visit the project
read the help page <https://en.wikipedia.org/wiki/Wikipedia:Notifications/FAQ> and
join the discussion on the talk
*Article feedback <https://www.mediawiki.org/wiki/Article_feedback>*
This month, we deployed the final release version of Article Feedback v5 on
the English <https://en.wikipedia.org/wiki/Special:ArticleFeedbackv5>,
Developer Matthias Mullie updated the back-end software in
order to re-enable the tool on the English Wikipedia, and fixed a number of
bugs reported on the German Wikipedia. Fabrice Florin worked with Pau
Giner, Oliver Keyes and community members to simplify the feedback page, as
well as finalize feedback links, auto-archive and opt-in features. Learn
more in this project
To enable feedback on articles you watch on the English Wikipedia, simply
add the 'Article Feedback
category to these pages. For more tips on how to use this version, visit
the testing page <https://www.mediawiki.org/wiki/Article_feedback/Version_5/Testing>,
and let us know what you think on the Article Feedback Talk
We are now wrapping up development for this project, and will collect
community suggestions for the next few months to prepare for upcoming votes
on the French and German Wikipedias later this year.
Design work continues, and we held several discussions about what
constitutes a minimum viable product for the first iteration of Flow.
Brandon Harris is now building an interactive
help describe multiple functions.
Editor engagement experiments
In April, the Editor Engagement Experiments (E3) team focused first and
foremost on its account creation and login
MediaWiki core. The first
phase of the launch <http://blog.wikimedia.org/2013/04/25/try-new-login-accountcreation/> invited
editors and readers on all Wikimedia projects to test the new forms
on an opt-in basis, to identify bugs and localization issues across our
many wikis. We expect to release these as the default forms in May, pending
any final blockers.
For the team's *Onboarding new
* project, we completed quantitative
analysis <https://meta.wikimedia.org/wiki/R:OB4> of the latest version
of the GettingStarted landing page, and began
prototyping a new landing page and navigation system for usability testing
prior to further development and launch, which is expected in early May as
On the analytics and infrastructure front, the team handed off the product
roadmap for the User Metrics
<https://www.mediawiki.org/wiki/User_Metrics> API to the Analytics team
and colleagues in the Grantmaking and Programs
department. Ori Livneh, in support of the data analysis needs on the team,
began work supporting a Foundation instance of IPython
Last but not least, the E3 team held its second Quarterly Review
and began work planning its next high-level goals for the April–June
The fundraising team deployed a public reporting
made of aggregate live and historical fundraising data, which were notably
used by the webcomic xkcd <http://xkcd.com> to dynamically change the
outcome in the last panel of their 2013 April 1st comic,
We also upgraded the payments and fundraising wikis to MediaWiki 1.22,
upgraded CiviCRM to 4.2.8 and Drupal to 7, and migrated the banner
impression log pipeline to the Eqiad data center.
*Language tools <https://www.mediawiki.org/wiki/Language_tools>*
translatewiki.net home page development continues but was deprioritized due
to development efforts around changes to the Universal Language Selector.
The MediaWiki Language Engineering Bundle (MLEB) was released on April 30;
updates include localization updates to Babel, Translate extension
improvements, XLIFF file format support, and easy access to the message tools
menu for the translation editor. Please note that MLEB is no longer
compatible with MediaWiki 1.19. A Divehi language web font was also added.
Specifications for the Language Coverage Matrix
designed. An internationalization test strategy was presented to and
reviewed by the team.
The development team added a Divehi language web font to jQuery.webfont,
and several contribution patches to jQuery.ime were merged. Redesign
suggestions from the Product team on the Universal Language Selector (ULS)
were reviewed by interaction designer Pau Giner and accepted by the
development team. Changes include the launch workflow for ULS, as well as
changes to display settings and font settings workflows for logged-in
users. Development to reflect these changes is in progress and expected to
be completed and tested for deployment in May.
*Language engineering communications and
Highlights of this month's communications and outreach activities by the
team include UX testing with community members for ULS changes by Pau
Giner, and blog posts on team programs, including the Language Mavens, the
translatewiki.net home page and translation UX improvements. The team also
held office hours with the community as well as a successful bug triage
focused on translate bugs.
*Commons App <https://www.mediawiki.org/wiki/Wikimedia_Apps/Commons>*
The Wikimedia Commons Android app is available in the Google Play store,
and we also added categorization support. Its iOS counterpart is available
*Wikipedia Zero <https://www.mediawiki.org/wiki/Wikipedia_Zero>*
We deployed Mobile Web's MobileFrontend-ZeroRatedMobileAccess decoupling
code to production. We also started the next point release to support more
object-friendly JSON-backed carrier preferences, updated carrier
preferences, fixed a UI button rendering bug, and documented configuration
parameters. Last, we added content to wiki pages, and prepared for the
migration of non-embargoed content to public wikis.
*Mobile Web Photo Upload <https://www.mediawiki.org/wiki/Mobile_design/Uploads>*
In April, we experimented with a login/signup call to action for logged-out
users from our in-article upload feature. This resulted in a huge spike in
new user contributions; however, the quality of the uploads was lower than
anticipated, and the quantity of inappropriate uploads was a burden on the
Commons community. In light of this, we disabled the login/signup call to
action, allowing only existing Wikimedians to see and use the upload
feature. We are still on track to reach our fiscal year target of 1,000
unique uploaders a month and, when gated to existing users, the quality of
the uploads has vastly improved: three quarters of the files are retained on
Commons, as compared to less than one quarter when brand-new users were uploading.
To create a more focused uploading workflow and let mobile uploaders
discover more articles to illustrate, we also created a *Nearby* view on
the beta site, showing users a list of articles near them and highlighting
the ones that need images. We expect to release this to the full mobile web
site next month.
*Auth systems <https://www.mediawiki.org/wiki/Auth_systems>*
During April, the team primarily focused on implementing SUL v2, which will
fix issues that users are having with new security features in recent
browser releases. SUL v2 is ready for testing and deployment is targeted
for early May. In addition, the team worked toward a final design
specification for OAuth and will begin working on that pending the
successful deployment of SUL v2.
Code has been instrumented (and will soon be deployed) to log more data to
allow root cause analysis of the spurious "Zero results" issue. Some log
analysis was also done. The Puppet configuration on beta was updated to
limit lucene-search-2 memory usage on Labs.
*MediaWiki 1.21 <https://www.mediawiki.org/wiki/MediaWiki_1.21/Roadmap>*
The 1.21 deployment cycle to Wikimedia wikis is complete, and the MediaWiki
1.21 tarball is being prepared for release, with a target release date of
May 15. Mark Hershberger recently released MediaWiki 1.21rc4.
*MediaWiki 1.22 <https://www.mediawiki.org/wiki/MediaWiki_1.22/Roadmap>*
The MediaWiki 1.22 deployment cycle began in April with
April 1-10) and
1.22wmf2 <https://www.mediawiki.org/wiki/MediaWiki_1.22/wmf2> (deployed
April 15-24), with
on April 29.
*Git conversion <https://www.mediawiki.org/wiki/Git/Conversion>*
We deployed a first iteration of a Bugzilla integration plugin, which
provides notifications to Bugzilla when changes are made in Gerrit. We’ve
increased the memory allocated to Gerrit, as well as deployed a couple of
other stability fixes; both of these changes should provide some minor
performance and stability improvements to users. Finally, we’ve deployed a
new version of Gerrit that includes superior garbage collection support.
This drastically improved the compression of repositories on-disk, which
has resulted in a wide range of improvements for all users for all
operations, from cloning to pushing to commenting on changes.
Differently-sized video thumbnails now only require one reference thumbnail
(for the time position) to be generated. This helps to avoid expensive
decoding to derive thumbnails. The Score
on April 22nd. It allows users to create and document musical
scores on Wikimedia sites.
*Wikidata deployment <https://www.mediawiki.org/wiki/Wikidata_deployment>*
After a minor delay due to some job queue and infrastructure migration
work, Wikidata Phase II was deployed to all Wikipedia sites. This allows
editors to reference and display content from Wikidata inside infoboxes.
*Lua scripting <https://www.mediawiki.org/wiki/Lua_scripting>*
Some bugs were fixed and internationalization changes merged this month; no
major changes were made. The community continues to develop Lua-based
templates, such as the citation templates on the English Wikipedia.
*Site performance and
All job queues were migrated off the main DB clusters to JobQueueRedis.
Improvements were made to the category update queries to reduce lock
exceptions that users often encountered when deleting files. This works via
a new transaction callback hook added to the core database class, which can
be used to resolve similar problems.
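The general idea of a transaction callback can be sketched as follows. This is a simplified illustration rather than MediaWiki's database class: the `Database` wrapper, its `on_commit` method and the `files` and `category` tables are hypothetical, and SQLite stands in for the real database. The point is only that the counter update is queued and runs in its own short transaction after the main one has committed.

```python
import sqlite3


class Database:
    """Tiny wrapper that runs queued callbacks only after a successful commit."""

    def __init__(self, conn):
        self.conn = conn
        self._on_commit = []

    def on_commit(self, callback):
        """Queue a callable to run once the current transaction has committed."""
        self._on_commit.append(callback)

    def commit(self):
        self.conn.commit()
        callbacks, self._on_commit = self._on_commit, []
        for callback in callbacks:
            callback()

    def rollback(self):
        self.conn.rollback()
        self._on_commit.clear()  # deferred work is dropped with the transaction


def update_category_count(conn):
    """Recount files in a separate, short transaction."""
    conn.execute("UPDATE category SET file_count = (SELECT COUNT(*) FROM files)")
    conn.commit()


def delete_file(db, name):
    """Delete a file row and defer the category counter update until commit."""
    db.conn.execute("DELETE FROM files WHERE name = ?", (name,))
    db.on_commit(lambda: update_category_count(db.conn))
```

Because the counter row is only touched after the deletion has committed, it stays locked for a much shorter time than if it were updated inside the long-running transaction.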
*Admin tools development <https://www.mediawiki.org/wiki/Admin_tools_development>*
This month the team mostly worked on Single User Login
after which all user accounts will be global across all of Wikimedia's
public wikis, allowing for cross-wiki notifications and better tools for
editors. This will require all user accounts to be uniquely named and not
conflict with other accounts. The global account renaming
initial completion, and the global and local blocking based on XFF
was finished and deployed. Work on designing a global CheckUser tool was
postponed due to lack of resources.
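For readers unfamiliar with XFF (the X-Forwarded-For HTTP header), the sketch below shows the general idea of checking forwarded addresses against a block list. It is not the deployed MediaWiki implementation: the function names and the block list format are assumptions, and the addresses come from the reserved documentation ranges.

```python
import ipaddress


def parse_xff(header):
    """Return the valid IP addresses listed in an X-Forwarded-For value."""
    addresses = []
    for part in header.split(","):
        part = part.strip()
        try:
            addresses.append(ipaddress.ip_address(part))
        except ValueError:
            continue  # skip malformed entries such as "unknown"
    return addresses


def is_blocked(remote_addr, xff_header, blocked_ranges):
    """True if the connecting address or any forwarded address is blocked."""
    networks = [ipaddress.ip_network(r) for r in blocked_ranges]
    candidates = [ipaddress.ip_address(remote_addr)] + parse_xff(xff_header)
    return any(addr in net for addr in candidates for net in networks)
```

For example, `is_blocked("203.0.113.5", "192.0.2.44, 198.51.100.7", ["192.0.2.0/24"])` is true because the first forwarded address falls in the blocked range. Since clients can set this header to anything, a real deployment would typically only trust entries added by known proxies.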
*Security auditing and
We released the MediaWiki 1.19.5 and 1.20.4 security releases on April 15th.
*Quality Assurance <https://www.mediawiki.org/wiki/QA>*
We collaborated with Weekend Testing Americas to investigate new Account
Creation UX features with the E3 team, and tested Echo deployments with the
E2 team. We are investigating an intermittent failure with UploadWizard for
Firefox, and a styling issue with ResourceLoader in IE.
*Beta cluster <https://www.mediawiki.org/wiki/Beta_cluster>*
We started to point automated tests currently targeting test2wiki to Beta
labs to shake out issues there and ultimately improve test coverage. This
will help us with earlier detection of bugs introduced into master (such as
bug 47015 <https://bugzilla.wikimedia.org/show_bug.cgi?id=47015>). Mark
Bergsma and Antoine Musso refined the Varnish configuration for
MobileFrontend, and further refined the configuration of the search
In April, the Jenkins/Zuul platform encountered several issues such as the
gating job running tests against the current version of the branch instead
of the to-be-merged change (bug
Antoine Musso solved several performance issues by using tmpfs and a new
SSD drive and upgrading Zuul to the latest upstream version.
Timo Tijhof overhauled the automatically generated MediaWiki documentation
with Doxygen 1.7 <https://doc.wikimedia.org/mediawiki-core/master/php/html/>.
He also fixed the duplicate test runs that happened in specific cases (
bug 43391 <https://bugzilla.wikimedia.org/show_bug.cgi?id=43391>). Finally
he set up QUnit tests for the VisualEditor extension; if this proves
successful, QUnit runs will be generalized to all extensions.
Mark Holmquist improved the Jenkins jobs that track Parsoid regressions
Finally, we now have linters for several languages: PHP, Python, Ruby and
even YAML. If your git repositories are missing a lint check, please
contact us or file a bug against Wikimedia > Continuous Integration.
*Browser testing <https://www.mediawiki.org/wiki/QA/Browser_testing>*
We created a number of new builds to point browser tests to the beta
cluster as well as test2wiki. We also normalized user strings for test
purposes on test2wiki and beta cluster wikis. We added new tests for the
Preferences/Appearance tab and SUL login, and a volunteer contributor added
a test for PDF manipulation.
We've improved the functionality of
our visualization tool, to allow users to create and edit charts via the
UI. We can also automatically deploy new instances of Limn, so it's faster
and easier to set up dashboards. In addition to current users, we expect
this to be very helpful for the Program
as they start to develop their own analytics.
We're also now importing 1:1000 traffic streams, enabling us to migrate
reports from our legacy analytics platform,
onto our big data cluster,
In the future, this will make it easier for us to publish data and
visualize reports using our newer infrastructure.
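To illustrate what a 1:1000 sampled stream means for reporting, the sketch below keeps every 1000th request and scales the sampled counts back up to estimate totals. It is a simplified example, not the production pipeline: the function names are hypothetical, and the assumption that sampling keeps every Nth request (rather than a random subset) is ours.

```python
from collections import Counter

SAMPLE_RATE = 1000  # keep one request out of every 1000


def sample_stream(requests, rate=SAMPLE_RATE):
    """Keep every rate-th request from an iterable of requests."""
    for index, request in enumerate(requests, start=1):
        if index % rate == 0:
            yield request


def estimate_counts(sampled_requests, rate=SAMPLE_RATE):
    """Scale per-key counts from the sample back up to estimated totals."""
    counts = Counter(sampled_requests)
    return {key: n * rate for key, n in counts.items()}
```

The estimates are only as precise as the sample allows: a project that receives fewer than 1000 requests in the period may not appear in the sample at all.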
We have implemented secure login to the User Metrics API via SSL. We've
also introduced a new metric called pages_created, allowing us to
count the number of pages created by a specific editor.
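A metric of this kind can be sketched as below. This is not the User Metrics API implementation: the revision records and field names are made up for the example, and it assumes that a revision with no parent (`parent_id` of 0) is the one that created its page.

```python
def pages_created(revisions, user):
    """Count revisions by the given user that have no parent revision."""
    return sum(
        1
        for rev in revisions
        if rev["user"] == user and rev["parent_id"] == 0
    )


# Hypothetical sample data: two page creations and one ordinary edit by "Alice".
SAMPLE_REVISIONS = [
    {"rev_id": 101, "parent_id": 0, "user": "Alice"},
    {"rev_id": 102, "parent_id": 101, "user": "Alice"},
    {"rev_id": 103, "parent_id": 0, "user": "Bob"},
    {"rev_id": 104, "parent_id": 0, "user": "Alice"},
]
```

With the sample data above, `pages_created(SAMPLE_REVISIONS, "Alice")` counts revisions 101 and 104 and ignores the follow-up edit 102.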
We improved the accuracy of the udp2log monitoring and upgraded the
machines to Ubuntu Precise in order to make the system more robust.
*Analytics Visualization, Reporting &
We published our monthly report card <http://reportcard.wmflabs.org>. As
part of Wikimedia's ongoing mobile initiative, we also helped develop
analytics that would support ongoing delivery and planning of mobile
- We've started to analyze mobile site
device class, in order to determine how we will invest in building
applications and sites that support various device formats.
- We've also started to perform session analysis of mobile site visits,
in order to help us understand user behavior when using the mobile sites,
which will inform decisions about ongoing development efforts. At present,
this data is only for internal consumption by the Mobile team.
- A new overall mobile pageviews
now available, which has improved the accuracy of our reporting due to
changes in how the MobileFrontend extension requests a wiki article
- More information about how we're calculating mobile pageviews is
available in our
We also introduced new dashboards <http://ee-dashboard.wmflabs.org/> for
our Editor engagement team, that will help them monitor the usage of the
new *Notifications* system. Finally, we've added pageview
the Hungarian and Ukrainian Wikivoyages.
Engineering community team
*Bug management <https://www.mediawiki.org/wiki/Bug_management>*
A bugday <https://www.mediawiki.org/wiki/Bug_management/Triage/20130402> at
the beginning of April resulted in about 90 reports about Skin and page
rendering being looked at and commented on. On the technical side,
Wikimedia Bugzilla's "See Also" field now also supports adding GitHub
RequestTracker URLs <https://bugzilla.wikimedia.org/show_bug.cgi?id=45589>,
and the "Bugzilla Weekly Report"
wikitech-l <https://lists.wikimedia.org/mailman/listinfo/wikitech-l> now
includes a list of open issues with highest priority, plus more
fine-grained statistics for the number of open tickets. Andre
on Bugzilla administration <https://www.mediawiki.org/wiki/User:AKlapper_%28WMF%29/BugzillaAdminPolicy> and
access restrictions, and updated the
checking Wikimedia forums (Village Pumps etc.) as sources of feedback
on problems. Furthermore, he published an initial
a Greasemonkey script that provides common one-click stock answers for
Village Pumps where software issues might get reported first before being
transferred to Bugzilla.
*Mentorship programs <https://www.mediawiki.org/wiki/Mentorship_programs>*
Quim Gil <https://www.mediawiki.org/wiki/User:Qgil> supported the Google
Summer of Code <https://www.mediawiki.org/wiki/Summer_of_Code_2013> / FOSS
Outreach Program for
candidates and mentors. He coordinated co-mentorships with
Mozilla for the Bugzilla-MediaWiki
and with MathJax for VisualEditor math support. He organized a meetup
and other open source internship
also published a post on FLOSS
Last, he met with SocialCoding4Good <http://socialcoding4good.org/> to
(re)start <http://socialcoding4good.org/organizations/wikimedia> joint
April was a slow month in Technical communications due to Guillaume
2-week medical leave. Upon his return, Guillaume helped the engineering
team with their communication support needs (reviewing blog posts and
helping with on-wiki documentation) and set up a Google custom
glossaries (similar to Wikimedia
technical search <https://www.mediawiki.org/wiki/Wikimedia_technical_search>),
to make it easier to search a term across Wikimedia
*Volunteer coordination and
Quim Gil <https://www.mediawiki.org/wiki/User:Qgil> refactored the
contributors* proposal into the more gradual Project:New
on the feedback received. He supported QA and bug management events,
organized a tech talk for 3 tech projects receiving Wikimedia
and completed the survey about best times for
volunteering <http://www.doodle.com/minqnd6ngz9npfdv> (which got 33
answers). He spoke at the Bay
Area Linux User Group <http://balug.org> with Daniel Zahn, Rob Lanphier and
Brian Wolff, and requested a proposal from Bitergia
<http://bitergia.com> to automate the generation of Community
*The Kiwix project is funded and executed by Wikimedia
In April, we released <https://blog.wikimedia.org/2013/04/17/carry-the-entirety-of-wikipedia-in-your-pocket-with-kiwix-for-android/> for
the first time Kiwix
This version doesn't provide as many features as the desktop app, but it
works well with all ZIM files. Two Kiwix developers will attend Wikimania
and have started preparing <http://www.kiwix.org/wiki/Wikimania_2013> for a
a small hackathon, two presentations and a permanent booth.
*The Wikidata project is funded and executed by Wikimedia
The team hit a big milestone with the deployment of the first iteration of
phase 2 of Wikidata on all
had been enabled on 11 Wikipedias previously).
also enabled on Wikidata, making it possible to add additional
information to certain data. Wikipedians are now able to make use of the
data available on Wikidata in articles, allowing the data to be
collaboratively collected, curated and used by all Wikipedias. The team also
fixed a few issues to make it possible to use Wikidata with Internet
Explorer 8, and worked on the time datatype. Together with bot owners, they
massively improved the time it takes for Wikidata changes to show up in the
recent changes and watchlists on Wikipedia sites. The code and architecture
got an external professional review; the reviewers were quite happy with
the quality of the code base and gave useful tips for improvements.
Future
The engineering management team continues to update the
* page weekly, providing up-to-date information on the upcoming deployments
to Wikimedia sites, as well as the *engineering
*, listing ongoing and future Wikimedia engineering efforts.
Technical Communications Manager — Wikimedia Foundation
Please note: all replies sent to this mailing list will be immediately directed to Wikimedia-l, the public mailing list of the Wikimedia community. For more information about Wikimedia-l:
WikimediaAnnounce-l mailing list
WikimediaAnnounce-l at lists.wikimedia.org