Hi,
The report covering Wikimedia engineering activities in November 2012 is now available.
Wiki version: https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2012/November
Blog version: https://blog.wikimedia.org/2012/12/06/engineering-november-2012-report/
Like last month, we're also proposing a shorter and simpler version of this report for less technically-savvy readers:
https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2012/November/summary
Below is the full HTML text of the report, as previously requested.
------------------------------------------------------------------------
Major news in November include:
Personnel
Are you looking to work for Wikimedia? We have a lot of hiring coming
up, and we really love talking to active community members about these
roles.
Announcements
- Quim Gil joined the Engineering Community Team of the Platform
engineering group as "Technical Contributor Coordinator (IT
Communications Manager)" (announcement).
- Juliusz Gonera joined the Mobile team as Software Developer (announcement).
Technical Operations
Site Infrastructure
- Mark Bergsma made a breakthrough in resolving an old and elusive
instability issue in Varnish, which occurs when the servers are under
extreme load or experiencing hanging connections/packet load. The problem turned
out to be the slow epoll thread. When under load and once the pipe
buffer (64 KB) is full, the writing Varnish worker threads block, and
the server situation deteriorates rapidly. Mark fixed this issue by
moving the reading of the sessions earlier in the epoll event loop,
before the thread does anything else, thereby reducing the size of the
pipe buffer. With this enhancement, Mark is confident he could further
reduce the number of Varnish servers in our caching infrastructure.
- Asher Feldman is happy to report that the memcached instances on the
app servers in Tampa are no longer in use. This will give us back an
extra 2GB of RAM on many of the app servers (which only have 8 or 12GB
to begin with) which can go towards increasing PHP capacity. It also
improves the stability of the site by addressing some of the root causes
of multiple site outages, and brings with it multiple client
improvements including consistent hashing, igbinary serialization, and
better timeout handling. The total cache pool has increased from 140GB
to 1392GB, enough to currently meet full parser cache requirements from
RAM. Sessions are no longer stored in memcached at all but have been
migrated to redis, which will provide replication to the stand-by
datacenter. Performance is also quite a bit better, as can be seen by
comparing the max values of the 90th and 99th percentile times in the
attached graph.
- In recent months, we've seen a high hardware failure rate with our
batch of Swift servers. After discussion with our vendor, they agreed to
replace all those servers with newer hardware. All the required servers
to replace the Tampa Swift servers have just arrived. We are in the
process of migrating data from the old servers to the new ones, but it
will take time to drain traffic, remove the old hardware from production
and slowly ramp up the new machines. Ariel Glenn's current plan is to add 2 servers per week.
- After several months of testing and tweaking, Peter Youngmeister
finally rolled out the new Apache-on-Precise build on all our Tampa app
and imagescaler servers. This will be the same (and tested) image that
we'll be using on the Ashburn App servers in the coming month.
- Thanks to the efforts of Leslie Carr and Mark Bergsma, we are now a RIPE NCC member,
and with this membership, we may be eligible to receive a one-time
allocation of a /22 of IPv4 address space from the last /8 of IPv4
address space. This is particularly important to us since we have run
out of IPv4 addresses in Europe.
- The SSL cluster was upgraded to Ubuntu Precise, which provided a
newer version of nginx and openssl, closing out the CRIME vulnerability
and giving us the possibility of using HTTP 1.1 to the back-end. Testing
of HTTP 1.1 for proxying will occur in the future.
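The memcached item above lists consistent hashing among the client improvements. As a rough illustration only (this is not the actual client code; the server names, key names and replica count below are made up), a consistent-hash ring can be sketched as follows. The point of the technique is that adding a server moves only a fraction of the keys, and every moved key lands on the new server.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Toy consistent-hash ring; server names and replica count are hypothetical."""

    def __init__(self, servers, replicas=100):
        # Each server is placed on the ring at `replicas` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def server_for(self, key):
        # A key belongs to the first ring point after its hash, wrapping around.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"parsercache:{n}" for n in range(10000)]
before = ConsistentHashRing(["mc1", "mc2", "mc3", "mc4"])
after = ConsistentHashRing(["mc1", "mc2", "mc3", "mc4", "mc5"])

moved_keys = [k for k in keys if before.server_for(k) != after.server_for(k)]
moved_fraction = len(moved_keys) / len(keys)
print(f"{moved_fraction:.1%} of keys moved after adding one server")
```

With a plain `hash(key) % server_count` scheme, most keys would change servers when the pool grows, which would empty much of the cache at once.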
Fundraising
- The fundraising season started. Jeff Green and Leslie Carr rolled
out the new Ashburn Fundraising server cluster and it is currently
handling all payments. Leslie applied and tested firewall rules for the
new cluster. The Operations and Fundraising tech teams made many bug
fixes and small improvements to configuration management, monitoring,
logging and cluster administration. Jeff
built out the second payments messaging box (ActiveMQ) as a hot standby.
A new wiki was deployed for the Fundraising email unsubscribe page, to
segregate it from sensitive services (payments, CiviCRM). Specifications
for new payments bastion hosts were started.
Data Dumps
- Media bundles are back in business at your.org now that the network
issues have been fixed. Work has started on
upgrading the OS on the servers that produce the dumps, rebuilding the
necessary packages and testing. The 'add/changes' experimental dumps
have been running stably long enough that we've made them available on
the gluster public data volume accessible to all Labs projects.
Wikimedia Labs
- Andrew Bogott continues to work on some long-term OpenStack issues.
There's a new project, Moniker, which should (eventually) allow us to
properly integrate the Labs cloud with our DNS back-end and provide
better stability and a bit more user control. He also continues to do
more basic OpenStack work, which will eventually trickle into Labs.
- Andrew has also been fiddling quite a bit with the usability of
OpenStackManager, which is the GUI for labsconsole. The interface is now
marginally easier to use and understand, and improvements are ongoing.
Others
- Chris Johnson has relocated from Tampa to work in our Ashburn
datacenter. Steven Bernardin is now the main Tampa data center engineer.
Editor retention: Editing tools
VisualEditor
In November, the team worked primarily on finalizing the code
re-engineering of VisualEditor so that it is more modular and easier to
extend, and on the integration ahead of deploying it for wider testing
in December. The early version of the VisualEditor on mediawiki.org was
updated twice (1.21-wmf4 and -wmf5), fixing a number of bugs, adding
missing wikitext compatibility, and making widespread improvements to
much of the user interface code so that it will be easier to change in
the future.
Parsoid
In preparation for the upcoming deployment on the English Wikipedia, the
Parsoid team concentrated on the preservation of existing content.
Automated round-trip testing on 100,000 randomly chosen pages from the
English Wikipedia using distributed test runners helped to identify many
issues, which were fixed and often resulted in new minimal test cases
being added to the parser test suite. Currently, 79.4% of test articles
(up from about 65% last month) round-trip without any differences at
all, an additional 18% round-trip with only minor differences
(whitespace, quote style, etc.), and the remaining 2.6% of pages have
differences that still need fixing (down from about 15% last month).
Selective serialization will further avoid dirty diffs in unmodified
parts of a page by using the original wikitext for those. This will help
further fix the roughly 20% of pages that had any kind of difference in
wikitext. The implementation of this algorithm is currently being
finalized.
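To make the idea of selective serialization more concrete, here is a minimal sketch. It is not Parsoid's implementation: the `Node` class, the two node kinds and the normalizing rules are assumptions made for illustration. A full serializer always emits one canonical spelling, so it can produce dirty diffs; a selective serializer reuses the original wikitext for every node that was not modified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    kind: str               # "heading" or "paragraph" (hypothetical node kinds)
    text: str               # parsed content
    src: Optional[str]      # original wikitext, None for newly created nodes
    modified: bool = False


def serialize_node(node):
    """Normalizing serializer: always emits one canonical spelling."""
    if node.kind == "heading":
        return f"== {node.text} =="
    return node.text


def serialize_full(nodes):
    return "\n".join(serialize_node(n) for n in nodes)


def serialize_selective(nodes):
    # Reuse the original wikitext wherever the node was left untouched.
    return "\n".join(
        n.src if n.src is not None and not n.modified else serialize_node(n)
        for n in nodes
    )


original = "==History==\nFounded in  2001."
nodes = [
    Node("heading", "History", "==History=="),
    Node("paragraph", "Founded in 2001.", "Founded in  2001."),
]

full_roundtrip = serialize_full(nodes)       # spacing is normalized: dirty diff
untouched = serialize_selective(nodes)       # identical to the original

# Simulate an edit to the paragraph only.
nodes[1].text = "Founded in 2001 by volunteers."
nodes[1].modified = True
edited = serialize_selective(nodes)          # heading keeps its original spelling
```

In this sketch, only the edited paragraph changes in the resulting wikitext; the heading keeps its original `==History==` spelling even though the full serializer would have rewritten it.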
Editor engagement
Article feedback
This month, we continued to develop final features for Article Feedback,
and researched how people are using this tool on the English Wikipedia.
With the help of community members, we designed new features to reduce
the editor workload, including improved moderation tools and a more
prominent feedback link. These features will be developed next month,
once we've completed code refactoring to improve database performance.
We also analyzed new research data to track how moderators use the
feedback page, and to measure how many readers who post feedback become
editors or registered users. Next month, we will invite Wikipedians to
evaluate the usefulness of feedback posts and the effectiveness of our
new moderation tools. Once these tasks are done, we plan to release
Article Feedback v5 to 100% of the English Wikipedia in early 2013. For
more information about this tool, check our project overview.
Page Curation
Page Curation is now in 'maintenance mode', following its release on the
English Wikipedia in September 2012. We have been tracking the impact of
this tool with a metrics dashboard, which confirms that it is being used
actively, with over 27,000 pages reviewed since launch. To learn more,
visit our introduction page, watch this video tour or read this
tutorial. If you are an experienced editor, try out the final version on
the English Wikipedia.
MicroDesign
The Agora extension moves ever closer to completion, with help from
Munaf Assaf, Trevor Parscal, Rob Moen and Vibha Bamba. Several templates
on the English-language Wikipedia have been redesigned to reduce
interface clutter, with some already implemented.
Editor engagement experiments
In November, the Editor Engagement Experiments team (E3) deployed the
third and final A/B test of the new account creation page, including
client-side validation. Results from basic data analysis of all three
tests were published on Meta, and the project will now move to the
productization stage. Extension:PostEdit was put in maintenance mode
after being deployed to a further seven Wikipedias, including French and
Portuguese. On the analytics side, E3 transitioned permanently to
Extension:EventLogging for data collection purposes, and collaborated
with the mobile team to track activity on Wikipedia's mobile beta. Last
but not least, the team also deployed a small design improvement to the
personal tools menu in MediaWiki core.
Multimedia
UploadWizard
The work of Ankur Anand (a.k.a. drecodream) on Flickr integration, done
during GSoC, has now been merged, and Wikimedia engineers are working
towards its deployment in the near future. Specifically, several bugs
related to Internet Explorer were fixed. Once all the bug fixes are
deployed, the feature will be turned on for Commons (hopefully in early
December). Initially it will only be available to administrators.
Architecture & Platform support
Notifications
This month, we designed and started building key features of the
Notifications project (code-named 'Echo'), towards a first experimental
deployment in early 2013. Fabrice Florin wrote detailed feature
requirements for our first release, and Vibha Bamba designed the first
components of the user experience. Ryan Kaldari and Benny Situ developed
the main features of this application, including the notifications
flyout, the all-notifications archive, as well as email notifications
and preferences. To test our work in progress, visit our first prototype
(create an account and post on your talk page from a separate account).
New employee Luke Welling is also starting work on an HTML email module
for this project. For more information, visit our project hub, or check
our overview slides.
Messaging
The official start of Flow will follow Echo development. An initial team
will be forming next month to explore solutions here.
Support
2012 Wikimedia fundraiser
November has been a busy month for Fundraising, as the team helped to
kick off the 2012 annual fundraiser on November 26th, with heavy testing
before then. So far the 2012 fundraiser has been a resounding success,
raising over $12M in the 5 full days and limited testing days since
November 15th. For current information, see the live stats. Shortly
before the full launch, it was announced that the annual fundraiser
would be split into an English-language fundraiser in Australia, Canada,
Great Britain, the United States and New Zealand during the traditional
November/December period, with other languages and all countries in
April. For more details see the announcement on wikimedia-l.
- The Mobile team (Jon Robson, Juliusz Gonera, Arthur Richards and Max
Semenik) deployed several features to our beta and production mobile
web infrastructure this month. To beta, we deployed experimental edit
functionality, reformatted tables, random article support, simpler
layout for cleanup templates, and watchlists. For production, we added
log-in support.
GeoData Storage & API
We plan to start deploying Solr-based GeoData in early December.
Wikipedia Zero
This month we've worked with volunteer developers at the Bangalore
DevCamp on our SMS feature, and are preparing for the launch of
additional partners in the next few weeks.
J2ME App
We approved the initial J2ME app and are exploring next steps for
deployment and creating additional versions to support a larger base of
handsets.
Wikipedia over SMS & USSD
This month, we've worked with volunteer developers at the Bangalore
DevCamp to start supporting an important variant of our upcoming text
messaging support. We currently have the SMS/USSD combination working
and awaiting launch, and we are now working on the SMS-only version for
carriers that cannot support USSD.
Mobile QA
We have created several automated mobile browser-based tests that are
now running in our CloudBees/Sauce Labs continuous integration
configuration. Both Platform engineering and Mobile QA are leveraging
Watir Webdriver and Cucumber. We also continue to add to our Mobile
Browser Regression Tests.
Offline
Kiwix
- A new project, Phpzim, was started with the support of Wikimedia CH.
This project will create a PHP binding for the zimlib, allowing any PHP
developer to easily create and read ZIM files. This is the first step of
a bigger project to allow quick ZIM file generation in MediaWiki (and
also other PHP CMSes). Work on ZIM Autobuild continues and Kiwix ZIM
throughput is increasing slowly (4 files in November). A short testing
stage of Kiwix 0.9rc2 will finally start in early December, followed by
the release.
MediaWiki Core
MediaWiki 1.21
Wikimedia engineers deployed 1.21wmf3 and 1.21wmf4 to all Wikimedia
sites, and began deploying 1.21wmf5 (with a momentary breakage). These
updates brought many significant improvements, including one-click
(AJAX) patrolling, for both new page and diff patrol, and a Template
Sandbox, which lets users preview changes to a template by previewing an
example page where it's used.
Git conversion
We're still very much looking forward to deploying the latest version of
Gerrit (see last month's update), but unfortunately remain blocked on a
complicated LDAP propagation issue. Chad Horohoe is working with the
Gerrit developers on finalizing the fix for this issue. Chad also
attended the Gerrit Developer Summit in November, and both Chad and Rob
Lanphier attended the Gerrit Users Summit (notes).
TimedMediaHandler
We have deployed TimedMediaHandler to all wikis. Jan Gerber and Michael
Dale continue to fix bugs. Jan Gerber and Aaron Schulz are working on an
improved file upload mechanism in UploadWizard to make larger file
uploads more practical.
Wikidata deployment
Chris Steipp and Chad Horohoe have reviewed the Wikibase set of
extensions, as well as DataValues. Deployment of these extensions is
planned for December.
Wikivoyage migration
Wikivoyage was launched into public beta on November 10. The site is
running on Wikimedia servers, and accounts and text content were
migrated. Images from the old site have not been automatically imported,
because some contain non-free content; they need to be added to each
language wiki in accordance with the Exemption Doctrine Policy for that
site. Public announcement and promotion of the site are delayed while
the community is working on the image transfer.
SwiftMedia
Thumbnails (and math/timeline files) are now written to nas1 and Swift.
More improvements have been made to FileBackend to avoid extra HEAD
requests for 404 errors. WebM thumbnails use temporary Swift URLs to
support range requests. Feature requests and bug reports are filed
against Ceph as MediaWiki takes advantage of other Swift features.
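For readers unfamiliar with temporary Swift URLs, the sketch below shows how such a URL is typically signed, following the OpenStack Swift TempURL scheme (an HMAC-SHA1 over the method, expiry time and object path). This is an illustration, not the MediaWiki FileBackend code; the account, container, object name and secret key are invented.

```python
import hashlib
import hmac


def temp_url_signature(method, expires, path, key):
    """HMAC-SHA1 signature over 'METHOD\\nEXPIRES\\nPATH', as hex."""
    body = f"{method}\n{expires}\n{path}"
    return hmac.new(key.encode("utf-8"), body.encode("utf-8"), hashlib.sha1).hexdigest()


def temp_url(path, key, expires, method="GET"):
    sig = temp_url_signature(method, expires, path, key)
    return f"{path}?temp_url_sig={sig}&temp_url_expires={expires}"


def signature_matches(method, expires, path, key, candidate_sig):
    # Constant-time comparison, as a server validating the URL would do.
    expected = temp_url_signature(method, expires, path, key)
    return hmac.compare_digest(expected, candidate_sig)


# Hypothetical object path, secret key and expiry timestamp.
path = "/v1/AUTH_example/thumbs/Example.webm"
key = "not-a-real-secret"
expires = 1354838400

url = temp_url(path, key, expires)
sig = temp_url_signature("GET", expires, path, key)
print(url)
```

Because the signature covers the path and the expiry time, a client can be handed the URL and fetch the object directly (including with range requests) until it expires, without holding any Swift credentials.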
Lua scripting
Brad Jorsch and Chad Horohoe have joined Tim Starling on this project.
Brad has built a template sandbox which will help in debugging both Lua
scripts and regular templates.
Chad is working on a shared repository for scripts, and Tim has been
extending the API. His latest work has been around adding multilingual
APIs for handling things like plurals within Lua. We're currently
seeking a volunteer product manager to help out with the roll-out of this.
Site performance
Various improvements to the job queue have been made to avoid CPU time
wasted on duplicate jobs and redundant page cache purges. Changes have
also been made to make it possible to edit heavily used templates
without timeouts.
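The general idea of skipping duplicate jobs can be sketched as follows. This is not MediaWiki's job queue code: the class, the signature scheme and the job names and parameters are assumptions chosen for illustration. Each job gets a signature derived from its type and parameters, and a job is dropped if an identical one is still pending.

```python
import hashlib
import json
from collections import deque


class DedupJobQueue:
    """Toy FIFO job queue that ignores jobs identical to a pending one."""

    def __init__(self):
        self._jobs = deque()
        self._pending = set()

    @staticmethod
    def _signature(job_type, params):
        payload = json.dumps([job_type, params], sort_keys=True)
        return hashlib.sha1(payload.encode("utf-8")).hexdigest()

    def push(self, job_type, params):
        """Queue a job; return False if an identical job is already pending."""
        sig = self._signature(job_type, params)
        if sig in self._pending:
            return False
        self._pending.add(sig)
        self._jobs.append((sig, job_type, params))
        return True

    def pop(self):
        sig, job_type, params = self._jobs.popleft()
        self._pending.discard(sig)
        return job_type, params

    def __len__(self):
        return len(self._jobs)


queue = DedupJobQueue()
results = [
    queue.push("htmlCacheUpdate", {"title": "Template:Infobox"}),
    queue.push("htmlCacheUpdate", {"title": "Template:Infobox"}),  # duplicate
    queue.push("refreshLinks", {"title": "Template:Infobox"}),
]
print(results, len(queue))
```

Once a job has been popped for execution, an identical job can be queued again, so later edits to the same template are still processed.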
Incremental architectural improvements
The support needed for more complex data structures (lists, sets) in
memcached (with atomic updates) is awaiting more review and testing. The
coding is essentially done (changeset).
Admin tools development
The team continued work on an interface for Stewards to mass-lock user
accounts, and on making it possible to use the AbuseFilter extension
across all wikis at once.
MediaWiki 1.20
Mark Hershberger published the MediaWiki 1.20 stable tarball on
November 7th. Chris Steipp published a security update (1.20.1) on
November 29th.
REST proposal
Wikia wants to attract motivated app developers and companies using
Wikia's products to use the API. They also want to make the API more
standards-compliant (a RESTful interface, using HTTP verbs), but that's a
high-level goal. Mobile-related work is first, but this redesign would
improve the whole platform, including the enterprise. The Wikimedia
Foundation and Wikia want to work together on this; the Wikimedia
Foundation also wants to avoid boxing itself into special-purpose,
specific apps. Wikia developer Federico Lucignano is currently working
on a Request for comments on the REST proposal.
Security auditing and response
The team continued to respond to several reported vulnerabilities, and
released new versions of all supported MediaWiki branches (1.20.1,
1.19.3, 1.18.6) to address vulnerabilities in core. Significant security
reviews continued for Wikidata and Wikivoyage extensions.
Quality assurance
QA and testing
The team contributed to the community QA draft strategy and presented
the Acceptance Test-Driven Development concept to Wikimedia
Product/Project managers. Regression testing of software deployments is
ongoing.
Beta cluster
We deployed ArticleFeedbackv5 to the beta cluster, which is the primary
host for AFTv5 testing, including browser test automation. New Page
Patrol is being maintained there as well. We are still working on issues
of ongoing maintenance, and this cluster played a role in catching a
defect that recently escaped to production.
Continuous integration
A continuous integration summit occurred during the Netherlands
Hackathon. integration-jenkins2 is now fully operational with Jenkins /
Gerrit and a Zuul installation. Antoine Musso has generated the new
MediaWiki core Jenkins jobs. Zuul has been deployed in production
successfully. It triggers a new set of Jenkins jobs that will eventually
replace the old MediaWiki-.* ones. The new Jenkins jobs for MediaWiki
core (triggered by Zuul) have been tested in production and are
successful. The new workflow has been documented.
Browser testing
In November, the QA team created a backlog of tests to be automated,
ported existing tests from RSpec to Cucumber, and is now working on
browser testing architecture, creating basic new tests (see the
qa/browsertests repository in Gerrit), and refactoring tests for
cleanliness. Chris
McMahon began discussing automated browser tests with Wikimedia tech
managers to get developers writing those tests as they develop
extensions deployed on Wikimedia sites; public announcement will be
coming very soon, when the existing example tests are in final or
near-final form. Noisy tests failing for known reasons have been removed
from the suite, which is now completely green (that is, passing); the
team will soon be writing and adding more tests. Browser tests in
November identified a serious regression in UploadWizard running on
test2 and prevented its release to production.
Analytics
Kraken (Analytics Cluster)
The Analytics team has received all of the hardware purchased back in
the Spring. The Hadoop nodes have been moved to their final homes. Evan
Rosen from the Global Development team is helping us test this setup
with real use cases for his team. Kafka has been puppetized and
installed. It is currently consuming all of the Banner Impression- and
Wikipedia Zero-related logs. As a proof of concept, the Zero logs are
being fed daily into Hadoop for analysis by the Global Development team.
Debs for Storm have been built. Storm has been puppetized and is
running on several of the Cisco nodes.
Limn
David Schoonover and Dan Andreescu are working on a major rework of
Limn, using Knockout.js and d3.js. The team hopes to have this ready in
time to present the metrics at the December 6 metrics meeting at the
Wikimedia Foundation.
Bug management
Andre Klapper improved, cleaned up and updated large parts of the bug
management and Bugzilla documentation. This includes the beginnings of a
triage guide. He also published his Greasemonkey scripts in a Git
repository, and went through obsolete extensions and updated their
Bugzilla descriptions. Andre started analyzing how Wikimedia engineering
teams use Bugzilla and their related workflows. He also investigated a
potential upgrade of Bugzilla to version 4.2 by doing some basic
testing. Furthermore, a wikitech-l discussion on standardizing the
meaning of "highest priority" in Bugzilla resulted in the creation of a
new "Immediate" priority status.
Mentorship programs
The first phase of the Outreach Program for Women (OPW) has been
completed: more than 15 firm candidates submitted applications, which
were passed on to the 8 available mentors. The Wikimedia Foundation is
funding 4 full-time internship positions between January and March 2013.
More positions may become available, depending on external sponsors of
the program. The selected candidates will be announced on December 11.
The OPW is organized by the GNOME Foundation and 11 FLOSS projects are
taking part.
Technical communications
Management reviewed options to determine the direction this activity
would follow in future months. In the meantime, Guillaume Paumier
cleaned up and expanded the Wikimedia glossary with terms related to
Wikimedia technology and engineering, and volunteers & engineers came to
expand it further. He also followed up on the consultation process
initiated in October to identify how to improve dialogue between
technical communities and user communities. He's now in the process of
widening this discussion to more communities. Sumana Harihareswara sent
a call for volunteers to lead or advise Wikimedia engineering staff on
select activities, and followed up on the offers.
Volunteer coordination and outreach
Sumana Harihareswara started sharing new volunteer coordination tasks
with Quim Gil, the new technical contributor coordinator who started
working with the Wikimedia Foundation in November. They continued to
follow up on contacts (such as those gained at October's Grace Hopper
Celebration of Women in Computing), recruit new contributors to the
Wikimedia tech community, and mentor newer contributors. The weekly
online tech chats continued on Thursdays. Sumana and others continued to
grant developer access and work on Gerrit project ownership requests.
Language tools
In November 2012, the Language Engineering team travelled to India for
10 days together with the Mobile team for 6 events in total: the two-day
Language Summit at the Red Hat offices in Pune, a Language Engineering
Community Meetup in Pune, the three-day DevCamp 2012 Bangalore, a
Language Engineering Community Meetup in Bangalore, a presentation by
Erik Moeller on the current state of tech in the Wikimedia Foundation,
and Coffee with Arky, a meetup of Mozilla users. The rest of the month,
development time was spent on completing the Universal Language
Selector, and getting it to a state where it could be put in maintenance
mode for a few months. In April 2013, phase two of the ULS will start;
it will consist of adding content language selection. The Language
Engineering designers completed the design for the Translation UX
project, for which development commenced at the end of November and will
continue for 8 two-week sprints, until mid-March 2013.
Milkshake
The first phase of the Universal Language Selector (ULS) was completed
in November. The jQuery modules jQuery.ULS, jQuery.IME, jQuery Webfonts
and jQuery i18n have had their first stable release. The Universal
Language Selector MediaWiki extension is now being used on Wikidata.
During the DevCamp in Bangalore, experiments were done with ULS on
Android, a Chrome extension was created to make jQuery.IME usable in the
Chrome web browser, and an extension for Firefox implementing the input
methods is underway. The first contributions by non-Wikimedia developers
have been made, which indicates that the jQuery extensions are getting
some attention. The Wikimedia Language Engineering team will now put the
modules and MediaWiki extension in maintenance mode until April 2013.
The Wikidata project is funded and executed by Wikimedia Deutschland.
- The repository side of Wikidata has been launched on http://www.wikidata.org.
It contains the results of phase 1 (language links) and has already
attracted a community to maintain the wiki. Meanwhile, the Wikidata team
has continued work on Phase 2 of Wikidata (Infoboxes) to add statements
with values to the items in the Wikidata repository. The team improved
the propagation of changes from the repository to the client and the
messaging in Recent Changes. There is a constant exchange with Wikimedia
Foundation engineers about the upcoming deployment cycle. Feedback and
questions are welcome on the mailing list and on meta.
Future
- The engineering management team continues to update the Software deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.
--
Guillaume Paumier
Technical Communications Manager — Wikimedia Foundation
https://donate.wikimedia.org