The report covering Wikimedia engineering activities in December 2013 is now available.
We're also proposing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge:
Below is the HTML text of the report.
As always, feedback is appreciated on the usefulness of the report and its summary, and on how to improve them.
Editor retention: Editing tools
VisualEditor
In December, the VisualEditor team continued to improve the stability and
performance of the system, and to add new features.
The deployed version of the code was updated three times (1.23-wmf6,
1.23-wmf7 and 1.23-wmf8).
Most of the team’s focus was on major new features and fixing bugs.
There is now basic support for rich copy-and-paste from external sources
into VisualEditor, and a basic tool to insert characters not available
on users’ keyboards. Work also continued on a dialog for quickly adding
references based on citation templates, and on some major infrastructure changes,
splitting out the core of VisualEditor from the MediaWiki-specific
items like the transclusion editor.
Parsoid
In December, the relentless Parsoid team continued squashing bugs and incompatibilities; see our deployments page
for details. During the node 0.10 migration, we ran into some issues
caused by changed garbage collector behavior, and rolled back to 0.8. We
spent some time investigating and fixing this; initial testing on our
round-trip testing setup indicates that this is now fixed.
Our testing infrastructure is now exercising the entire stack
including the web server, which will help to make sure that we also
catch issues in HTTP libraries before deployment.
We wrote several RFCs about embracing a service architecture, PHP bindings for services, a general-purpose storage service based on our Rashomon revision store, and a public content API based on this.
Part of the team worked on a new PDF rendering infrastructure using
Parsoid HTML, node and PhantomJS. Part of the team has also been
mentoring two Outreach Program for Women (OPW) interns.
Core Features
Flow
In December, we deployed Flow to a few selected pages in production
(Talk:Flow and Talk:Sandbox on mediawiki.org) and collected feedback
about the features and design to date from the community. The results of
the feedback period are summarized at Flow/Research#Experienced Users.
Throughout the feedback period, we worked on implementing design
changes – such as a more compact view of the board and different
affordances for topic and post actions, as well as different
visualizations of history information – based on the comments of users
testing the software. We also began a straw poll about launching Flow as
a beta trial in the discussion spaces of WikiProject Breakfast,
WikiProject Hampshire, and WikiProject Video Games on English Wikipedia.
Based on the outcome of these polls, we hope to deploy Flow to those
pages in January.
Growth
Growth
In December, the Growth team spent time working on product development and research for upcoming
Wikipedia article creation improvements. First and foremost, the team fulfilled a request from the English Wikipedia community to launch the new
Draft namespace there. Pau Giner and others on the team simultaneously began design work on future improvements to drafts functionality (see
blog post), including recruiting for usability testing sessions.
Support
Wikipedia Education Program
In December, we improved and fixed issues with the current Education
Program extension, and continued preliminary work towards a new version
of the software. We added a message on Special:Contributions about
users’ participation in courses, fixed a bug involving course undeletion
and tweaked related styling, addressed a breaking change in core,
improved i18n (in collaboration with Language Engineering) and began
work on notifications for course-related events. We also fleshed out
more ideas about the new version and possible synergies with other
existing and proposed functionality, and reached out to other teams for
input on this.
MediaWiki Core
Search
We’ve continued our aggressive rollout of Cirrus as a Beta Feature. You can
now search 52% of pages, including Commons and Wikidata, via
CirrusSearch. We’ve fallen behind somewhat on our goal to make Cirrus the
primary search engine. Right now, we only handle about 1.5% of search
traffic. While we will be switching more wikis over to Cirrus as the
primary search back-end in January, the theme of the month really is
adding Cirrus as a Beta Feature to more wikis, including the English
Wikipedia. We’re not sure how many wikis we’ll be able to add before we
consider ourselves out of hardware space. We’re planning on 50% more
servers in February so we’ll likely be able to finish adding wikis then.
Site performance and architecture
The team wrapped up the Puppetization of Graphite and its migration
to Ashburn, and configured Travis CI to run MediaWiki’s test suite under
HHVM on each commit to core. They also added an initial HHVM role for
MediaWiki-Vagrant and re-wrote MediaWiki’s profiling data aggregator to
be more performant. Prior to the rewrite, it was constantly saturated
and would drop data; the rewrite reduced average CPU utilization by more
than two thirds.
Auth systems
The team implemented performance fixes for CentralAuth to reduce the number of calls by anonymous users.
Wikimania Scholarships app
All critical functionality and several stretch goals were reached for the
“final” version, which deployed to production on 2013-12-19. Chad
Horohoe stepped in in the final days leading up to launch and helped
save the i18n features that were scheduled to be scrapped due to time
constraints. Siebrand Mazeland and the wonderful volunteers at
translatewiki.net are providing translations at a rapid pace. Bryan
Davis also put in some extra hours to clean up the look and feel of the
application with a new Bootstrap-based theme. The application period for
Wikimania 2014 will open at 2014-01-06T00:00:00Z and continue until
2014-02-17T23:59:59Z. The team will continue to monitor and support the
product through the application period, and the subsequent review and
approval process of the Scholarship Committee.
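The application window above is given as two UTC timestamps. As a minimal sketch (not the Scholarships app's actual code), this is one way to check whether a given moment falls inside that window. The helper names `parse_utc` and `is_application_open` are hypothetical, and treating both bounds as inclusive is an assumption.

```python
from datetime import datetime, timezone

# Application window as stated in the report (UTC).
# Assumption: both bounds are inclusive.
WINDOW_OPENS = datetime(2014, 1, 6, 0, 0, 0, tzinfo=timezone.utc)
WINDOW_CLOSES = datetime(2014, 2, 17, 23, 59, 59, tzinfo=timezone.utc)


def parse_utc(stamp: str) -> datetime:
    """Parse a timestamp like '2014-01-06T00:00:00Z' into an aware UTC datetime."""
    naive = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return naive.replace(tzinfo=timezone.utc)


def is_application_open(now: datetime) -> bool:
    """Return True if `now` falls inside the application period."""
    return WINDOW_OPENS <= now <= WINDOW_CLOSES
```

Using timezone-aware datetimes throughout avoids accidentally comparing the UTC bounds against a naive local time.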
Security auditing and response
We continued to respond to reported security issues, and completed
security reviews of Flow, the Wikimania Scholarships app, and the GLAM
Wiki Toolset.
Admin tools development
The team made several small improvements, including log entries, the
addition of global groups to Special:CentralAuth, and the addition of
global edit count to Special:MultiLock.
Release & QA
In December, MediaWiki 1.22, the latest and greatest version, was released.
This was led by Mark Hershberger and Markus Glaser, working as the
MediaWiki release team, along with help from the Wikimedia Foundation
Release and QA team (specifically Greg Grossmeier and Antoine Musso). Of
course, this was only possible because of the great work by all of the
MediaWiki developers.
The QA team, along with the Multimedia team, is working on API-level
tests starting with UploadWizard. This is close to being done. Another
API-level testing effort covers Parsoid, with help from the VE and CI
(Antoine) teams.
You can take a look at the first draft of the updated Development and Deployment process flow chart.
Quality assurance
Quality Assurance
In December, the Quality Assurance team worked particularly closely with
the Mobile team, both supporting automated testing and helping fix
issues with Beta labs and with Jenkins. We continued to work with the
teams from Language engineering, VisualEditor, Flow, Multimedia,
Wikidata, and Search, and participated in the Google Code-In
event. We are in the process of creating new support not only for
automated browser testing, but also for API testing, test data creation,
and monitoring of both test and production environments.
Beta cluster
Parsoid on the Beta cluster is now based on the mediawiki/services/parsoid
repository and is properly self-updating via a Jenkins job whenever a change
is merged in that repository.
Beta labs played a key role in finding and fixing some significant
errors that, in combination, were causing users to see 503 errors in
production, particularly on large pages and for Mobile users. For one
thing, some timeouts on the Varnish caches had been set too low. We had
increased those for the text Varnish servers but had not done so for
Mobile Varnish servers. A tricky bug was also causing parts of large
pages to be parsed multiple times. Last, the browser tests that incurred
the 503 errors should have been capable of ignoring them. Thanks to
Beta labs, the Varnish server timeouts are now correct, the
multiple-parsing bug is addressed and the browser tests for
MobileFrontend are running correctly.
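One common way for a test to tolerate transient 503 responses is to retry the request a few times before failing. The sketch below illustrates that general idea only; it is not the fix applied to the MobileFrontend browser tests. The helper `fetch_with_retry` and its parameters are hypothetical, and the request is passed in as a plain callable so the example does not depend on any HTTP library.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, body)


def fetch_with_retry(
    fetch: Callable[[], Response],
    attempts: int = 3,
    delay: float = 0.0,
) -> Response:
    """Call `fetch` until it returns a non-503 status or attempts run out.

    Returns the last response received, which may still be a 503.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    status, body = fetch()
    for _ in range(attempts - 1):
        if status != 503:
            break
        time.sleep(delay)  # optional pause before trying again
        status, body = fetch()
    return status, body
```

A test would then assert on the final response, so that only a 503 that persists across all attempts counts as a failure.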
Browser testing
Besides ongoing regression testing of Wikipedia features in cross-browser
tests, in December we took the first steps toward new capabilities such as
testing geolocation for Mobile tests, testing and monitoring upload
ability in production, adding the ability to create test data via the
API, running tests in PhantomJS on the WMF Jenkins server, and
monitoring the Beta labs test environment for fatal errors.
Bug management
Quim Gil and Andre Klapper continued to run and coordinate
Google Code-In for Wikimedia. Andre’s draft for a Bugzilla etiquette
received lively feedback and discussion. On the technical side, Daniel
Zahn prepared the migration of bugzilla.wikimedia.org to WMF’s new
data center by turning the existing rudimentary Bugzilla puppet code
into a puppet module and automatically generating documentation on
doc.wikimedia.org. As part of this preparation, Daniel and Andre also
eliminated nearly all Perl CPAN modules (in Bugzilla’s /lib subfolder)
on the new server by using default distribution packages instead.
Furthermore, Andre worked on a preliminary patch to display some common
queries on the Bugzilla front page.
Project management tools review
Andre Klapper and Guillaume Paumier kicked off an evaluation of Wikimedia’s
project management tools. Guillaume prepared a consultation page with topics
for stakeholders and improved it together with Andre. It will initially be
sent to the teampractices mailing list and individual stakeholders. To
facilitate getting input, talking to individual stakeholders via Hangouts
and holding an IRC discussion are also being considered.
Mentorship programs
Wikimedia’s first participation in the Google Code-In
program required a lot of dedication from the ECT members and from about a
dozen mentors and other contributors who helped create and review
tasks. Students completed about 200 tasks. The momentum from GCI and the
lessons learned will help us organize a better gateway for new
contributors, which was a main reason for us to join this program. We
also believe that the experience acquired will help us make future
editions as successful with less work.
Round 7 of the FOSS Outreach Program for Women started, and all projects are on track so far:
We joined Facebook Open Academy
almost at the last minute thanks to a reminder from developer Tyler
Romeo. Six projects were accepted, which will be developed by teams of
university students during the first half of 2014:
Technical communications
Volunteer coordination and outreach
Multimedia
Multimedia
In December, Mark Holmquist and Gergő Tisza updated the beta version of the Media Viewer, based on new designs by Pau Giner. This new version now features next and previous arrows, as well as faster image load and an enhanced metadata panel, as shown on this demo page.
Fabrice Florin managed product development, spearheaded the team’s Multimedia Quarterly Review meeting, hosted more roundtable discussions and presented a Multimedia Vision 2016 to get more community feedback about our goals, with help from volunteer Aaron Arcos.
Bryan Davis, Aaron Schulz and other team members helped Dan Entous and David Haskiya release a first version of the GLAM Toolset
for batch uploads by museum curators. We also started work on fixing
bugs for the Upload Wizard, which we’ll aim to improve as our primary
focus this quarter.
Last but not least, we are delighted to welcome Gilles Dubuc, who is joining our multimedia team as senior software engineer. To discuss these features and keep up with our work, we invite you to join the multimedia mailing list.
Kraken
In late December, the Analytics team partnered with Operations to enable
log delivery over Kafka (a distributed message bus). All logs from the
edge caches serving mobile traffic are now delivered via Kafka into a
data warehouse on our Hadoop infrastructure. We’re seeing 3K to 4K messages
per second, with a maximum of 8K/sec over Christmas. This is a
significant step towards our goals of building an infrastructure that
can be used for analysis of all of our page views.
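To put the per-second rates above in perspective, here is a rough back-of-the-envelope conversion to messages per day. It assumes that "K" means 1,000 and that a rate is sustained for a full 24 hours, which the report does not state. The Christmas maximum in particular was a peak, so its daily figure is only an upper bound. The function name `messages_per_day` is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def messages_per_day(rate_per_second: int) -> int:
    """Daily message count if `rate_per_second` were sustained all day."""
    return rate_per_second * SECONDS_PER_DAY


# Rates taken from the report, assuming K = 1,000.
typical_low = messages_per_day(3_000)   # 259,200,000
typical_high = messages_per_day(4_000)  # 345,600,000
peak_bound = messages_per_day(8_000)    # 691,200,000 (upper bound only)
```

Under these assumptions, mobile edge-cache logging alone amounts to roughly 260 to 350 million messages per day.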
Wikimetrics
The team added a small but important feature to Wikimetrics in December:
the ability to authenticate against MediaWiki OAuth. This allows users
to sign up for Wikimetrics without relying on a third party for
authentication and is an early adoption of MediaWiki OAuth.
Data Quality
The team continues to spend a large amount of time on data quality. The
primary effort in December was in isolating and fixing an error in
WikiStats that inflated page views from July to December by a
significant amount. The error was patched in early December and the
statistics were recalculated. There were also issues with Wikipedia Zero
traffic and an outage caused by a single point of failure in the legacy
infrastructure.
Research and Data
This month, we kicked off a series of monthly research showcases as
an opportunity for the team to share what we’re learning about Wikimedia
editors and projects, and new features and programs the Foundation is
rolling out. Aaron Halfaker presented research on anonymous editors.
The first showcase was targeted at an internal audience but we’re
considering making future showcases open to anyone via a public stream.
We analyzed the cause and impact of major over-reporting of page views in the last
months of 2013. We filtered bogus traffic from the data, and published updated reports.
We also continued work on metrics standardization and presented the rationale for this project and the results of the initial round of analysis we conducted.
This month also saw the completion of the third volume of the research newsletter,
which this year covered a total of 196 publications reviewed by
volunteer contributors. A retrospective of research covered in the
newsletter in 2013 will be published later in January.