Hi,
The report covering Wikimedia engineering activities in November 2013 is now available.
Wiki version: https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2013/November
Blog version: https://blog.wikimedia.org/2013/12/09/engineering-report-november-2013/
We're also proposing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge:
https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2013/November/summary
Below is the HTML text of the report.
As always, feedback is appreciated on the usefulness of the report and its summary, and on how to improve them.
------------------------------------------------------------------
Major news in November include:
Note: We're also providing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.
Personnel
Are you looking to work for Wikimedia? We have a lot of hiring coming
up, and we really love talking to active community members about these
roles.
Announcements
- Jeff Hall joined the Platform engineering group as a member of the QA team (announcement).
- Aaron Arcos joined the Platform engineering group as a volunteer developer working with the Multimedia team (announcement).
- Dario Taraborelli was promoted to the position of Senior Research Scientist, Research and Data team lead (announcement).
- Aaron Halfaker was promoted to the position of Research Scientist (announcement).
- Moiz Syed joined the User Experience team as User Experience Designer (announcement).
Technical Operations
Wikimedia Labs
- A new dynamic proxy system has been deployed on Labs; it allows the
admin of any project to arrange for public web access and a dedicated
DNS hostname for a project instance without requesting an IP address.
Labs staff and volunteers will now be reclaiming quite a few IPs as
existing projects migrate to a dynamic proxy setup.
- The WMF has hired a short-term contractor, Mike Hoover, to assist
with the migration of Labs infrastructure from Tampa to our new
datacenter in Ashburn. Mike has spent a lot of time exploring the
existing infrastructure and running test setups; soon he will start to
configure the new OpenStack nodes in production.
- Andrew Bogott has been working on cleaning up stale and unused
resources. He's working on some automatic documentation that will help
users track the status of their projects and instances with an eye
towards predicting the impact of the coming migration.
- Labs suffered two outages: a brief, self-inflicted network
failure, and a longer outage during which one of the virtualization
hosts failed. Both outages were swiftly resolved, but recovery lagged
because some tools and services failed to come back properly afterwards
due to poor distribution of virtual servers (inter alia, both the grid
master and the shadow [backup] master were on the same server).
- Preparation for the move to the Ashburn data center is well in
progress, with the new storage server being physically configured this
week as well as the new hardware servers for user databases (including
the new PostgreSQL instance intended for OpenStreetMap).
Editor retention: Editing tools
VisualEditor
In November, the VisualEditor team continued to improve the stability and performance of the system, and add new features. The deployed version of the code was updated three times (1.23-wmf3, 1.23-wmf4 and 1.23-wmf5). Most of the team's focus was on fixing bugs, and on some major infrastructure changes, splitting out the OOJS and OOJS-UI libraries from VisualEditor to make them available to other teams. Much of the team travelled to the Open Source Language Summit in Pune, India to learn more about how to improve VisualEditor for a variety of languages, scripts, users and systems. Two new members of the QA team joined in to help improve VisualEditor – Jeff Hall and Rummana Yasmeen, and thanks to them, the automated browser tests have expanded in breadth and depth of coverage. Work continued on major new features like full rich copy-and-paste from external sources, a dialog for quickly adding citation templated references, and a tool to insert characters not available on users' keyboards. The editor was made available by default on just over 100 additional Wikipedias as part of the continuing roll-out. VisualEditor was also enabled for opt-in testing on Swedish Wiktionary and Wikimedia Sweden's wiki, the first time it has been available on a non-Wikipedia production wiki.
Parsoid
November saw the deployment of major changes to the DOM spec in coordination with the VisualEditor team. Link types are now marked up by semantics rather than syntax, interwiki links are detected automatically, categories are marked as page properties and more.
During the deployment, we found that the newer libraries used by the
web service front-end were buggy. We reverted the library upgrade and
contributed fixes upstream. This incident prompted us to work on tests
for the HTTP web service to catch issues like this in continuous
integration.
After these issues were sorted out, we resumed incremental improvements
and fixes. Editing support for magic words and categories
was improved, several dirty diff issues were fixed and the API was
refined for page-independent wt2html and html2wt conversion. See our deployment page for details.
Cassandra load testing for the Rashomon storage service continued and uncovered several issues that were reported back upstream. With Cassandra 2.0.3 the 2.0 branch is now stabilizing in time to make deployment in December feasible. Cassandra is now stable at extremely high write loads of around 900 revisions per second, which is more than 10 times the load we experience in production.
Core Features
Notifications
In November, we deployed Notifications on the German and Italian Wikipedias, completing our worldwide release of this tool. Fabrice Florin, Denis Barthel, Jan Eissfeldt, Erica Litrenta and Keegan Peterzell managed the community outreach for these final releases, while Benny Situ oversaw the technical deployments. Community response to Notifications has been generally favorable on all wikis. While feature development has now ended for this project, we expect new notifications and features to be developed by other teams in coming months. To learn more, visit our project hub, read the help page and join the discussion on the talk page.
Flow
Growth
Growth
In November, the Growth team primarily worked on refactoring the GuidedTour and GettingStarted extensions, including development of an API for the latter. This public API will be used by the Growth team, the Mobile team and others to deliver editing tasks to users across a variety of Wikipedia interfaces.
The team also spent significant time on the research and design preparations for its anonymous editor acquisition and Wikipedia article creation projects. This included participating in a community Request for Comment about a potential Draft namespace for articles, requirements gathering, and working on a Draft namespace patch.
Matthew Flaschen and Pau Giner attended the Wikimedia Diversity Conference and presented (along with Jared Zimmerman and Vibha Bamba) on how diversity relates to the team's engineering and product work.
Support
Wikipedia Education Program
This month, we improved a feature that was built in October (allowing instructors to assign articles to student editors), completed a new feature (allowing instructors to add users as students) and started another one (displaying information about student editors' courses on Special:Contributions). We fixed some bugs, and kept up with changes in MediaWiki core. We also continued preliminary work (started last month) towards renewing the UX and broadening the extension's scope.
Wikipedia Zero
During the last month, the team monitored the rollout of Wikipedia Zero via text (USSD/SMS) in partnership with Airtel Kenya and Praekelt for the first pilot of the program. Additionally, Yuri Astrakhan promoted the program abroad. The team also prepared code and configuration for approval, finalized IP addresses for zero-rating and deployed bugfixes for the Wikipedia app for Firefox OS. We added support for simpler JSON in configuration files, enhanced performance and redirect features, and constrained ZeroRatedMobileAccess extension loading to guard against repeats of last month's configuration bug.
Mobile web projects
The Onboarding A/B Test resulted in an Edit Guider, now available. The overlay UI overhaul currently in beta is planned to become available on the main site. User profiles intent is also in testing in beta.
- Team highlights for this past month include a very successful Open Source Language Summit
in Pune, India co-organized with Red Hat. More than 60 developers
joined in to collaborate and work together on improving language support
for Wikipedia on the web and mobile. Work sprints on integration of
input methods in VisualEditor, Indic Fontbook specification, mobile
input methods and content translation were held.
- The team also deployed fixes for several issues related to
performance and saving preferences for the Universal Language Selector
(ULS). Other tasks completed include creating a class for interlanguage
links so that the Autonym font is applied only to autonym items.
The team also worked on collating documentation about the initial
inclusion requests for each web font served through ULS; these requests
are also documented in the font.ini files of each font in the repository.
MediaWiki Core
DevOps Sprint 2013
The DevOps sprint participants focused their efforts on monitoring-related work, specifically getting Logstash into production and puppetizing/migrating Graphite (both still in progress). Cache-related fixes were made to avoid users seeing outdated versions of pages when using non-canonical URL forms. A fix was made to the Commons upload process so that all articles using that page are updated, as users would expect.
Search
Before November 18, we were spinning up an aggressive plan to add many new wikis to CirrusSearch. On November 18, we had multiple incidents that caused us to roll all wikis using CirrusSearch back to Lucene; we've spent the rest of November implementing fixes for all issues discovered on the 18th. That is now done and we plan to switch all wikis that used to have CirrusSearch back to running it as a secondary search engine on December 2. We'll attempt to restart our aggressive plan as soon as we're comfortable with it again.
Site performance and architecture
We ran a controlled experiment to test the impact of module storage on performance. We expect to publish our findings within a week. We puppetized Graphite and MediaWiki's profiling log aggregator and migrated them to our Ashburn data center. Finally, we started working on a replacement profiling log aggregator that will process and visualize profiling data from both client-side and server-side code.
Auth systems
Our preliminary version of OAuth is now live on all Wikimedia wikis. Since the rollout, five OAuth consumers have been accepted. We're hopeful many more consumers will be proposed.
Wikimania Scholarships app
Work is progressing towards a planned launch of the application on 2013-12-19. The source code has been imported into an internal Git repository and is now being managed via Gerrit. A Bugzilla component has been created under the Wikimedia product to track defects and feature requests. Several changesets are in review to complete the basic functionality of the application and prepare for an internal security review.
Security auditing and response
We released a security update to MediaWiki to fix a number of issues in core and extensions. Security reviews of Limn, GWTools and Flow extensions are in progress.
Admin tools development
Quality assurance
Quality Assurance
November saw significant improvements to the QA documentation on mediawiki.org contributed by both staff and volunteers. Participants in the Google Code-in program made even more contributions, to both documentation and browser test code. The QA team welcomed new staff members Rummana Yasmeen and Jeff Hall, who made immediate contributions to the VisualEditor project and to the browser test automation.
Beta cluster
In November, the Beta cluster saw greatly improved support for testing Parsoid, the parsing engine behind VisualEditor. The Beta cluster also continues to provide a real-world simulation for the Flow project in advance of Flow's limited release scheduled for December. Beta continues to be the main test environment for MobileFrontend, CirrusSearch, and many other Wikimedia software projects.
Browser testing
In November, we added significant browser test coverage for the Flow project, and the addition of Jeff Hall to WMF staff brought a focus to testing VisualEditor. Browser tests now reside in ten different repositories across WMF projects. November saw increased browser test coverage for the Language, VisualEditor, and Flow projects, among others. The diversity of browser tests in project repositories has been a force behind great improvements in infrastructure, with code shared among the projects now residing in the repository at mediawiki/selenium.
In November, the Engineering community team held their second monthly showcase, as well as their quarterly review for the July–September period.
Bug management
Mentorship programs
Technical communications
Volunteer coordination and outreach
Multimedia
Multimedia
Kraken
We continued to make progress on event delivery via Kafka. We identified and tested solutions for issues encountered with event delivery from the Amsterdam data center. We also tested solutions to fix Ganglia logging issues.
Wikimetrics
We concluded Phase 1 of Wikimetrics by implementing asynchronous cohort validation, editor survivor and threshold metrics.
Data Quality
We identified issues with over-counting page views, and deployed a fix in November. Data from July onward were restated.
Research and Data
This month, we started work on metrics standardization, one of the team's quarterly goals. We published a number of supportive analyses of new user acquisition, activation and retention as well as "active editors" to assess issues and potential benefits of new definitions. The outcome of this analysis will inform design decisions for new dashboards focused on editor engagement.
In collaboration with the Platform team, we ran an A/B test to determine performance gains of localStorage. The results indicate that the use of localStorage significantly improves the site's performance for the end user: Module storage is faster. Readers whose pages load slower tend to browse less. Mobile browsers don't seem to benefit substantially from caching.
We published the results of a test designed to explore whether displaying a short tutorial could improve the first-edit completion rate of newly registered users on mobile devices. The results support the hypothesis, indicating that edit guiders are a good onboarding strategy for new mobile users.
We ran an analysis of anonymous editor acquisition as background research for new onboarding strategies designed by the Growth team and found that editors who edit as an IP right before registering an account are our most productive newcomers.
On November 9, 2013 we hosted the inaugural Labs2 Wiki Research Hackathon: it was the first in a series of global events meant to "facilitate problem solving, discovery and innovation with the use of open data and open-source tools" (read the full announcement). Highlights from the event are available in the latest issue of the Research Newsletter. We are planning to host a new hackathon in Spring 2014 and we are actively seeking volunteers to host local and virtual meetups.
The Kiwix project is funded and executed by Wikimedia CH.
- We have released two new versions of Kiwix for Android this month (1.5 & 1.6), providing many new features; most of them were developed by young new developers as part of the Google Code-in program. We have also released a new and unique tool to easily create ZIM files yourself
from data on your hard drive; the tool is stable and can now be used.
Work continues around tools based on Parsoid output, especially as we
need to rewrite the ZIM-related code for the MediaWiki offline toolchain, currently under heavy re-engineering.
The Wikidata project is funded and executed by Wikimedia Deutschland.
- Wikidata developers held an office hour to give a status update and answer questions (read the log).
In addition, they worked on ranks, ordering of statements and the
quantities datatype. The quantities datatype is needed, for example, to
enter the number of inhabitants of a country in Wikidata. It is
available for testing now on http://test.wikidata.org.
Ranks will allow certain statements to be marked as preferred or
deprecated. This is useful, for example, to indicate a previous mayor of a
city, or the number of inhabitants of a country in 1900.
- Magnus Manske wrote a gadget
that allows you to additionally show Wikidata search results when doing
a search on Wikipedia. He also extended the Reasonator tool to now also
work for cities. Until now, it only supported people and species.
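To make the idea of ranks in the Wikidata item above more concrete, here is a minimal sketch of how a consumer might pick statements by rank. It is not the Wikidata data model or API: the `Statement` class, the `best_statements` helper, the "normal" default rank, the selection rule and the population figures are all assumptions made for illustration. Only the "preferred" and "deprecated" ranks come from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical, simplified stand-in for a Wikidata statement."""
    prop: str
    value: int
    year: int
    # "preferred" and "deprecated" are named in the report; "normal" is
    # assumed here as the default for statements that are not marked.
    rank: str = "normal"


def best_statements(statements, prop):
    """Return the statements a reader most likely wants for `prop`.

    Assumed rule: preferred statements win if any exist; otherwise fall
    back to normal ones. Deprecated statements are never returned.
    """
    candidates = [s for s in statements if s.prop == prop]
    preferred = [s for s in candidates if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s for s in candidates if s.rank == "normal"]


# Made-up figures, for illustration only.
population = [
    Statement("population", 1_000_000, 1900),
    Statement("population", 5_000_000, 2013, rank="preferred"),
    Statement("population", 9_999_999, 2013, rank="deprecated"),
]

for statement in best_statements(population, "population"):
    print(statement.year, statement.value)
```

Under these assumptions, the historical 1900 figure stays in the data but is not returned while a preferred statement exists, and the deprecated figure is never returned.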
Future
- The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.
--
Guillaume Paumier
Technical Communications Manager — Wikimedia Foundation
https://donate.wikimedia.org