Hi,
The report covering Wikimedia engineering activities in September 2013 is now available.
Wiki version: https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2013/September
Blog version: https://blog.wikimedia.org/2013/10/02/engineering-report-september-2013/
We're also proposing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge:
https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2013/September/summary
Below is the full HTML text of the report.
As always, feedback is appreciated on the usefulness of the report and its summary, and on how to improve them.
------------------------------------------------------------------
Major news in September include:
Personnel
Are you looking to work for Wikimedia? We have a lot of hiring coming
up, and we really love talking to active community members about these
roles.
Announcements
- Kartik Mistry joined the Language Engineering team as Software Engineer (announcement).
- Sucheta Ghoshal joined the Language Engineering team as Associate Software Engineer (announcement).
- Kaity Hammerstein joined the User experience team as Associate UX Designer (announcement).
- Aaron Halfaker joined the Analytics team as Research Analyst (announcement).
- Oliver Keyes transitioned to the role of Product Analyst (announcement).
- Dan Garry joined the Product development team as Associate Product Manager for Platform (announcement).
- Nick Wilson joined the Product development team as Community Liaison (announcement).
Technical Operations
Site infrastructure
- Work to refactor and modularize our Puppet repository continues:
this month, lots of dead code was removed, and some tiny miscellaneous
classes were merged into more relevant components. Work on git-deploy has
also restarted this month. Changes were made to make git-deploy easier
to configure and to make initial setup of new repositories and setup of
new minion targets completely automated.
- Many of the services within the Tampa data center have already
migrated to EQIAD; however, there remain several smaller, unique, or in
some cases orphaned services that we still need to document or scope
prior to the closure of this center. Several of these services may no
longer be required, and we expect there to be some discussion about how
they are migrated or maintained going forward. Additionally, several of
these systems need to be moved to the new secondary data center (see
below), and will be waiting until infrastructure is in place to do so.
Our goal is to try to have these systems moved before the end of 2013,
but we'll continue to have equipment in this location for as long as
necessary to ensure stability of our network.
- For EQIAD, the ordering process is underway to complete our fourth
row of machines, and ensure we have capacity to take in systems that
will be arriving from the sunsetting of the Tampa data center, as well
as handle our expected growth.
- For ULSFO, after a long initial setup, initial bootstrapping and
configuration of the systems is finally underway. Over the next several
weeks, we will be configuring, testing, and redirecting traffic at this
location.
- Lastly, work has begun on a definition and RFP process for a new,
secondary data center, likely on the west coast of the US. We will send
further updates on this project once our RFPs are complete and we begin
the selection process. Our hope is to have this facility ready to take
systems from Tampa by the end of 2013.
Data Dumps
- The GSoC incremental dumps project has drawn to a close, but User:Svick will still be around. There's work to be done
before this can go into production, as well as extensive testing and
code review from folks with C++ expertise. If you want to help, check
the repository.
Wikimedia Labs
- The DNS infrastructure of the Labs has been overhauled and much
improved. The switchover to replace the unreliable hardware of the Labs
NFS server is ready, and should be enabled this week. Yuvaraj Pandian has
created and deployed a new instance proxy with an OpenStack-style API.
The new proxy is in use for a small number of instances right now, but
will be expanded to most instances in the future. The new proxy uses
nginx with Lua code to read its configuration of virtual hosts from
redis and can handle arbitrary URLs to arbitrary back-ends.
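To illustrate the routing idea described above, here is a minimal sketch in Python. It is not the actual nginx/Lua code, which this report does not show. A plain dictionary stands in for the redis store, and the host names, back-end addresses, and the function name resolve_backend are all hypothetical.

```python
from typing import Dict, Optional

# Stand-in for the redis store: each virtual host maps URL prefixes to back-ends.
# All host names and back-end addresses below are hypothetical examples.
ROUTES: Dict[str, Dict[str, str]] = {
    "tools.example.org": {
        "/": "http://10.0.0.5:80",
        "/api": "http://10.0.0.6:8080",
    },
}


def resolve_backend(
    routes: Dict[str, Dict[str, str]], host: str, path: str
) -> Optional[str]:
    """Return the back-end for a request, or None if no route matches."""
    prefixes = routes.get(host)
    if not prefixes:
        return None
    # Simple string-prefix matching; the longest matching prefix wins.
    matches = [prefix for prefix in prefixes if path.startswith(prefix)]
    if not matches:
        return None
    return prefixes[max(matches, key=len)]
```

Under these assumptions, a request for "/api/v1" on the example host would be routed to the "/api" back-end, and any other path on that host would fall through to the "/" back-end.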
Editor retention: Editing tools
VisualEditor
In September, the VisualEditor team continued their work to improve the editor and roll it out to additional wikis. The deployed version of the code was updated four times (1.22-wmf16, 1.22-wmf17, 1.22-wmf18 and 1.22-wmf19). The focus of the team's work this month was to continue to improve the stability and performance of the system, fix a number of bugs uncovered by the community, and make some usability improvements.
Parsoid
We fixed a few bugs reported in production, added performance stats to our RT-testing framework (and discovered and fixed a couple of bugs as a result) and did some long-standing cleanup work in our codebase. September also saw the all-staff meeting at the WMF offices in San Francisco, which gave us the opportunity to work in person and discuss some proposals. We planned out an implementation strategy for language variant support, and started researching and experimenting with HTML storage options, which are required for a number of projects in our roadmap.
Core Features
Notifications
In September, we released Notifications on more Wikipedias, such as the Dutch, Hebrew, Japanese, Korean, Spanish, Ukrainian and Vietnamese editions. Fabrice Florin and Keegan Peterzell managed community relations for these new releases, and are reaching out to more projects. Our next deployments will take place every other Tuesday. Developer Benny Situ was responsible for these deployments and fixed a number of bugs, with the help of Erik Bernhardson and Matthias Mullie. Community response has been very positive so far, across languages and regions. For each release, we reached out to community members weeks in advance, inviting them to translate and discuss the tool with their peers. As a result, we have now formed productive relationships with volunteer groups in each project, and are very grateful for their generous support. To learn more, visit our project hub, read the help page and join the discussion on the talk page.
Flow
Growth
Growth
In September, the Growth team (formerly known as Editor Engagement Experiments, or E3) primarily worked on the onboarding new Wikipedians project. In particular, this included the creation and deployment of two new guided tours to teach any new user how to make their first edit, using wikitext or VisualEditor. The guided tours extension was also deployed to the following language editions of Wikipedia: Catalan, Hebrew, Hungarian, Malay, Spanish, Swedish, and Ukrainian.
Along with the renaming, the team held its third Quarterly Review (minutes are available), published its 2013–2014 product goals, and shared a new job opening for two additional software engineers.
In accordance with our 2013–2014 goals, the Growth team began research into modeling newcomer retention on Wikipedia, anonymous editor acquisition, and article creation improvement.
Support
2013 Wikimedia fundraiser
This month, the team mostly focused on preparing for the upcoming English fundraiser. Planning began for periodic tests throughout October, which will help determine the launch date and other aspects of our fundraising efforts in November and December.
Wikipedia Zero
This month, the team released enhanced URL rewriting and debug flag-only Edge Side Includes (ESI) banner inclusion to production, supported the Ops implementation of dynamic MCC/MNC carrier tagging, identified web access log and user agent anomalies, further analyzed and recommended load balancer IP address-related changes in support of HTTPS requirements, and tested JavaScript-based Wikipedia Zero user interface enhancements.
Mobile web projects
In September, we mostly focused on Tutorial A/B testing, Notifications overlay in Beta, and adding campaign tracking to MobileFrontend.
MediaWiki Core
MediaWiki 1.22
In September, MediaWiki 1.22wmf16 through 1.22wmf19 were deployed to the production Wikimedia Foundation cluster.
Multimedia
Admin tools development
Search
In September, we expanded the new CirrusSearch back-end to a number of wikis. Italian Wiktionary, Catalan Wikipedia and English Wikisource are all running CirrusSearch now. Additionally, we deployed to all "closed" wikis. Further feature refinement and bugfixing are ongoing, with roughly 2 to 3 deployments a week.
Auth systems
The team improved the user interface of OAuth and deployed these changes to mediawiki.org and test.wikipedia.org. We hope to test and refine the extension with third-party developers, and subsequently deploy to all wikis. An initial review of Extension:OpenID was performed, and several issues were brought to the attention of the extension maintainer. Several bugs with CentralAuth/SUL were also fixed.
Security auditing and response
The team responded to reported issues, and released MediaWiki 1.21.2, 1.20.7 and 1.19.8 security releases to fix several issues in core and extensions.
Quality assurance
Quality Assurance
This month, we wrapped up Rachel Thomas' Outreach Program for Women internship successfully. Rachel helped us extend our browser test coverage of VisualEditor. Besides our ongoing collaboration with Wikimedia Foundation development projects, we are also engaging the greater community on the QA mailing list, where we discuss both code contributions and general QA topics.
Browser testing
This month saw significant improvements to both coverage and speed in our tests for VisualEditor. We are collaborating with the Language team on browser tests for the UniversalLanguageSelector extension and Translatewiki.net. We created our first tests for the new Flow feature and are in the process of supporting Flow fully in a reference test environment. We presented yet another of our ongoing series of training sessions, this one live in San Francisco.
Bug management
Mentorship programs
Technical communications
Guillaume Paumier wrapped up work on supporting the deployment of VisualEditor, and resumed regular activities like preparing the
Tech newsletter and ongoing communications support for the engineering staff.
Volunteer coordination and outreach
Together with XWiki and Tiki, we submitted a Wiki devroom proposal for FOSDEM, the biggest open source conference in Europe. We are also preparing a proposal for a stand, led by volunteers at the nascent Wikimedia Belgium chapter. The overall goal is to achieve a good MediaWiki & Wikimedia tech gathering in Brussels next February. We are also supporting the organization of the MediaWiki Architecture Summit in San Francisco on 23-24 January, 2014.
- The Analytics team has been focused on smaller but more important work items
this month, including enhancements to Wikimetrics, Grantmaking and
Program Development's graphing infrastructure and fixing some
long-standing Limn bugs. On the infrastructure side, our collaboration
with Ops has the Kafka middleware project moving along nicely. The
all-staff meeting and travel schedules definitely impacted our
throughput this month.
- Two notable accomplishments should be called out: our Hadoop
environment is now 100% free software, as we swapped out a proprietary
JDK for OpenJDK 7. We also spent a lot of time on our engagement
processes and planning for our first combined quarterly review in
October, and made significant progress on our hiring goals.
Research and data
This month, Aaron Halfaker joined the research team as a full-time employee. We started to reorganize the team structure and engagement model in coordination with the Analytics developers. We performed a survival analysis of new editors in preparation for new experiments led by the Growth team, and worked with the team to iron out the data collection and experimental design for the forthcoming iteration of GettingStarted.
We worked with product owners to determine the initial research strategy for features with key releases scheduled for the next two quarters (Mobile Web, Beta Features, Multimedia, Flow, Universal Language Selector, Content translation). We started a cohort analysis of conversion rates for mobile vs desktop account registrations; the results will be published on Meta shortly.
We drafted a proposal to host tabular datasets in a dedicated namespace and solicited feedback from interested parties (particularly the Wikidata community). We also started fleshing out the Labs2 proposal, an outreach program for academic researchers and community members, launched at Wikimania 2013 in Hong Kong. We co-hosted the second IRC research office hours and prepared for the first Wikimedia research hackathon, an offline/online event to be held in various locations worldwide on November 9, 2013.
Last, we contributed to the September 2013 issue of the Wikimedia research newsletter.
The Kiwix project is funded and executed by Wikimedia CH.
- Mediawiki offliner is now pretty stable, and its first release will happen in October. The ZIM incremental update GSoC project
was successfully completed; we still need to do a little bit of work to
finish the integration in the openZIM and Kiwix code bases. libzim, the
openZIM reference implementation, has been packaged for Debian.
The Wikidata project is funded and executed by Wikimedia Deutschland.
- In September, the Wikidata team mainly concentrated on the sister
projects Wikimedia Commons and Wiktionary. For Wikimedia Commons, we
added the ability to store interwiki links in one central location
(Wikidata) together with the ones for Wikipedia and Wikivoyage. For
Wiktionary, we published an analysis of all existing proposals for the integration of Wikidata and Wiktionary.
- On Wikidata itself, we rolled out the URL datatype. This allows you, for example, to provide a URL as a source of a statement. Denny Vrandečić published two blog posts about the ideas behind Wikidata: "Wikidata Quality and Quantity" and "A categorical imperative?". In addition, he shared a few thoughts on the future of Wikidata before leaving the project at the end of the month.
- The development team is looking to hire another front-end developer experienced in JavaScript.
Future
- The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.
--
Guillaume Paumier
Technical Communications Manager — Wikimedia Foundation
https://donate.wikimedia.org