Hi,
The report covering Wikimedia engineering activities in August 2013 is now available.
Wiki version: https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2013/August
Blog version: https://blog.wikimedia.org/2013/09/06/engineering-august-2013-report/
We're also proposing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge:
https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2013/August/summary
Below is the full HTML text of the report.
As always, feedback is appreciated on the usefulness of the report and its summary, and on how to improve them.
------------------------------------------------------------------
Major news in August include:
Personnel
Are you looking to work for Wikimedia? We have a lot of hiring coming
up, and we really love talking to active community members about these
roles.
Technical Operations
Data Dumps
- All dumps ran from the data center in Ashburn this month; only the
miscellaneous and experimental services remain to be moved. GSoC student
Petr Onderka completed the first incremental dump-producing code, along
with a draft specification for the new format. Test it out and let us
know what you think!
Wikimedia Labs
- Due to Wikimania and staff vacations, this month had a relatively
low number of infrastructure changes, but we had a relatively high
influx of users and tools. We ran three workshops during Wikimania and
helped Toolserver users migrate their tools to Labs. We did have a few
infrastructure changes, though. A change to the service group interface
was merged but not yet deployed; it removes the service group interface
from the project interface, reducing clutter. An API was pushed in for
project and service group information, to make that information
available from Wikitech rather than LDAP. Other infrastructure changes
were bug fixes, which can be found through Bugzilla.
OTRS
- OTRS got a long-overdue update to version 3.2.9 with the generous
volunteer support of Martin and Marcel of Znuny GmbH. As part of the
upgrade, the service was migrated from pmtpa to eqiad, and spam
filtering was overhauled.
Editor retention: Editing tools
VisualEditor
In August, the VisualEditor team continued its work, and presented and
ran workshops at Wikimania in Hong Kong to discuss how best to improve
the system. The deployed version of the code was updated three times
(1.22-wmf13, 1.22-wmf14 and 1.22-wmf15), with several mid-deployment
releases to patch urgent issues as the code was developed. The focus of
this work was on improving the stability and performance of the system,
fixing a number of bugs uncovered by the community, and making some
usability improvements.
Parsoid
In August, the Parsoid team continued to polish compatibility with
existing wikitext. User feedback after the July VisualEditor release was
instrumental in the identification of issues and the development of
support for important use cases of creative templating.
The increased team size also allowed us to perform some long-standing
code cleanup, make Parsoid compatible with Node 0.10, and improve
testing. The round-trip testing infrastructure
received a much-needed overhaul. The storage back-end switched from
SQLite to MySQL, which substantially improved throughput and allows us to
test new code far more quickly than before. Performance statistics are
now recorded, which will let us identify performance bottlenecks as well
as catch performance regressions.
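To illustrate the idea behind round-trip testing, here is a minimal sketch in Python. It is not Parsoid's code (Parsoid runs on Node); the two converter functions are hypothetical toys that only handle bold text, and the timing field merely stands in for the performance statistics mentioned above.

```python
import re
import time


def wikitext_to_html(wikitext: str) -> str:
    """Toy converter (hypothetical): '''bold''' becomes <b>bold</b>."""
    return re.sub(r"'''(.+?)'''", r"<b>\1</b>", wikitext)


def html_to_wikitext(html: str) -> str:
    """Toy inverse of wikitext_to_html: <b>bold</b> becomes '''bold'''."""
    return re.sub(r"<b>(.+?)</b>", r"'''\1'''", html)


def round_trip(wikitext: str) -> dict:
    """Convert to HTML and back; record whether the text survived and how long it took."""
    start = time.perf_counter()
    output = html_to_wikitext(wikitext_to_html(wikitext))
    elapsed = time.perf_counter() - start
    return {"ok": output == wikitext, "seconds": elapsed, "output": output}


def run_suite(pages: dict) -> dict:
    """Round-trip every page and collect the per-page results by title."""
    return {title: round_trip(text) for title, text in pages.items()}


if __name__ == "__main__":
    pages = {
        "A": "Some '''bold''' text",
        "B": "raw <b>html</b> bold",
    }
    for title, result in run_suite(pages).items():
        print(title, result["ok"], result["output"])
```

In this sketch, page "B" fails the round trip because its literal <b> tag comes back as wikitext bold markup. That is the kind of unintended change a round-trip test is meant to surface.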
During Wikimania, the Kiwix team used Parsoid output to create an
offline copy of Wikivoyage. With standard HTML libraries and the rich
RDFa information in the Parsoid DOM, downloading and modifying the HTML
representation was done in about 1000 lines of JavaScript.
Editor engagement features
Notifications
In August, we released Notifications on the French, Hungarian, Polish,
Portuguese and Swedish Wikipedias, after extensive testing on the
English Wikipedia, as well as mediawiki.org and Meta-Wiki. This
engagement tool was well received by our new communities, especially
social features such as Mentions and Thanks, which enable users to
communicate more effectively than before. Benny Situ led the engineering
work for this deployment and fixed a number of bugs, with the help of
Erik Bernhardson and Matthias Mullie. Fabrice Florin managed community
relations for these new releases, updating the release plan
and reaching out to more projects, to prepare for worldwide deployments
on all wiki projects in coming months. To that end, we teamed up with
Philippe Beaudette, Maggie Dennis, Patrick Earley, Jan Eissfeldt, Anna
Koval, Keegan Peterzell, and Sherry Snyder to coordinate these releases
with the communities they serve. Dario Taraborelli created new metrics
dashboards for the French, Hungarian, Polish, Portuguese and Swedish
Wikipedias. Lastly, we presented our work on Notifications in two talks
at Wikimania 2013, with both a general overview and a technical
presentation (see slides).
We are very grateful to all our community champions for each language
and look forward to more collaborations in the future. Our next major
deployment to non-English Wikipedias will take place on Sep. 17, to be
followed by weekly releases throughout the fall, as outlined in our
release plan. To learn more, visit the project portal, read the
help page and join the discussion on the talk page.
Flow
Article feedback
In August, we made a few feature tweaks and bug fixes for the
Article Feedback Tool (AFT5) on the English and French Wikipedias.
Matthias Mullie released a few patches to improve the opt-in/opt-out
tool, and tested the new feedback notifications, which let users know
when feedback is marked as useful for a page they watch (or for a
comment they posted). We also presented our work on AFT5 at Wikimania
2013, with designer Pau Giner and our French and German champions
Benoît Evellin and Denis Barthel, in this session (see slides).
Later this year, the team plans to make the AFT5 tool available to other
wiki projects interested in testing it, as outlined in the release plan.
Editor engagement experiments
Commons App
This month, the Mobile Apps team pushed out additional releases of the
Commons photo uploader app for iOS and Android. The iOS version includes
a major UI revamp by Monte, while the Android version has received
multiple incremental updates by Yuvi and Brion. Yuvi has been working on
modernizing support for campaigns in UploadWizard, which will make it
easier to coordinate uploads for events like Wiki Loves Monuments.
Viewer, contributor, and admin user interfaces for campaigns will come
to the web, with campaign-tied uploading on the web and in the mobile
app. The team also started making plans for the next generation of the
Wikipedia reader app, which will be more closely integrated with the
mobile web site to ensure that new features are always available through
a web view, even where we don't create specific native support. More
details will be put together in the next couple of months.
Wikipedia Zero
This month, the team completed version 1 of Wikipedia Zero automation
tests, continued programming the re-architecture of Wikipedia Zero,
implemented search engine non-indexing, and analyzed HTTPS requirements
in support of a push for greater usage of HTTPS across Wikimedia
projects. The Wikipedia Zero engineering team thanks Amit Kapoor from
the Wikipedia Zero partnerships team, who wrapped up work with the
Wikimedia Foundation this month, for his hard work getting the program
off the ground. The team is also pleased to welcome Carolynne Schloeder,
who joins the Wikipedia Zero program as Director of Mobile, Programs.
Mobile web projects
This month we continued to improve the mobile editing feature, monitoring
and triaging bugs and expanding the feature to show at the section level
of articles. We also released the first iteration of mobile notifications
to projects where Echo is enabled (the English, French, Polish,
Portuguese, Hungarian, and Swedish Wikipedias, as well as mediawiki.org
and Meta). In beta, we built a new notifications treatment to be
released in later months and continued working on mobile talk pages.
Language tools
MediaWiki Core
Multimedia
Search
In August we deployed CirrusSearch to test2.wikipedia.org and
mediawiki.org, and we're testing it there. We're actively looking for
other volunteers to test out CirrusSearch. Right now, CirrusSearch is
not the primary search for mediawiki.org; you have to use a URL
parameter to test it. We're hoping to make it the primary search in
September.
Auth systems
The team deployed OAuth to mediawiki.org on August 20, and is working on
enhancement requests before the extension is enabled on the rest of the
WMF wikis. Several small bugs were fixed in SUL.
Security auditing and response
The team responded to reported issues, and prepared for the next
MediaWiki release, scheduled for September 3. We worked with Operations
to enable HTTPS for user logins in most geographies.
Quality assurance
Quality Assurance
This month, QA began collaborating closely with Release Engineering to
coordinate improvements in the reporting, monitoring, and testing of
software releases. Our goal is to make our frequent software releases
even more reliable than they already are, and to use the tools and
systems in place today, such as the beta labs cluster, to make those
reliable releases even more frequent.
Browser testing
This month saw a significant change to the structure and organization of
browser tests, with tests and builds for CirrusSearch,
UniversalLanguageSelector, and VisualEditor following the example of
MobileFrontend and now residing in the git repositories for those
extensions, rather than in the /qa/browsertests repository. This creates
opportunities for more frequent and more accurate Jenkins builds of the
tests, while also reducing the overhead required for analyzing test
failures.
Bug management
Mentorship programs
Technical communications
Guillaume Paumier continued to focus on the VisualEditor deployment
effort, working on communications and documentation, and liaising with
the French Wikipedia. Work on technical communications mostly focused on
perennial activities like Tech news and ongoing communications support
to the engineering staff.
Volunteer coordination and outreach
Analytics infrastructure
We continue to pursue the initiatives listed in our planning document.
One analyst has accepted a job offer (welcome Aaron!) and we are in
discussions with a software engineer. We continue to have a solid
pipeline and are spending a lot of time interviewing. Wikimetrics is on
target for an early September release and we've made good progress
against our Hadoop infrastructure goals. In cooperation with Ops, we've
completed our reinstall of the Hadoop cluster and ran several days of
reliability testing over the Labor Day weekend. We are currently
investigating replacing the Oracle JDK with OpenJDK, in line with our
goal of using open source whenever possible. Our project to replace
udp2log with Kafka is making steady progress. Varnishkafka, which will
replace varnishncsa, has been debianized, and the first performance
tests of compressing the message sets are very encouraging. We created a
test environment in Labs to test Kafka failover modes and we have been
prototyping with Camus to consume the data from a broker and write it to
HDFS. We are now thinking about how to set up Kafka in a
multi-data-center environment. The ZooKeepers have been reinstalled
through Puppet as well.
Analytics Visualization, Reporting & Applications
In close collaboration with Dario, Jaime and Jessie, we have worked on
new features for Wikimetrics. In particular, we are adding new metrics
such as survival and pages created, aggregation of metrics, metadata in
the CSV output, and a support page; we now have more than 90% test
coverage of the codebase. In preparation for the reinstallation of the
Hadoop cluster, we moved all Wikipedia Zero jobs off the cluster. We took
this opportunity to add additional monitoring to the creation of
Wikipedia Zero dashboards. We have worked with Wikipedia Zero to
identify a problem with geolocation of requests that has created large
jumps in total traffic. We spent quite some time creating a more robust
process for updating and monitoring gp.wmflabs.org. This dashboard is
used by various internal stakeholders and receives its information from
different datastreams using different scripts. We have been working on
running these scripts under the general purpose stats user, adding
additional monitoring to prevent stale data, and puppetizing some of the
jobs.
Data Releases
In August, we attended WikiSym and Wikimania. Dario Taraborelli gave a
keynote address on actionable Wikipedia research at WikiSym, where
several other Wikipedia research papers were presented. At Wikimania, we
hosted two sessions focused on Wikimedia data and analytics tools. We
also worked with Platform engineering this month on analyzing and
visualizing HTTPS failure rates by country, in preparation for the
switch to HTTPS as a default. We released new dashboards for the launch
of notifications on five more Wikipedias and continued to provide ad-hoc
support to teams in Editor Engagement. Lastly, we continued screening
and interviewing candidates for an open research analyst position.
The Kiwix project is funded and executed by Wikimedia CH.
- Release of the new MediaWiki offliner was slightly delayed; we are
still fixing stability bugs. This solution has already proven its
efficiency, as we have released 20 new ZIM files this month: a new
throughput record. The ZIM incremental update GSoC project is
progressing too, as the student works on the integration of
zimdiff/zimpatch in the Kiwix ecosystem. Kiwix developers held a six-day
hackathon in Hong Kong to prepare the next Kiwix release, after some
final work on compilation.
The Wikidata project is funded and executed by Wikimedia Deutschland.
- In August, the Wikidata team was present at three events: COSCUP,
Wikimania and a meetup about Wikidata and Incubator. A lot of work has
been put into improving the API and its documentation. The team also
worked on the ability to reorder the qualifiers and sources in a
statement, improved the speed of Wikidata slightly, and made progress on
the ability to query for statements with a specific property and value,
as well as on merging items. An improved proposal for the support of
Wiktionary has been published. They also started the paper cuts
initiative to find and fix small bugs that have a large impact on how
enjoyable it is to use Wikidata. Denny and Adam gave a short overview of
the state of Wikidata and answered questions during an office hour on
IRC. The biggest news for August, though, was the activation of data
access (Wikidata phase 2) on Wikivoyage.
Future
- The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.
This report was written collaboratively by Wikimedia engineers and managers. See revision history and associated status pages. A wiki version is also available.
--
Guillaume Paumier
Technical Communications Manager — Wikimedia Foundation
https://donate.wikimedia.org