Hi,
The report covering Wikimedia engineering activities in September 2014 is now available:
https://www.mediawiki.org/wiki/Wikimedia_Engineering/Report/2014/September
Below is the HTML text of the report.
------------------------------------------------------------------
Major news in September includes:
- a call for candidates for the Free and Open Source Software Outreach Program for Women;
- a roundtable discussion between the Language engineering team and editors from the Catalan language Wikipedia, focusing on the Content Translation tool.
Engineering metrics in September:
- 151 unique committers contributed patchsets of code to MediaWiki.
- About 27 shell requests were processed.
There are many opportunities for you to get involved and contribute to MediaWiki and technical activities to improve Wikimedia sites, both for coders and contributors with other talents.
For a more complete and up-to-date list, check out the Project:Calendar.
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.
- Damon Sicore joined the Wikimedia Foundation as Vice President of Engineering (announcement).
- Rachel Farrand joined the Engineering Community Team as Events Coordinator (announcement).
- Jeff Hobson joined the Wikipedia Zero engineering team (announcement).
- Daisy Chen joined the UX Research team as Associate Design Researcher (announcement).
Dallas data center
- In September we set up (backup) replication of most project data, including core databases and external storage. Work on Swift images and system backups was still ongoing into October. Essential system infrastructure such as an installation server, DNS, LVS, NTP etc. has been deployed as well.
Tampa data center
- We started the last push to get the remaining services & systems out of our Tampa data center, with a deadline for shutdown of all systems on October 1st. The remaining services included PDF generation, mail servers, noc.wikimedia.org and LDAP.
Labs metrics in September:
- Number of projects: 146
- Number of instances: 415
- Amount of RAM in use (in MBs): 1,996,288
- Amount of allocated storage (in GBs): 20,435
- Number of virtual CPUs in use: 977
- Number of users: 4,083
Wikimedia Labs
- Wikitech (the Labs web interface) is now managed via the standard WMF deployment system. This should allow for more frequent MediaWiki updates and overall greater stability.
- The last historic remaining dependencies on our old Tampa datacenter (e.g. LDAP and Labs DNS backup servers) were finally stamped out and replaced with dependencies on Dallas hardware.
- One of the labs virtualization hosts (virt1006) was suffering intermittent problems, so all affected instances were migrated to other hosts in order to stave off possible future disaster. Consequently, Labs is a bit short on virtualization space, but new hardware procurement is under way.
- Several long-unused instances and projects were cleaned up in order to free up more space.
- The last of the ToolLabs replica DB servers was upgraded to MariaDB 10.
Editor retention: Editing tools
VisualEditor
In September, the team working on VisualEditor expanded browser support, improved some features, and fixed nearly 60 bugs and tickets.
Users of Internet Explorer 10, who were previously blocked from using VisualEditor because of some major bugs, can now use it; this follows on from Internet Explorer 11 support last month. When editing a template with a required field, VisualEditor now warns you to avoid leaving it blank, and you can now create auto-numbered links using VisualEditor.
Improvements and updates were made to a number of interface messages as part of our work with translators to improve the software for all users, based on feedback from users and user testing. We made progress on table structure editing and auto-filled citations, both of which will be coming soon.
The deployed version of the code was updated five times in the regular release cycle (1.24-wmf20, 1.24-wmf21, 1.24-wmf22, 1.25-wmf1 and 1.25-wmf2).
Editing
In September, the Editing Team made substantial progress on front-end standardisation, as well as the work on VisualEditor which is reported separately. The team welcomed Bartosz "MatmaRex" Dziewoński as a new team member, and existing student member Moriel Schottlender converted to full-time status.
The team's work on front-end standardisation is focussed on improving libraries and infrastructure, and in particular, the OOjs UI library. This included the creation of a MediaWiki theme in collaboration with the Design team, which can be explored in the online demo; this will be deployed into MediaWiki's use of OOUI in the next few weeks. A number of bugs were fixed, including working around window and popup sizing, overflow item placement, and working around some browser bugs in Firefox and Safari. The code documentation had a number of minor issues corrected, and the build process was extended to create a minified distribution. The OOjs library was updated to fix a minor bug in oo.Compare, with a new version (v1.1.1) released and pushed downstream into MediaWiki, VisualEditor and OOjs UI.
The TemplateData extension now supports the "autovalue" parameter property, a wikitext value that can be inserted into a parameter by default if desired. Also, the specification for TemplateData was rewritten to be clearer and more consistent. Next month the TemplateData GUI editor will be made available on all Wikimedia wikis.
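As a rough illustration of the "autovalue" property, the sketch below builds a TemplateData-style description as a Python dictionary and serialises it to JSON. The template description, the parameter name "date", the chosen wikitext value and the helper autovalues() are all hypothetical and exist only for this example; the sketch assumes that "autovalue" sits next to the other per-parameter properties such as "label" and "type".

```python
import json

# Hypothetical template description. The parameter name "date" and the
# wikitext used as its autovalue are illustrative assumptions, not taken
# from the report.
template_data = {
    "description": "Marks a statement as needing a citation.",
    "params": {
        "date": {
            "label": "Date",
            "description": "Month and year the tag was added.",
            "type": "string",
            # Wikitext that an editing tool may pre-fill for this parameter.
            "autovalue": "{{subst:CURRENTMONTHNAME}} {{subst:CURRENTYEAR}}",
        }
    },
}


def autovalues(data):
    """Map each parameter name to its autovalue, skipping parameters without one."""
    return {
        name: spec["autovalue"]
        for name, spec in data.get("params", {}).items()
        if "autovalue" in spec
    }


if __name__ == "__main__":
    # Print the JSON that would go inside a <templatedata> block.
    print(json.dumps(template_data, indent=2))
    print(autovalues(template_data))
```

The printed JSON is what a template's documentation page would carry; how a given editor actually applies the value is outside the scope of this sketch.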
Parsoid
In September, we continued to fix bugs, upgraded libraries, and made additional progress towards improving compatibility with PHP parser + Tidy rendering. Specifically, Parsoid's paragraph wrapping now targets the PHP parser + Tidy output rather than PHP parser output. We also continued to update Parsoid's CSS / rendering to more closely match the current rendering, and improved Parsoid's robustness in handling edge cases (pathological backtracking, parsing of very large wikitext tables). Part of the Parsoid team was also busy with the PDF rendering service, which was successfully launched at the end of September.
Services and REST API
September saw a lot of activity on the RESTBase storage and API service. A new 'pagecontent' composite bucket type using revisioned blob buckets was introduced. This uses the by-now fairly rich table storage backend to provide functionality similar to MediaWiki's revision table, and supports any number of revisioned types of content (like HTML, wikitext, JSON metadata) associated with each revision.
Work on secondary index updates continued at full steam, and is now close to being merged.
Flow
In September, the Flow team enabled new test pages on French WP and Hebrew WP. The French test is for the Forum des Nouveaux, a Help space for new contributors (similar to the Teahouse on English WP). The Forum des Nouveaux hosts reached out to the Flow team after Wikimania, excited to try out the new discussions system. The Hebrew WP test is helping the team diagnose problems for Right-to-Left languages, and general i18n issues. The team also refined the new Echo notifications functionality, with lots of feedback from contributors on Mediawiki.org and En.wp. New topic notifications are now bundled in Echo, and we fixed several bugs related to the behavior of the Alerts and Messages tabs and to excess mention notifications.
Growth
In September, the Growth team shut down, with workflows shifting into the mainstream of other teams.
Wikimedia Apps
In September, the Mobile Apps Team released a new version of the iOS app containing the Nearby feature which shows you articles about things that are near your location, and a references panel that pops up whenever you tap a reference. The team also released an iOS 8 compatibility build to market. The team also spent time performing code quality improvements and refactoring on both the iOS and Android apps.
Mobile web projects
This month the mobile web team focused on the first prototype of WikiGrok, a new contribution feature that asks users who are reading Wikipedia articles to help add Wikidata that is missing about the article subject. Over the course of the month, we built and user-tested the first experimental interface for allowing users to input Wikidata: a simple binary question mode that provides the user with a suggested occupation on biographies that are missing this information in Wikidata but contain a possible occupation in the Wikipedia article. In this early test phase we are storing the replies in a separate database, not pushing to Wikidata. We plan to add suggestions for more Wikidata fields and test this version against a slightly more complex tagging interface in beta in October.
Wikipedia Zero
During September 2014, the team built more Partners Portal architecture, including Graph extension integration components for eventual display of aggregate statistics to zero-rating partners (it already works and is being reviewed in house). The team also grew support for dynamic zero-rating banners while enhancing JSON configuration extension code and issuing bugfixes. Additionally, the team shrank the size of the Wikipedia favicon to reduce bandwidth usage by users across the web. And on the partner side, we launched Telenor Myanmar in September. Finally, the team welcomed its newest software engineer, Jeff Hobson, to the Wikimedia Foundation!
Language tools
The CLDR extension was updated to version 26 and entries identical to CLDR were removed from LocalNamesEn.php. The team made RTL fixes in core, Echo and Wikibase, and tested Flow for RTL support. Maintenance of the Translate extension continued, and the performance of translation memory was improved on ElasticSearch with the help of Nik Everett.
Language Engineering Communications and Outreach
Content translation
The second version of the tool was released. This version has not yet been deployed due to technical issues in the Labs setup. This is currently being resolved with the Ops team.
Notable improvements include:
- a basic formatting toolbar (for Chrome);
- more accurate warnings for unchanged machine translated content;
- design improvements for the top bar and progress bar;
- bi-directional support for Spanish-Portuguese machine translation;
- link adaptation improvements.
The team is performing ongoing tests with users for Spanish-Portuguese and Portuguese-Spanish translations, and has started planning for the third release.
Admin tools development
Search
In September we worked to mitigate the performance bottleneck that we found in August. We found there to be no silver bullet but used the information we learned to pick and order appropriate hardware to handle the remaining wikis. We also implemented our significantly improved wikitext Regular Expression search. In October we've begun rolling out the wikitext Regular Expression search and received some of the hardware we need to finish cutting over the remaining wikis. We believe we'll get it all installed in October and cut the remaining wikis over in November.
SUL finalisation
In September, the team wrapped up the feature development for SUL finalisation. One part of the work (the steward end of the rename request form) is outstanding and will be finished in October. In October, the team is planning to proceed into deployment and testing of the features.
Security auditing and response
We published the 1.23.4 security release, and completed review for the Graph and Imagemetrics extensions.
Multimedia
Bug management
Phabricator migration
Wikimetrics
Work was done on the following metrics:
- Rolling New Active Editor - Implemented
- Rolling Surviving New Active Editor - Implemented
- Pages Created and Edits - Updated with a reporting configuration option to include changes to deleted pages (this is a default).
- Metrics with ‘Namespaces’ as a parameter let you specify “All Namespaces.” Leave the input field blank to do so.
- Rolling recurring old active editor is implemented, but does not perform sufficiently rapidly for us to enable it on the production servers.
- The status of the implementation of Standardized Metrics defined by the Research Team is here: https://meta.wikimedia.org/wiki/Research:Metrics_standardization/Implementation
Data Processing
A terrific weekly summary is posted to the Analytics mailing list with a summary at the top of each email. Here are the links to related posts in the archives.
Editor Engagement Vital Signs
The Vital Signs dashboard is now live. We are calling it “Vital Signs” because it will eventually display content and readership metrics, not just Editor Engagement metrics. Vital Signs was presented at the Analytics Quarterly Review as well as the October WMF Metrics meeting.
EventLogging
Work was performed to clean up some EventLogging tables per the privacy policy.
Research and Data
This month we onboarded Ellery Wulczyn as the newest addition to the Research & Data team. Ellery recently finished a Computer Science Masters program at Stanford and joins us as a full-time research analyst after completing a summer fellowship with University of Chicago's Data Science for Social Good program. His focus at WMF is going to be fundraising research and analytics. Welcome, Ellery!
We completed the definitions, documentation and requirements for a new set of metrics to be implemented in Vital Signs.
We completed a first draft of a page view definition, which is currently being discussed. We supported the mobile team with baseline traffic reports for Apps and Mobile Web.
We participated in the preparatory sessions for the design of an open consultation led by the Community Liaison team as well as in regular meetings to support the strategy consultation process.
We held our Q1-2015 quarterly review, reviewed the team's progress against Q1 goals and posted our proposed Q2 goals.
The Kiwix project is funded and executed by Wikimedia CH.
Kiwix and Wikipedia offline will soon be available on an e-ink device.
- We made progress in our work with our partner Bookeen to get an e-ink reader able to read Wikipedia offline. We managed to get a first version of the device firmware working, and it will be tested in the field as part of the Malebooks pilot deployment.
- As a consequence of a bug fixing sprint with Parsoid and Wikisource developers at Wikimania, we were finally able to generate usable ZIM files from Wikisource (example with fr.wikisource.org).
- Work on the offline project Gutenberg continued and we are now almost ready to release. A few ZIM files are in testing, for example in German and in Spanish.
- Kiwix was represented at the Selenium conference where we held a 2-day bug hunting session: 120 bugs were reported, of which 50% were fixed.
- mwoffliner was improved to make it easier to use for everyone, in particular to make ZIM files for only a selection of articles. As a demonstration, we prepared a ZIM file containing all Wikipedia articles about medicine.
- After many years, a new version of a tool to generate a live CD including Kiwix and Wikipedia content was released.
The Wikidata project is funded and executed by Wikimedia Deutschland.
- In September, the Wikidata team focused on improving performance, doing groundwork for the new user interface design, and making it possible to track where data from Wikidata is used. Next to that, they worked on tests and prepared for a week-long meeting with the WMF multimedia team and volunteers to discuss and plan structured data support for Wikimedia Commons.
- The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.
--
Guillaume Paumier