Hi,
The report covering Wikimedia engineering activities in August 2014 is now available. My sincere apologies for the delay.
Wiki version: https://www.mediawiki.org/wiki/Wikimedia_Engineering/Report/2014/August Blog version: https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/
Below is the HTML text of the report.
------------------------------------------------------------------
Major news in August includes:
- the Wikimania 2014 conference https://wikimania2014.wikimedia.org/wiki/ in London, and the associated hackathon https://wikimania2014.wikimedia.org/wiki/hackathon ;
- a statement https://blog.wikimedia.org/2014/08/01/wikipedia-zero-and-net-neutrality-protecting-the-internet/ on Wikipedia Zero and net neutrality;
- progress on the new content translation tool https://blog.wikimedia.org/2014/08/26/content-translation-100-published-articles-and-more-to-come/ and its passing the milestone of 100 translated articles.
Engineering metrics in August:
- 160 unique committers contributed patchsets of code to MediaWiki.
- The total number of unresolved commits https://gerrit.wikimedia.org/r/#q,status:open+project:%255Emediawiki.*,n,z went from around 1640 to about 1695.
- About 22 shell requests https://www.mediawiki.org/wiki/Shell_requests were processed.
Contents
- Personnel https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Personnel
- Work with us https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Work_with_us
- Technical Operations https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Technical_Operations
- Features Engineering https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Features_Engineering
- Editor retention: Editing tools https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/_Editing_tools
- Services https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Services
- Core Features https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Core_Features
- Growth https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Growth
- Mobile https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Mobile
- Language Engineering https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Language_Engineering
- Platform Engineering https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Platform_Engineering
- MediaWiki Core https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#MediaWiki_Core
- Release Engineering https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Release_Engineering
- Multimedia https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Multimedia
- Engineering Community Team https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Engineering_Community_Team
- Analytics https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Analytics
- Wikidata https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Wikidata
- Future https://blog.wikimedia.org/2014/10/18/engineering-report-august-2014/#Future
Personnel
Work with us https://wikimediafoundation.org/wiki/Work_with_us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.
- Senior Software Engineer – Services http://grnh.se/748mr2
- Software Engineer – Services http://grnh.se/k012qi
- Software Engineer – Maps & Geo – Mobile http://grnh.se/4w5iyb
- Software Engineer – Mobile – iOS http://grnh.se/j4gmk5
- Release Engineer http://grnh.se/5fw24x
- Technical Writer http://grnh.se/d86hti
- Full Stack Developer – Analytics http://grnh.se/vfgr0t
- Research Analyst http://grnh.se/r8ukg1
- Agile Coach/ScrumMaster – Team Practices Group http://grnh.se/5h4jdv
- Operations Security Engineer http://grnh.se/m65tn8
- Technical Project Manager http://grnh.se/bu7wlt
- UX Senior Designer http://grnh.se/47xrkn
- UX Senior Design Researcher http://grnh.se/x2nsqv
- UX User Research Recruiter http://grnh.se/sry6g0
- UX Visual Design Fellowship http://grnh.se/783knf
- Mobile Partnerships Regional Manager http://grnh.se/v75upy
- Project Coordinator – Engineering http://grnh.se/d7gjcn
Technical Operations
*Dallas data center* On August 21, our first connectivity to the new Dallas data center (codfw) came online, connecting the new site to the Wikimedia network. The following week, all network equipment was configured to prepare for server installations. The first essential infrastructure services (install server, DNS, monitoring, etc.) were brought online in the days following August 25, and we are now working on deploying the first storage & database servers to start replication & backups from our other data centers.
Labs metrics in August:
- Number of projects: 170
- Number of instances: 480
- Amount of RAM in use (in MBs): 2,116,096
- Amount of allocated storage (in GBs): 22,600
- Number of virtual CPUs in use: 1,038
- Number of users: 3,718
*Wikimedia Labs* Andrew fixed a few sudo policy UI bugs (68834 https://bugzilla.wikimedia.org/show_bug.cgi?id=68834, 61129 https://bugzilla.wikimedia.org/show_bug.cgi?id=61129). Marc improved the DNS cache settings and resolved some long-standing DNS instability (70076 https://bugzilla.wikimedia.org/show_bug.cgi?id=70076). He also set up a new storage server for wiki dumps. This should resolve some long-term storage space problems that led to out-of-date dumps. Andrew laid the groundwork for wikitech to be updated via the standard WMF deployment system. We’re investigating the upstream OpenStack user interface, ‘horizon’ http://horizon-test.wmflabs.org/horizon.
Features Engineering https://www.mediawiki.org/wiki/Wikimedia_Features_engineering
Editor retention: Editing tools
*VisualEditor https://www.mediawiki.org/wiki/VisualEditor* In August, the team working on VisualEditor presented about VisualEditor at Wikimania 2014 https://meta.wikimedia.org/wiki/Wikimania/2014, worked with a number of volunteers at the hackathon, adjusted key workflows for template and citation editing, made major progress on Internet Explorer support, and fixed over 40 bugs and tickets https://bugzilla.wikimedia.org/buglist.cgi?list_id=341542&order=priority%2Cbug_severity&product=VisualEditor&query_format=advanced&resolution=FIXED&target_milestone=VE-deploy-2014-08-14&target_milestone=VE-deploy-2014-08-21&target_milestone=VE-deploy-2014-08-28 .
Users of Internet Explorer 11, whom we previously prevented from using VisualEditor due to some major bugs, can now use it. Support for earlier versions of Internet Explorer will be coming shortly. Similarly, tablet users browsing the site’s mobile mode now have the option of using a mobile-specific form of VisualEditor. More editing tools, and availability of VisualEditor on phones, are planned for the future.
Improvements and updates were made to a number of interface messages as part of our work with translators to improve the software for all users, and VisualEditor and MediaWiki were improved to support highlighting links to disambiguation pages where a wiki or user wishes to do so. Several performance improvements were made, especially to the system around re-using references and reference lists. We tweaked the link editor’s behaviour based on feedback from users and user testing https://www.mediawiki.org/wiki/VisualEditor/Design/User_testing. The deployed version of the code was updated three times in the regular release cycle (1.24-wmf17 https://www.mediawiki.org/wiki/MediaWiki_1.24/wmf17#VisualEditor, 1.24-wmf18 https://www.mediawiki.org/wiki/MediaWiki_1.24/wmf18#VisualEditor and 1.24-wmf19 https://www.mediawiki.org/wiki/MediaWiki_1.24/wmf19#VisualEditor).
*Editing https://www.mediawiki.org/wiki/Editing* In August, the Editing Team presented at Wikimania 2014 https://meta.wikimedia.org/wiki/Wikimania/2014 on better ways to develop and manage front-end software, improved the infrastructure of the key user interface libraries, and continued the planned adjustments to the MediaWiki skins system https://www.mediawiki.org/wiki/Requests_for_comment/Redo_skin_framework.
The TemplateData GUI editor was significantly improved, including being updated to use the new types, and recursive importing of parameters if needed, and deployed on Norwegian Bokmål Wikipedia. The volunteers working on the Math extension (for formulæ) moved closer to deploying the “Mathoid” server that will use MathJax to render clearer formulæ than with the current versions.
The Editing team, as usual, did a lot of work on improving libraries and infrastructure. The OOjs UI https://www.mediawiki.org/wiki/OOjs_UI library was modified to make the isolation of dialogs using <iframe>s optional, and to re-organise the theme system as part of implementing a new look-and-feel for OOUI that is consistent with the planned changes to the MediaWiki design, in collaboration with the Design team. The OOjs https://www.mediawiki.org/wiki/OOjs library was updated to fix a minor bug, with two new versions (v1.0.12 and then v1.1.0) released and pushed downstream into MediaWiki, VisualEditor and OOjs UI.
*Parsoid https://www.mediawiki.org/wiki/Parsoid* In August, we wrapped up our face-to-face off-site meetup in Mallorca and attended Wikimania in London, which was the first Wikimania event for us all. At the Wikimania hackathon, we co-presented (with the Services team) a workshop session about Parsoid and how to use it. We also had a talk at Wikimania about Parsoid.
The GSoC 2014 LintTrap project wrapped up and we hope to develop this further over the coming months, and go live with it later this year.
With an eye towards supporting Parsoid-driven page views, the Parsoid team worked on a few different tracks. We deployed the visual diff mass testing service http://parsoid-tests.wikimedia.org/visualdiff/, added Tidy support to parser tests and updated tests, which now makes it easy for Parsoid to target the PHP Parser + Tidy combo found in production, and continued to make CSS and other fixes.
Services
*Services and REST API https://www.mediawiki.org/wiki/Services* August was mostly a month of travel and vacation for the Services team. We deployed a first prototype of the RESTBase https://github.com/gwicke/restbase storage and API service in Labs http://api.wmflabs.org/v1/en.wikipedia.org/pages/. We also presented on both Parsoid and RESTBase at Wikimania, which was well received. Later in August, computer science student Hardik Juneja joined the team as a part-time contractor. Working from Mumbai, he dived straight into complex secondary index update algorithms in the Cassandra back-end. At the end of the month, design work resumed, with the goal of making RESTBase easier to extend with additional entry points and bucket types.
Core Features
*Flow https://www.mediawiki.org/wiki/Flow/Project_information* In August, the Flow team created a new read/unread state for Flow notifications, to help users keep track of the active discussion topics that they’re subscribed to. There are now two tabs in the Echo notification dropdown, split between Messages (Flow notifications) and Alerts (all of the other Echo notifications). Flow notifications stay unread until the user clicks on the item and visits the topic page, or marks the item as read in the notifications panel. The dropdown is also scrollable now, and holds the 25 most recent notifications. Last, subscribing to a Flow board gives the user a notification when a new topic is created on the board.
Growth
*Growth https://www.mediawiki.org/wiki/Growth* In August, the Growth team vetted CirrusSearch https://meta.wikimedia.org/wiki/Research:Task_recommendations/Qualitative_evaluation_of_morelike as back-end for personalized suggestions and prepared its first A/B test https://meta.wikimedia.org/wiki/Research:Task_recommendations/Experiment_one of the new task recommendations https://www.mediawiki.org/wiki/Task_recommendations system. This test will deliver recommendations to a random sample of newly-registered users on 12 Wikipedias: English, French, German, Spanish, Italian, Hebrew, Persian, Russian, Ukrainian, Swedish, and Chinese. Several Growth team members also attended Wikimania 2014 in London. At Wikimania, the team shared presentations on its work and conducted usability tests https://www.mediawiki.org/wiki/Task_recommendations/Usability_testing of the recommendations system. Last but not least, design work began on the third major iteration of the team’s anonymous editor acquisition https://www.mediawiki.org/wiki/Anonymous_editor_acquisition/Signup_invites_v3 project.
Mobile https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering
*Wikimedia Apps https://www.mediawiki.org/wiki/Wikimedia_Apps* In August, the Mobile Apps Team focussed on bug fixes for the recently released iOS app and for the Android app, as well as gathering user feedback from Wikimania. The team also had unstructured time during Wikimania, in which the engineers are free to work on whatever they fancy. This resulted in numerous code quality improvements on both iOS and Android. On iOS, the unstructured time also spawned a preliminary version of the feature “Nearby”, which lists articles about things that are near you, tells you how near they are to you, and points towards them. On Android, the unstructured time spawned a preliminary version of full text search, an improved searching experience which aims to present more relevant results.
*Mobile web projects https://www.mediawiki.org/wiki/Mobile_web_projects* This month the mobile web team, in partnership with the Editing team, launched a mobile-friendly opt-in VisualEditor for users of the mobile site on tablets. Tablet users can now choose to switch from the default editing experience (wikitext editor) to a lightweight version of VE featuring some common formatting tools (bold and italic text, the ability to add/edit links and references). We also began building a Wikidata contribution game in alpha that will allow users to add metadata to the Wikidata database (to start, occupations of people) directly from the Wikipedia article where the information is contained. We hope to graduate this feature to the beta site next month to get more quantitative feedback on its usage and the quality of contributions.
*Wikipedia Zero & Partnerships* Wikipedia Zero page views held steady at around 70 million in August. We launched Wikipedia Zero with three operators: Smart and Sun in the Philippines (related companies) and Timor Telecom in East Timor. That brings our total to 37 partners in 31 countries. Smart has been collaborating with Wikimedia Philippines for months, and they previously offered free access to Wikipedia on a trial basis. Just announced http://www1.smart.com.ph/About/newsroom/press-releases/2014/09/04/smart-wikimedia-foundation-launch-wikipedia-zero-in-ph----initiative-to-give-70-million-filipinos-free-access-to-knowledge, Smart has now officially joined Wikipedia Zero and brought in their sister brand Sun, covering a combined 70 million subscribers in the Philippines. Timor Telecom launched http://www.timortelecom.tl/index.php?option=com_content&view=article&id=300%3Aparceria-com-a-wikimedia-foundation&catid=65%3Anoticias&Itemid=113&lang=tl Wikipedia Zero with a press event including the Vice Minister of Education and much promotion https://www.youtube.com/watch?v=9YTcbBY9B-M. Timor Telecom is keen to support growth in the Tetun Wikipedia by raising awareness in universities, with resources from the Wikipedia Education Program https://wikimediafoundation.org/wiki/Wikipedia_Education_Program. In Latin America, we made progress toward app preloads by completing testing for the Qualcomm Reference Design (QRD) https://qrd.qualcomm.com/ program. The Wikipedia Android app is now certified for preload on QRD. We made terrific connections with Global South community members at Wikimania, which will lead to more direct local collaboration between partners and Wikimedia communities. Smriti Gupta, partnerships manager for Asia, moved to India where she will work remotely. We’re recruiting our third partnerships manager http://boards.greenhouse.io/wikimedia/jobs/19649?t=v75upy#.VAh-8WRdXEs to cover South East Asia and tech partnerships.
Language Engineering https://www.mediawiki.org/wiki/Wikimedia_Language_engineering
*Language tools https://www.mediawiki.org/wiki/Language_tools* Niklas Laxström (outside his WMF job) completed most of the work needed in Translate to recover gracefully from session expiration https://bugzilla.wikimedia.org/show_bug.cgi?id=69314, a known pain point for translators. The PageMigration feature (a GSoC project mentored by Niklas) was released. The team also worked on session expiry checking (to prevent errors in long translations), updated YAML handling, and deployed auto-translated screenshots for the VisualEditor user guide https://www.mediawiki.org/wiki/Help:VisualEditor/User_guide (a GSoC project mentored by Amir and done by Vikas Yaligar). They did internationalization testing of the new Android and iOS apps, as well as internationalization testing and bug fixes in VisualEditor, MobileFrontend and Flow.
*Milkshake https://www.mediawiki.org/wiki/Milkshake* Webfonts were enabled on the English Wikisource https://bugzilla.wikimedia.org/show_bug.cgi?id=69655 and Divehi wikis https://bugzilla.wikimedia.org/show_bug.cgi?id=69860, following requests from the respective communities.
*Language Engineering Communications and Outreach https://www.mediawiki.org/wiki/Language_engineering_communications_and_outreach* The team was at Wikimania in London. Santhosh Thottingal and Amir Aharoni presented on *Machine-aided machine translation* http://www.youtube.com/watch?v=b6qvv3eJ_Ag&t=32m40s, and Runa Bhattacharjee and Kartik Mistry on *Testing multilingual applications* http://www.youtube.com/watch?v=0hcjZvateZs&t=32m15s. They conducted user testing for ContentTranslation in several languages (Catalan, Spanish, Kazakh, Russian, Bengali, Hebrew, Arabic), continued conversations with translators from Wikipedias in several languages, and published a retrospective https://blog.wikimedia.org/2014/08/26/content-translation-100-published-articles-and-more-to-come/ on ContentTranslation and Wikimania.
*Content translation https://www.mediawiki.org/wiki/Content_translation* The machine translation abuse algorithm was redone. The team also worked on reference adaptation improvements, refactoring the front-end event architecture and rewriting the cxserver registry to support multiple machine translation engines.
Platform Engineering https://www.mediawiki.org/wiki/Wikimedia_Platform_Engineering
MediaWiki Core
*HHVM https://www.mediawiki.org/wiki/HHVM* We migrated test.wikipedia.org to HHVM in early August and saw very few issues. Giuseppe shared some promising benchmarks https://lists.wikimedia.org/pipermail/wikitech-l/2014-August/078034.html. Re-imaging an app server was surprisingly painful, in that Giuseppe and Ori had to perform a number of manual actions to get the server up-and-running, and this sequence of steps was poorly automated. Doing this much manual work per app server isn’t viable.
Mark submitted a series of patches to create a service IP and Varnish back-end for an HHVM app server pool, with Giuseppe and Brandon providing feedback and support. The patch routes requests tagged with a specific cookie to the HHVM back-ends. Tech-savvy editors were invited to opt-in to help with testing by setting the cookie explicitly. The next step after that will be to divert a fraction of general site traffic to those back-ends. The exact date will depend on how many bugs the next round of testing uncovers.
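The cookie-based routing described above can be sketched in a few lines of Python. This is an illustration only, not the actual Varnish configuration: the cookie name `use_hhvm`, the pool names and the `choose_backend` function are hypothetical, and the `fraction` parameter merely models the planned later step of diverting a share of general traffic.

```python
import hashlib
from http.cookies import SimpleCookie

HHVM_POOL = "hhvm"          # hypothetical pool name
DEFAULT_POOL = "default"    # hypothetical pool name
OPT_IN_COOKIE = "use_hhvm"  # hypothetical cookie name


def choose_backend(cookie_header: str, client_id: str = "", fraction: float = 0.0) -> str:
    """Pick a back-end pool for one request.

    Requests carrying the opt-in cookie always go to the HHVM pool.
    Otherwise, a stable hash of client_id sends roughly `fraction`
    of clients to HHVM (0.0 disables this, 1.0 sends everyone).
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    if OPT_IN_COOKIE in cookies:
        return HHVM_POOL

    if fraction > 0.0 and client_id:
        digest = hashlib.sha256(client_id.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big")  # 0 .. 2**64 - 1
        if bucket < fraction * 2**64:
            return HHVM_POOL

    return DEFAULT_POOL
```

Hashing a stable client identifier, rather than drawing a random number per request, keeps a given client on the same pool across requests, which makes bug reports easier to reproduce.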
Tim is looking at modifying the profiling feature of LuaSandbox to work with HHVM; it is currently disabled.
*Admin tools development https://www.mediawiki.org/wiki/Admin_tools_development* Most admin tools resources are currently directed towards SUL finalisation https://www.mediawiki.org/wiki/SUL_finalisation. There was a roundtable https://wikimania2014.wikimedia.org/wiki/Submissions/Roundtable:_Admin_tools_development at Wikimania with developers and admins/tool users discussing some issues they’ve had, and feature requests they would like to see implemented. The GlobalCssJs https://www.mediawiki.org/wiki/Extension:GlobalCssJs extension was deployed to all public Wikimedia wikis, allowing for proper user global CSS and JS.
*Search https://www.mediawiki.org/wiki/Search* We started deploying Cirrus as the primary search back-end to more of the remaining wikis, and we found what looks like our biggest open performance bottleneck. Next month’s goal is to fix it and deploy to more wikis (probably not all). We’re also working on getting more hardware.
*SUL finalisation https://www.mediawiki.org/wiki/SUL_finalisation* The SUL finalisation team continues to work on building tools to support the finalisation. There are four ongoing streams of work, and the team is on track to have the majority of the work completed by the end of September.
- The ability to globally rename users was deployed a while ago, and is currently working excellently!
- The ability to log in with old, pre-finalisation credentials has been developed so that users are not inadvertently locked out of their accounts. From an engineering standpoint, this form is now fully working in our test environment. Right now, the form uses placeholder text; that text needs to be ‘prettified’ so that the users who have been forcibly renamed get the appropriate information on how to proceed after their rename, and more rigorous testing should be done before deployment.
- A form to globally merge users has been developed so that users can consolidate their accounts after the finalisation. From an engineering standpoint, this form is now fully working in our test environment. The form needs design improvements and further testing before it can be deployed.
- A form to request a rename has been developed so that users who do not have global accounts can request a rename, and also so that the workload on the renamers is reduced. From an engineering standpoint, the form to request a rename has been implemented, and implementation has begun on the form that allows renamers to rename users. Once the end-to-end experience has been fully implemented and tested, the form will be ‘prettified’.
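To illustrate the login with pre-finalisation credentials described above, the following sketch maps an old per-wiki username to its post-rename name before authentication proceeds. It is not the actual implementation: the `RENAMES` table, the example names and the `resolve_login_name` function are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical rename log: (wiki, old local name) -> new name after a forced rename.
RENAMES: Dict[Tuple[str, str], str] = {
    ("frwiki", "Alice"): "Alice~frwiki",
}


def resolve_login_name(wiki: str, entered_name: str) -> Tuple[str, bool]:
    """Return (name to authenticate against, whether the account was renamed).

    If the user types a name that was forcibly renamed on this wiki,
    authentication continues against the new name, and the flag lets the
    login form explain the rename instead of showing a generic failure.
    """
    new_name = RENAMES.get((wiki, entered_name))
    if new_name is not None:
        return new_name, True
    return entered_name, False
```

The returned flag is where the ‘prettified’ explanatory text would be shown to users who have been renamed.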
*Security auditing and response https://www.mediawiki.org/wiki/Security_auditing_and_response* We performed security reviews of the Graph, WikibaseQuery and WikibaseQueryEngine extensions. Initial work was done to enable regular dynamic security scanning.
Release Engineering https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team
*Quality Assurance https://www.mediawiki.org/wiki/Quality_Assurance* Having completed the migration of our Continuous Integration infrastructure from a third party host to Wikimedia’s own Jenkins instance, we are thinking about improvements and changes for future work. We aim to improve performance for Jenkins and also for beta labs. We are looking into creating other shared test environments along with beta labs to better support changes like we did this month with HHVM and with a security and performance test project. We also continue to improve the development experience with Vagrant and other virtual machine technologies.
*Browser testing https://www.mediawiki.org/wiki/Quality_Assurance/Browser_testing* This month, we continued to build out and adjust the new browser test builds on Jenkins. We saw updates to tests and issues identified for UploadWizard, VisualEditor, Echo, and MobileFrontend. New tests for GettingStarted pointed out a need to update our Redis storage on the beta cluster. We are currently monitoring an upstream problem with Selenium/Webdriver and IE11 on behalf of VisualEditor, as VE support for IE11 is coming soon.
Multimedia
*Multimedia https://www.mediawiki.org/wiki/Multimedia*
Image: Media Viewer’s new ‘minimal design’ https://commons.wikimedia.org/wiki/File:Media_Viewer_-_New_Design_Proposal_-_Rapa_Nui.png
In August, the multimedia team had extensive discussions with community members about the various projects we are working on. We started with seven different roundtable discussions and presentations https://wikimania2014.wikimedia.org/wiki/Multimedia_Events at Wikimania 2014 in London, including sessions on: Upload Wizard https://wikimania2014.wikimedia.org/wiki/Hackathon#Upload_Wizard_Roundtable , Structured Data https://wikimania2014.wikimedia.org/wiki/Hackathon#Structured_Data_Roundtable , Media Viewer https://wikimania2014.wikimedia.org/wiki/Hackathon#Media_Viewer_Roundtable , Multimedia https://wikimania2014.wikimedia.org/wiki/Submissions/Multimedia_Roundtable , Community https://wikimania2014.wikimedia.org/wiki/Multimedia_Community_Meetup and Kindness https://wikimania2014.wikimedia.org/wiki/Submissions/A_Culture_of_Kindness. To address issues raised in recent Requests for Comments, we also hosted a one-week Media Viewer Consultation https://meta.wikimedia.org/wiki/Community_Engagement_(Product)/Media_Viewer_consultation, inviting suggestions from community members across our sites.
The team also worked to make Media Viewer https://www.mediawiki.org/wiki/Multimedia/About_Media_Viewer easier to use by readers and casual editors, our primary target users for this tool. To that end, we created a new ‘minimal design’ including a number of new improvements https://www.mediawiki.org/wiki/Multimedia/Media_Viewer/Improvements such as a more prominent button linking to the File: page, an easier way to enlarge images and more informative captions. These new features were prototyped http://multimedia-alpha.wmflabs.org/wiki/Rapa_Nui_National_Park and carefully tested this month to validate their effectiveness. Testers easily completed most of the tasks we gave them, suggesting that the new features are now usable by target users, and ready for development in September.
This month, we prepared a first plan for the Structured Data https://commons.wikimedia.org/wiki/Commons:Structured_data project, in collaboration with many community members and the Wikidata team https://www.wikidata.org/wiki/: we propose to gradually implement machine-readable data on Wikimedia Commons, starting with small experiments in the fall, followed by a wider deployment in 2015. We also continued our code refactoring for the UploadWizard https://www.mediawiki.org/wiki/UploadWizard, and fixed more bugs across our multimedia platform. To keep up with our work, join the multimedia mailing list https://lists.wikimedia.org/mailman/listinfo/multimedia.
Engineering Community Team https://www.mediawiki.org/wiki/Engineering_Community_Team
*Bug management https://www.mediawiki.org/wiki/Bug_management* Daniel made Bugzilla use ssl_ciphersuite to add HSTS https://gerrit.wikimedia.org/r/#/c/154978/ and removed a superfluous STS header setting https://gerrit.wikimedia.org/r/#/c/155649/. Andre worked around a Bugzilla XML RPC API issue https://bugzilla.wikimedia.org/show_bug.cgi?id=69747 which created problems for exporting Bugzilla data for a Phabricator import. In Bugzilla’s taxonomy (components, descriptions, default CCs, etc.) some smaller changes took place https://bugzilla.wikimedia.org/buglist.cgi?bug_id=69882,69696,69631,69427,69343,69339,69011,69009,68318,64621,54871,70129,70192 .
*Phabricator migration https://www.mediawiki.org/wiki/Phabricator/Migration* The project is getting close to Day 1 of a Wikimedia Phabricator production instance. For better overview and tracking, the Wikimedia Phabricator Day 1 project was split into three projects: Day 1 of a Phabricator Production instance in use https://phabricator.wikimedia.org/tag/phabricator-production-instance, Bugzilla migration https://phabricator.wikimedia.org/tag/bugzilla-migration, and RT migration https://phabricator.wikimedia.org/tag/rt-migration/. Furthermore, the overall schedule was clarified https://phabricator.wikimedia.org/T174. In the last month, security- and permission-related requirements were implemented (granular file permissions and upload defaults https://secure.phabricator.com/T4589, enforcing that policy https://phabricator.wikimedia.org/T298, and making file data inaccessible and not only undiscoverable https://secure.phabricator.com/T5685). Upstream, Mukunda added an API to create projects https://phabricator.wikimedia.org/T318 and Chase added support for mailing lists as watching users https://secure.phabricator.com/D10193. Chase worked on and tested the security https://phabricator.wikimedia.org/T50 and data migration https://phabricator.wikimedia.org/T259 logic. Mukunda continued to work on getting the MediaWiki OAuth provider merged into upstream https://phabricator.wikimedia.org/T368. Chase and Mukunda also worked on the Project Policy Enforcer action for Herald https://gerrit.wikimedia.org/r/#/c/154850, providing a user-friendly dropdown menu to restrict ticket access when creating the ticket. A separate domain for user content was purchased https://phabricator.wikimedia.org/T367. Chase also worked on the scripts to export and import data https://git.wikimedia.org/tree/phabricator%2Ftools.git between the systems, on support for external users in Phabricator https://phabricator.wikimedia.org/T52, and on the related mail setup https://phabricator.wikimedia.org/T244.
Chase and Chad also took a look at setting up Elasticsearch for Phabricator https://phabricator.wikimedia.org/T378.
*Mentorship programs https://www.mediawiki.org/wiki/Mentorship_programs* All Google Summer of Code https://www.mediawiki.org/wiki/Google_Summer_of_Code_2014 and FOSS Outreach Program for Women https://www.mediawiki.org/wiki/FOSS_Outreach_Program_for_Women/Round_8 projects were evaluated by their mentors as PASSED, although many were still waiting for completion, code reviews and merges. We hosted a wrap-up IRC meeting https://meta.wikimedia.org/wiki/GSoC_%26_FOSS_OPW_wrap-up_meeting with the participation of all teams except one. We are still waiting for some final reports from the interns. In the meantime, you can check their weekly reports:
- Tools for mass migration of legacy translated wiki content https://www.mediawiki.org/wiki/Extension:Translate/Mass_migration_tools/Project_updates
- Wikidata annotation tool https://www.mediawiki.org/wiki/Wikidata_annotation_tool/updates
- Email bounce handling to MediaWiki with VERP https://www.mediawiki.org/wiki/VERP/GSOC_Progress_Rerport
- Google Books, Internet Archive, Commons upload cycle https://www.mediawiki.org/wiki/Google_Books,_Internet_Archive,_Commons_upload_cycle/Progress
- UniversalLanguageSelector fonts for Chinese wikis https://www.mediawiki.org/wiki/Extension:UniversalLanguageSelector/Fonts_for_Chinese_wikis#Weekly_Report
- MassMessage page input list improvements https://www.mediawiki.org/wiki/Extension:MassMessage/Page_input_list_improvements/Progress_reports
- Book management in Wikibooks/Wikisource https://meta.wikimedia.org/wiki/Book_management_2014/Progress
- Parsoid-based online-detection of broken wikitext https://www.mediawiki.org/wiki/User:Hardik95/GSoC_2014_Progress_Report
- Usability improvements for the Translate extension https://www.mediawiki.org/wiki/User:Kunalgrover05/Progress_Report
- A modern, scalable and attractive skin for MediaWiki https://www.mediawiki.org/wiki/User:Jack_Phoenix/GSoC_2014
- Automatic cross-language screenshots for user documentation https://www.mediawiki.org/wiki/Automatic_cross-language_screenshots/progress
- Separating skins from core MediaWiki https://www.mediawiki.org/wiki/Separating_skins_from_core_MediaWiki/Progress
- Chemical Markup support for Wikimedia Commons https://www.mediawiki.org/wiki/Chemical_Markup_support_for_Wikimedia_Commons/Internship_Report
- Improving URL citations on Wikimedia https://www.mediawiki.org/wiki/User:Mvolz/Weekly_Reports
- Historical OpenStreetMap https://www.mediawiki.org/wiki/User:JaimeLyn/Weekly_Reports
- Welcoming new contributors to Wikimedia Labs and Tool Labs https://www.mediawiki.org/wiki/Welcome_to_labs/Progress_Reports
- Evaluating, documenting, and improving MediaWiki web API client libraries https://www.mediawiki.org/wiki/Evaluating_and_Improving_MediaWiki_web_API_client_libraries/Progress_Reports
- Feed the Gnomes – Wikidata Outreach https://www.mediawiki.org/wiki/User:Thepwnco/OPW_Reporting
- Template Matching for RDFIO https://www.mediawiki.org/wiki/Extension:RDFIO/Template_matching_for_RDFIO/Reports
- Switching Semantic Forms Autocompletion to Select2 https://www.mediawiki.org/wiki/Extension:Semantic_Forms/Select2_for_autocompletion/Progress_Report
- Catalogue for MediaWiki Extensions https://www.mediawiki.org/wiki/User:Adi.iiita/Gsoc2014/Report#Weekly_Report
- Generic, efficient localisation update service https://www.mediawiki.org/wiki/Extension:LocalisationUpdate/LUv2/Updates
*Technical communications https://www.mediawiki.org/wiki/Technical_communications*
In August, Guillaume Paumier https://www.mediawiki.org/wiki/User:Guillaume_(WMF) attended the Wikimania conference and the associated hackathon. He gave a talk https://wikimania2014.wikimedia.org/wiki/Submissions/Tech_news about Tech News https://meta.wikimedia.org/wiki/Tech/News (video available on YouTube https://www.youtube.com/watch?v=rqGDTNkVgLI&list=UURXe4cgJPTVHcDH6ZGwOT3A#t=9m15s ) and created a poster summarizing the talk. He also continued to write and distribute Tech News every week, and started to contribute to the Structured data https://commons.wikimedia.org/wiki/Commons:Structured_data project.
*Volunteer coordination and outreach https://www.mediawiki.org/wiki/Volunteer_coordination_and_outreach*
We ran the Wikimania Hackathon https://wikimania2014.wikimedia.org/wiki/Hackathon in an unconference manner together with the Wikimania organizers. The event went well in a unique venue, and we are compiling a list of lessons learned to be applied in future events. Together with other former organizers of hackathons, we decided that the next Wikimedia Hackathon in Europe will be organized by Wikimedia France (details coming soon). Also at Wikimania, Quim Gil gave a talk about The Wikimedia Open Source Project and You https://wikimania2014.wikimedia.org/wiki/Submissions/The_Wikimedia_open_source_project_and_you (video https://www.youtube.com/watch?v=c5tJdQCnGWQ&list=UURXe4cgJPTVHcDH6ZGwOT3A#t=3211 , slides https://commons.wikimedia.org/wiki/File:The_Wikimedia_Open_Source_Project_and_You.pdf ).
Analytics https://www.mediawiki.org/wiki/Analytics
*Wikimetrics https://www.mediawiki.org/wiki/Analytics/Wikimetrics*
Following the prototype built for Wikimania, the team identified many performance issues in Wikimetrics when backfilling Editor Engagement Vital Signs (EEVS) data. The team spent a sprint implementing performance enhancements and properly managing database sessions. As a result, Wikimetrics is now better at running recurring reports concurrently and at handling replication lag in the slave DBs.
*Data Processing https://www.mediawiki.org/wiki/Analytics/Data_Processing*
The team continued monitoring analytics systems and responding to issues when (non-critical) alarms went off. Packet losses and Kafka issues were diagnosed and handled.
Hadoop worker nodes now automatically set memory limits according to what is available. Previously all workers had the same fixed limit. This allows for better resource utilization.
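The difference between a fixed limit and a per-node limit can be illustrated with a minimal sketch. The report does not say how the limit is actually derived, so the rule below (total memory minus a fixed reserve) and every name and number in it are assumptions chosen for illustration only.

```python
# Hypothetical sketch: derive a per-node memory limit from the memory
# each worker actually has, instead of one fixed value for all workers.
# The reserve size and the node names are assumptions, not values from
# the report.

RESERVED_MB = 4096  # assumed headroom for the OS and Hadoop daemons


def worker_memory_limit_mb(total_mb: int, reserved_mb: int = RESERVED_MB) -> int:
    """Return the memory (in MB) a worker may hand out to jobs."""
    if total_mb <= reserved_mb:
        raise ValueError("node has no memory left after the reserve")
    return total_mb - reserved_mb


# Two hypothetical workers with different amounts of RAM.
nodes = {"worker-a": 65536, "worker-b": 49152}
limits = {name: worker_memory_limit_mb(mb) for name, mb in nodes.items()}
```

With a single fixed limit, every worker would be capped at what the smallest node can afford; computing the limit per node lets larger machines offer more memory to jobs. A value like this would typically feed a setting such as YARN's `yarn.nodemanager.resource.memory-mb`, although whether that is the setting the team adjusted is also an assumption.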
Logstash is now available at https://logstash.wikimedia.org (Wikitech account required). Logs from Hadoop are piped there for easier search and diagnosis of Hadoop jobs.
Some uses of udp2log were migrated to kafkatee, which is not prone to packet losses. In particular, Webstatscollector was switched over, and its error rates dropped drastically. Eventually, the “collecting” part of Webstatscollector will be implemented in Hadoop, a much more scalable environment for such work.
*Editor Engagement Vital Signs https://www.mediawiki.org/wiki/Analytics/Editor_Engagement_Vital_Signs*
The team implemented the stack necessary to load EEVS in a browser and has a rough implementation of the UI according to Pau’s design http://pauginer.github.io/prototypes/analytics-dashboard/index.html . The team also made available to EEVS two metrics already implemented on Wikimetrics: number of pages created, and number of edits.
*Research and Data https://www.mediawiki.org/wiki/Analytics/Research_and_Data*
This month we hosted the WikiResearch hackathon https://meta.wikimedia.org/wiki/Research:Labs2/Hackathons/August_6-7th,_2014 , a dedicated research track of the Wikimania hackathon. Three demos https://www.youtube.com/watch?v=vhRifTmPfSI of research code libraries were broadcast during the event, and several research ideas https://meta.wikimedia.org/wiki/Category:Labs2_project_ideas were filed on Meta. Highlights from the hackathon include: Quarry http://quarry.wmflabs.org/ (a web client to query Wikimedia’s slave databases on Labs); wpstubs https://github.com/theopolisme/wpstubs (a social media bot broadcasting https://twitter.com/wpstubs newly categorized stubs on the English Wikipedia); and an algorithmic classification https://meta.wikimedia.org/wiki/Research:Screening_WikiProject_Medicine_articles_for_quality of articles due to be re-assessed from the English Wikipedia WikiProject Medicine’s stubs.
We gave or participated in 8 presentations https://www.mediawiki.org/wiki/Analytics/Wikimania_2014 during the main conference.
We published a report on mobile trends https://meta.wikimedia.org/wiki/Research:Mobile_trends expanding the data https://commons.wikimedia.org/wiki/File:Wikimedia_Mobile_Trends.pdf presented at the July 2014 Monthly Metrics meeting. We started work on referral parsing https://github.com/Ironholds/Rferer from request log data to study trends in referred traffic over time.
We generated sample data https://trello.com/c/VW6LMOrq/325-edit-conflict-instrumentation of edit conflicts and worked on scripts for robust revert detection. We published traffic data http://stats.wikimedia.org/wikimedia/pageviews/categorized/wp-medicin/WikiProject_Medicine_Translations_2014-07.html for the Medicine Translation Taskforce, with a particular focus on traffic to articles related to Ebola https://docs.google.com/a/wikimedia.org/document/d/1mSw9kldXtv5tiDk24s1v8muhente-pWHsFHV7USvS_o/edit .
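The report does not describe how the revert detection scripts work. One common approach is identity revert detection: a revision counts as a revert when its content checksum matches that of a recent earlier revision, and the revisions in between are treated as reverted. The sketch below illustrates that general technique only; the function name, the input format, and the default radius are assumptions, not the team's implementation.

```python
import hashlib
from collections import deque


def detect_identity_reverts(revisions, radius=15):
    """Yield (reverting_id, reverted_ids, reverted_to_id) tuples.

    `revisions` is an iterable of (rev_id, text) pairs for one page, in
    chronological order. `radius` is how many preceding revisions are
    searched for identical content (an assumed default).
    """
    recent = deque(maxlen=radius)  # (rev_id, checksum) of the last revisions
    for rev_id, text in revisions:
        checksum = hashlib.sha1(text.encode("utf-8")).hexdigest()
        history = list(recent)
        # Search from the newest revision backwards for identical content.
        for idx in range(len(history) - 1, -1, -1):
            past_id, past_checksum = history[idx]
            if past_checksum == checksum:
                reverted = [r for r, _ in history[idx + 1:]]
                if reverted:  # skip null edits that revert nothing
                    yield (rev_id, reverted, past_id)
                break
        recent.append((rev_id, checksum))


# Hypothetical page history: revision 3 restores the text of revision 1.
sample = [(1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "c")]
reverts = list(detect_identity_reverts(sample))
```

Comparing checksums rather than full text keeps the window small in memory, and the radius bounds how far back a match is still considered a revert rather than a coincidence.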
We wrote up a research proposal for task recommendations https://meta.wikimedia.org/wiki/Research:Task_recommendations in support of the Growth team’s experiments on recommender systems. We analyzed qualitative data to assess the performance of the CirrusSearch “morelike” feature for identifying articles in similar topic areas https://meta.wikimedia.org/wiki/Research:Task_recommendations/Qualitative_evaluation_of_morelike . We provided support for the experimental design of a first test of task recommendations https://meta.wikimedia.org/wiki/Research:Task_recommendations/Experiment_one . We analyzed the results of the second experiment on anonymous editor acquisition https://meta.wikimedia.org/wiki/Research:Asking_anonymous_editors_to_register/Study_2 run by the Growth team.
We hosted the August 2014 research showcase https://meta.wikimedia.org/wiki/Analytics/Research_and_Data/Showcase#August_2014 with a presentation by Oliver Keyes on circadian patterns in mobile readership https://commons.wikimedia.org/wiki/File:Everything_You_Know_About_Mobile_Is_Wrong.pdf and a guest talk by Morten Warncke-Wang on quality assessment and task recommendations in Wikipedia https://commons.wikimedia.org/wiki/File:Wikipedia_Article_Curation_-_Understanding_Quality,_Recommending_Tasks_(WMF_Research_Showcase_Aug_2014).pdf .
We also gave presentations on Wikimedia research at the Oxford Internet Institute, INRIA, Wikimedia Deutschland (slides https://commons.wikimedia.org/wiki/File:Managing_Open_Production_at_Scale_(slides).pdf ) and at the Public Library of Science (slides http://www.slideshare.net/dartar/crossing-the-streams-social-and-technical-interfaces-between-wikimedia-and-open-access-publishing ). Aaron Halfaker presented at OpenSym 2014 a paper he co-authored on the impact of the Articles for Creation workflow on newbies (slides https://commons.wikimedia.org/wiki/File:Accept,_Decline,_Postpone_(OpenSym%2714_presentation_slides).pdf , fulltext http://www.opensym.org/os2014-files/proceedings/p602.pdf ).
Wikidata https://meta.wikimedia.org/wiki/Wikidata
*The Wikidata project is funded and executed by Wikimedia Deutschland https://meta.wikimedia.org/wiki/Wikimedia_Deutschland/en.*
August was a very busy month for Wikidata. The main page was redesigned and is now much more inviting and useful. A lot of new features were finished and deployed. Among them are:
- Redirects: allowing you to turn an item into a redirect.
- Monolingual text datatype: allowing you to enter new kinds of data like the motto of a country.
- Badges: allowing you to store badges for articles on Wikidata. This includes “featured article” and “good article”. More will be added soon.
- *In other projects* sidebar as a beta feature: allowing you to show links to sister projects in the sidebar of any article.
- Special:GoToLinkedPage: allowing you to go to a Wikipedia page based on its Wikidata Q-ID. This will be especially useful if you want to create links to articles that don’t change even if the article is moved.
- Wikinews: Wikinews has been added as a supported sister project. Wikinews can now maintain their sitelinks on Wikidata. Access to the other data will follow in due time.
- Wikidata: Sitelinks to pages on Wikidata itself can now also be stored on Wikidata. This is useful to connect for example its help pages with those on the other projects.
- Change of the internal serialization format: The internal serialization format changed to be consistent with the serialization format that is returned by the API.
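As an illustration of the Special:GoToLinkedPage item in the list above, here is a minimal sketch of how such a stable link could be built. It assumes the special page takes the form `Special:GoToLinkedPage/<site id>/<item id>` on www.wikidata.org; the report does not spell out the syntax, and the helper name is hypothetical.

```python
from urllib.parse import quote

# Assumed base URL for special pages on Wikidata.
WIKIDATA_BASE = "https://www.wikidata.org/wiki/"


def linked_page_url(site_id: str, item_id: str) -> str:
    """Build a link that resolves to the page linked to a Wikidata item.

    `site_id` is a site identifier such as "enwiki"; `item_id` is a
    Q-ID such as "Q42". The path layout is an assumption.
    """
    if not (item_id.startswith("Q") and item_id[1:].isdigit()):
        raise ValueError(f"not a Wikidata Q-ID: {item_id!r}")
    return f"{WIKIDATA_BASE}Special:GoToLinkedPage/{quote(site_id, safe='')}/{item_id}"


url = linked_page_url("enwiki", "Q42")
```

Because the link is keyed on the Q-ID rather than the article title, it keeps working when the article is moved, which is the use case the list item describes.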
In addition, the team worked on a lot of under-the-hood changes towards the new user interface design and started the discussions around structured data support for Commons. The log of the IRC office hour https://meta.wikimedia.org/wiki/IRC_office_hours/Office_hours_2014-09-03 is available.
Future
The engineering management team continues to update the *Deployments https://wikitech.wikimedia.org/wiki/Deployments* page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the *annual goals https://www.mediawiki.org/wiki/Wikimedia_Engineering/2014-15_Goals*, listing ongoing and future Wikimedia engineering efforts.
------------------------------
*This article was written collaboratively by Wikimedia engineers and managers. See revision history https://www.mediawiki.org/w/index.php?title=Wikimedia_engineering_report/2014/August&action=history and associated status pages. A wiki version https://www.mediawiki.org/wiki/Wikimedia_engineering_report/2014/August is also available.*