Wikitech-l July 2017

wikitech-l@lists.wikimedia.org

57 participants
59 discussions

How to make commits from mirrored Wikimedia repos show up on your GitHub profile
by Bartosz Dziewoński 27 Jun '24

I know it has been annoying a couple of people other than me, so now that I've learned how to make it work I'll share the knowledge here.

tl;dr: Star the repositories. No, seriously. (And yes, you need to star each extension repo separately.)

(Is there a place on mw.org to put this tidbit on?)

------- Forwarded message -------
From: "Brian Levine" <support(a)github.com> (GitHub Staff)
To: matma.rex(a)gmail.com
Cc:
Subject: Re: Commits in mirrored repositories not showing up on my profile
Date: Tue, 09 Jul 2013 06:47:19 +0200

Hi Bartosz

In order to link your commits to your GitHub account, you need to have some association with the repository other than authoring the commit. Usually, having push access gives you that connection. In this case, you don't have push permission, so we don't link you to the commit.

The easy solution here is for you to star the repository. If you star it - along with the other repositories that are giving you this problem - we'll see that you're connected to the repository and you'll get contribution credit for those commits.

Cheers
Brian

--
Matma Rex
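Since each mirrored repository has to be starred separately, the starring can be scripted. The following is only a minimal sketch: it assumes GitHub's REST endpoint for starring a repository (PUT /user/starred/{owner}/{repo}) and a personal access token, neither of which is mentioned in the message above, and the repository names in MIRRORED_REPOS are hypothetical placeholders. The requests are built but not sent.

```python
import urllib.request

GITHUB_API = "https://api.github.com"

# Hypothetical mirror names, for illustration only; substitute the
# repositories you have actually committed to.
MIRRORED_REPOS = ["mediawiki", "mediawiki-extensions-Example"]


def build_star_request(owner, repo, token):
    """Build (but do not send) the request that stars owner/repo.

    Assumes the REST endpoint PUT /user/starred/{owner}/{repo} and
    token-based authentication.
    """
    return urllib.request.Request(
        url=f"{GITHUB_API}/user/starred/{owner}/{repo}",
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Length": "0",
        },
    )


def build_all(token, owner="wikimedia"):
    """Return one star request per repository listed in MIRRORED_REPOS."""
    return [build_star_request(owner, name, token) for name in MIRRORED_REPOS]
```

To actually star the repositories, each request would be passed to urllib.request.urlopen() with a real token.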

Research FAQ gets a facelift
by Dario Taraborelli 25 Jun '24

We just released a new version of Research:FAQ on Meta [1], significantly expanded and updated, to make our processes at WMF more transparent and to meet an explicit FDC request to clarify the role and responsibilities of individual teams involved in research across the organization.

The previous version – written from the perspective of the (now inactive) Research:Committee, and mostly obsolete since the release of WMF's open access policy [2] – can still be found here [3].

Comments and bold edits to the new version of the document are welcome. For any question or concern, you can drop me a line or ping my username on-wiki.

Thanks,
Dario

[1] https://meta.wikimedia.org/wiki/Research:FAQ
[2] https://wikimediafoundation.org/wiki/Open_access_policy
[3] https://meta.wikimedia.org/w/index.php?title=Research:FAQ&oldid=15176953

Dario Taraborelli
Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter <http://twitter.com/readermeter>

bluejeans
by Jeremy Baron 22 Feb '23

Hi,

On Tue, Mar 1, 2016 at 3:36 PM, David Strine <dstrine(a)wikimedia.org> wrote:
> We will be holding this brownbag in 25 minutes. The Bluejeans link has
> changed:
>
> https://bluejeans.com/396234560

I'm not familiar with bluejeans and maybe have missed a transition because I wasn't paying enough attention. is this some kind of experiment? have all meetings transitioned to this service?

anyway, my immediate question at the moment is how do you join without sharing your microphone and camera?

am I correct thinking that this is an entirely proprietary stack that's neither gratis nor libre and has no on-premise (not cloud) hosting option? are we paying for this?

-Jeremy

New DB_REPLICA constant; DB_SLAVE deprecated
by Aaron Schulz 04 Jul '20

As of 950cf6016c, the mediawiki/core repo was updated to use DB_REPLICA instead of DB_SLAVE, with the old constant left as an alias. This is part of a string of commits that cleaned up the mixed use of "replica" and "slave" by sticking to the former. Extensions have not been mass converted. Please use the new constant in any new code.

The word "replica" is a bit more indicative of a broader range of DB setups*, is used by a range of large companies**, and is more neutral in connotations.

Drupal and Django made similar updates (even replacing the word "master"):
* https://www.drupal.org/node/2275877
* https://github.com/django/django/pull/2692/files & https://github.com/django/django/commit/beec05686ccc3bee8461f9a5a02c607a023…

I don't plan on doing anything to DB_MASTER, since it seems fine by itself, like "master copy", "master tape" or "master key". This is analogous to a master RDBMS database. Even multi-master RDBMS systems tend to have a stronger consistency than classic RDBMS slave servers, and present themselves as one logical "master" or "authoritative" copy. Even in its personified form, a "master" database can readily be thought of as analogous to "controller", "governor", "ruler", lead "officer", or such.***

* clusters using two-phase commit, galera using certification-based replication, multi-master circular replication, etc.
** https://en.wikipedia.org/wiki/Master/slave_(technology)#Appropriateness_of_…
*** http://www.merriam-webster.com/dictionary/master?utm_campaign=sd&utm_medium…

--
-Aaron
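Because extensions were not mass converted, maintainers may want to find and rename remaining uses of the old constant themselves. The helper below is a rough sketch of one way to do that, not a tool from the mediawiki/core change; the function name and the PHP snippet in the usage note are illustrative assumptions. It is purely textual, so it will also rewrite matches inside comments and string literals.

```python
import re

# Match the bare constant only, not longer identifiers that merely
# contain it (for example MY_DB_SLAVE_HOST).
DB_SLAVE_PATTERN = re.compile(r"(?<![A-Za-z0-9_])DB_SLAVE(?![A-Za-z0-9_])")


def migrate_source(php_source):
    """Return (new_source, replacement_count) with DB_SLAVE renamed.

    Hypothetical helper: it does not parse PHP, it only substitutes text.
    """
    return DB_SLAVE_PATTERN.subn("DB_REPLICA", php_source)
```

For instance, migrate_source("$dbr = wfGetDB( DB_SLAVE );") yields the same statement with DB_REPLICA and a count of 1. Since the old constant remains as an alias, unconverted code keeps working in the meantime.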

Roadmap for CX?
by Strainu 06 Sep '18

Following the recent outage, we've had a new series of complaints about the lack of improvements in CX, especially related to server-side activities like saving/publishing pages.

Now, I know the team is involved in a long-term effort to merge the editor with the VE, but is there an end in sight for that effort? Can I tell people who ask "look, 6 more months then we'll have a much better translation tool"?

Is there a publicly available roadmap for this project and more generally, for CX?

Thanks,
Strainu

Upgrade of QUnit from 1.x to 2.x is underway
by Krinkle 24 Oct '17

TL;DR: MediaWiki core is upgrading its version of QUnit from 1.x to 2.x. This means extensions or skins with QUnit tests must now be compatible with 2.x. See https://phabricator.wikimedia.org/T170515 and https://qunitjs.com/upgrade-guide-2.x/.

Hi all,

### Deprecated API

In 2014, QUnit started to overhaul its API, to be more robust and better support async workflows. The most notable change was the removal of global and static functions, in favour of more contextual methods. The first part of this was released in 1.15, and more was gradually introduced in later releases.

The vast majority of our codebases are already using the new interfaces. In fact, the vast majority of our QUnit tests were written after 2014 and never used the old interfaces in the first place.

For a short list of removed functions, see https://phabricator.wikimedia.org/T170515.

If you find that a QUnit Jenkins job for a MediaWiki extension or skin repo starts failing, it is most likely due to this. Look for errors such as "QUnit.start undefined", "test.callback is not a function", "QUnit.asyncTest is undefined", and "QUnit.push is undefined".

There are also some methods that have been deprecated over the past few years. These still work in QUnit 2. Please take a moment to familiarise yourself with the renamed methods and new methods. Doing so will avoid confusion when reading new code that uses them. See https://qunitjs.com/upgrade-guide-2.x/.

### New features

The 'setup' and 'teardown' module hooks are now called 'beforeEach' and 'afterEach'. The old names still work, but the new names clarify that these hooks are run for each QUnit.test().

QUnit 2.0 also adds new 'before' and 'after' hooks, which run only once per module. This is somewhat analogous to use of setUpBeforeClass() in PHPUnit.

Since QUnit 1.16, QUnit.test() supports returning a Promise from the test callback. This automatically attaches an assert.async() handler and waits for the promise to complete. It also asserts that the Promise will be resolved, and fails the test if rejected. This helps avoid a common pitfall where a test could time out when forgetting to attach a promise.fail() handler.

## Upgrade

The version used on Special:JavaScriptTest has already been upgraded (https://gerrit.wikimedia.org/r/365757).

The copy used for command-line usage (grunt-karma) and Jenkins will be upgraded by https://gerrit.wikimedia.org/r/367838

--
Timo
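To check an extension or skin before its Jenkins job starts failing, the test files can be scanned for the calls named in the error messages quoted above. This is a rough sketch under stated assumptions: the list of names is taken only from those error messages (the upgrade guide remains the authoritative list), the function name is hypothetical, and a hit is a hint to review the line rather than proof of breakage.

```python
import re

# Names taken from the error messages quoted in the announcement;
# this is not a complete list of what QUnit 2.x removed.
REMOVED_IN_QUNIT_2 = ("QUnit.asyncTest", "QUnit.start", "QUnit.push")


def find_removed_calls(js_source):
    """Return (line_number, api_name) pairs for calls to the listed APIs."""
    hits = []
    for line_no, line in enumerate(js_source.splitlines(), start=1):
        for name in REMOVED_IN_QUNIT_2:
            # Require an opening parenthesis so that QUnit.pushFailure(...)
            # is not reported as QUnit.push.
            if re.search(r"(?<![\w.])" + re.escape(name) + r"\s*\(", line):
                hits.append((line_no, name))
    return hits
```

Running this over each file under an extension's QUnit test directory gives a quick list of lines to compare against the upgrade guide.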

Wikiscan statistics tool for Wikimedia projects
by Pine W 14 Aug '17

Wikiscan is an interesting tool for statistics fans. I suggest briefly reading this IEG page <https://meta.wikimedia.org/wiki/Grants:IEG/Wikiscan_multi-wiki>, then playing with the tool on https://wikiscan.org/

Pine

More Detailed Browser Stats for Desktop Sites
by Nuria Ruiz 12 Aug '17

Hello:

Please take a look at the new browser report with more detailed desktop site data (all Wikimedia projects aggregated):

https://analytics.wikimedia.org/dashboards/browsers/#desktop-site-by-browser

Some highlights:

* Data is very stable over the last year
* Chrome is in the lead with 45% of traffic, followed by IE (18%) and FF (13%)
* The bulk of IE traffic is IE11 and IE7
* Edge shows up with 4%, slowly catching up to Safari (5%)
* This data is still subject to fluctuations due to bot traffic not identified as such. We will be working on this next year.

Thanks,

Nuria

Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018
by Subramanya Sastry 11 Aug '17

How to read this post?
----------------------
* For those without time to read lengthy technical emails, read the TL;DR section.
* For those who don't care about all the details but want to help with this project, you can read sections 1 and 2 about Tidy, and then skip to section 7.
* For those who like all their details, read the post in its entirety, and follow the links.

Please ask follow-up questions on wiki *on the FAQ’s talk page* [0]. If you find a bug, please report it *on Phabricator or on the page mentioned above*.

TL;DR
-----
The Parsing team wants to replace Tidy with a RemexHTML-based solution on the Wikimedia cluster by June 2018. This will require editors to fix pages and templates to address wikitext patterns that behave differently with RemexHTML. Please see the 'What editors will need to do' section on the Tidy replacement FAQ [1].

1. What is Tidy?
----------------
Tidy [2] is a library currently used by MediaWiki to fix some HTML errors found in wiki pages.

Badly formed markup is common on wiki pages when editors use HTML tags in templates and on the page itself. (Ex: unclosed HTML tags, such as a <small> without a </small>, are common). In some cases, MediaWiki can generate erroneous HTML by itself. If we didn't fix these before sending it to browsers, some would display things in a broken way to readers.

But Tidy also does other "cleanup" on its own that is not required for correctness. Ex: it removes empty elements and adds whitespace between HTML tags, which can sometimes change rendering.

2. Why replace it?
------------------
Since Tidy is based on HTML4 semantics and the Web has moved to HTML5, it also makes some incorrect changes to HTML to 'fix' things that used to not work; for example, Tidy will unexpectedly move a bullet list out of a table caption even though that's allowed. HTML4 Tidy is no longer maintained or packaged. There have also been a number of bug reports filed against Tidy [3]. Since Parsoid is based on HTML5 semantics, there are differences in rendering between Parsoid's rendering of a page and the current read view that is based on Tidy.

3. Project status
-----------------
Given all these considerations, the Parsing team started work to replace Tidy [4] around mid-2015. Tim Starling started this work and, after a survey of existing options, decided to write a wrapper over a Java-based HTML5 parser. At the time we started the project, we thought we could probably have Tidy replaced by mid-2016. Alas!

4. What is replacing Tidy?
--------------------------
Tidy will be replaced by a RemexHTML-based solution that uses the RemexHTML [5] library along with some Tidy-compatibility shims to ensure better parity with the current rendering. RemexHTML is a PHP library that Tim wrote with C.Scott’s input that implements the HTML5 parsing spec.

5. Testing and followup
-----------------------
We knew that some pages would be affected and need fixing due to this change. In order to more precisely identify what that would be, we wanted to do some thorough testing. So, we built some new tools [6][7] and overhauled and upgraded other test infrastructure [8][9] to let us evaluate the impacts of replacing Tidy (among other such things in the future), which could be the subject of a post all on its own.

You can find the details of our testing on the wiki [1][10], but we found that a large number of pages had rendering differences. We analyzed the results and categorized the sources of the differences. Based on that, to ease the process of replacement, we added a bunch of compatibility shims to mimic what Tidy does. I am skipping the details in this post. Even with those shims, newer testing showed that a few patterns remain that need fixing and that we cannot / don't want to work around automatically.

6. Tools to assist editors: Linter & ParserMigration
----------------------------------------------------
In October 2016, at the parsing team offsite, Kunal ([[User:Legoktm (WMF)]]) dusted off the stalled wikitext linting project [11] and (with help from a bunch of people in the Parsoid, db, security, and code review areas) built the Linter extension, which surfaces wikitext errors that Parsoid knows about so that editors can fix them.

Earlier this year, we decided to use Linter in service of Tidy replacement. Based on our earlier testing results, we have added a set of high-priority linter categories that identify specific wikitext markup patterns on wiki pages that need to be fixed [12].

Separately, Tim built the ParserMigration extension to let editors evaluate their fixes to pages [13]. You can enable this in your editing preferences or replace '&action=edit' in your url bar with '&action=parsermigration-edit'.

7. What editors have to do
--------------------------
The part that you have all been waiting for! Please see the 'What editors will need to do' section on the Tidy replacement FAQ [1]. We have added simplified instructions, so that even community members who do not consider themselves "techies" can still learn about ways to fix pages. We'll keep that section up to date based on feedback and questions. But since it is a wiki, please also edit and tweak as required to make the text useful for yourselves!

This is a first call for fixes and it is about the problems defined as "high priority". We'll issue other calls in the future for any other necessary Tidy fixups.

Caveats:
* As noted on that page, the linter categories don't cover all the possible sources of rendering differences. For example, there is still T157418 [14] left to address. For those who have an opinion about this, please chime in on that task. We are still evaluating the best solution for this without adding more cruft to wikitext behavior or kicking the cleanup can down the road.
* As the issues in the identified linter categories are fixed, we might be better able to isolate other issues that need addressing.

8. So, when will Tidy actually be replaced?
-------------------------------------------
We really would like to get Tidy removed from the cluster by June 2018 at the latest (or sooner if possible), and your assistance and prompt attention to these markup issues would be very helpful. We will do this in a phased manner on different wikis rather than all at once on all wikis.

We really want to do this as smoothly as possible without disrupting the work of editors or affecting the rendering of the large corpus of pages on the various wikis. As you might have gathered from the text above, we have built and leveraged a wide variety of tools to assist with this.

9. Monitoring progress
----------------------
In order to monitor progress, we plan to do a weekly (or some such periodic frequency) test run that compares the rendering of pages with Tidy and with RemexHTML on a large sample of pages (in the 50K range) from a large subset of Wikimedia wikis (~50 or so). This will give us a pulse of how fixups are going, and when we might be able to flip the switch on different wikis.

Subramanya (Subbu) Sastry
Parsing Team.

References
----------
0. https://www.mediawiki.org/wiki/Talk:Parsing/Replacing_Tidy/FAQ
1. https://www.mediawiki.org/wiki/Parsing/Replacing_Tidy/FAQ#What_will_editors…
2. https://en.wikipedia.org/wiki/HTML_Tidy
3. https://phabricator.wikimedia.org/tag/tidy/
4. https://phabricator.wikimedia.org/T89331
5. https://github.com/wikimedia/mediawiki-libs-RemexHtml
6. https://phabricator.wikimedia.org/T120345
7. https://github.com/wikimedia/integration-uprightdiff
8. https://github.com/wikimedia/integration-visualdiff
9. https://github.com/wikimedia/mediawiki-services-parsoid-testreduce
10. https://www.mediawiki.org/wiki/Parsing/Replacing_Tidy
11. https://phabricator.wikimedia.org/T48705
12. https://www.mediawiki.org/wiki/Help:Extension:Linter#Goal:_Replacing_Tidy
13. https://www.mediawiki.org/wiki/Help:Extension:Linter#Verifying_fixes_for_th…
14. https://phabricator.wikimedia.org/T157418
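Section 6 mentions that the ParserMigration view can be reached by replacing '&action=edit' in the URL with '&action=parsermigration-edit'. For anyone checking many pages, that rewrite can be scripted. The function below is only a sketch: its name and the example URL in the usage note are hypothetical, and the only detail taken from the post is the value of the action parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_parsermigration_url(edit_url):
    """Swap action=edit for action=parsermigration-edit in a wiki URL.

    Other query parameters are kept in order. They are re-encoded by
    urlencode(), so characters such as ':' in a title come back
    percent-encoded.
    """
    parts = urlsplit(edit_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    rewritten = [
        (key, "parsermigration-edit" if (key, value) == ("action", "edit") else value)
        for key, value in query
    ]
    return urlunsplit(parts._replace(query=urlencode(rewritten)))
```

For example, a hypothetical edit link ending in ?title=Example_page&action=edit comes back ending in ?title=Example_page&action=parsermigration-edit, while URLs without action=edit keep their parameter values unchanged.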

[RfC] [ORES] Do you use reverted model for wikis that have damaging ones?
by Amir Tafreshi 03 Aug '17

Hey,

For patrolling work, ORES usually has two levels of support:
- For basic support we usually provide a model that is called 'reverted' and has less accuracy. It also risks perpetuating editor biases due to the lack of differentiation between reasons that a change may have been reverted.
- For advanced support, we require manual labeling of a large sample of edits, but then we can provide two more models: 'damaging' and 'goodfaith'.

The ORES review tool can only be enabled on wikis where advanced support is available, and most other tools prefer the 'damaging' model over the basic 'reverted' model as well. So, for performance and capacity reasons, we think it makes sense to remove the 'reverted' model from the ORES service for a wiki once the 'damaging' model is made available there.

However, we also want to be careful that this change doesn't disrupt the work of tool developers who make use of the ORES service. If you use the 'reverted' model on a wiki that also has 'damaging', please voice your concerns now. If there is no objection within the next two weeks, we'll begin the process of removing the 'reverted' model for wikis that have the 'damaging' model available.

Related phabricator card: https://phabricator.wikimedia.org/T171059

Best
--
Amir Sarabadani Tafreshi
Software Engineer (contractor)
-------------------------------------
Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
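Tool developers who want to be unaffected by the proposed removal could prefer 'damaging' wherever it exists and fall back to 'reverted' only elsewhere. The sketch below illustrates that selection logic under stated assumptions: the function names and the wiki identifiers in the usage note are hypothetical, and the mapping of wikis to models is supplied by the caller rather than fetched from the ORES service.

```python
def preferred_model(available):
    """Pick 'damaging' if the wiki has it, else 'reverted', else None."""
    for name in ("damaging", "reverted"):
        if name in available:
            return name
    return None


def wikis_losing_reverted(models_by_wiki):
    """List wikis that have both models, i.e. where 'reverted' would go away."""
    return sorted(
        wiki
        for wiki, models in models_by_wiki.items()
        if "damaging" in models and "reverted" in models
    )
```

With a mapping such as {"wiki_a": {"reverted"}, "wiki_b": {"reverted", "damaging", "goodfaith"}}, only wiki_b is reported, and a tool using preferred_model() would already be reading 'damaging' there.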
