Hi,
I want to search the comment table's comment_text column for %Unsinn%,
case-insensitively, with this clause:
where comment_text collate utf8mb4_general_ci like '%Unsinn%';
but this gives me the error: COLLATION 'utf8mb4_general_ci' is not valid for
CHARACTER SET 'binary'
How can I get this collation to work?
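For reference, one likely workaround (an untested sketch; it assumes MediaWiki's schema, where comment.comment_text is stored as a binary BLOB column) is to convert the value to utf8mb4 first, so that a text collation can apply:

```sql
-- Untested sketch: CONVERT(... USING utf8mb4) re-interprets the binary
-- value as utf8mb4 text, to which a _ci collation can then be applied.
SELECT *
FROM comment
WHERE CONVERT(comment_text USING utf8mb4)
      COLLATE utf8mb4_general_ci LIKE '%Unsinn%';
```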
Kind regards
Doc Taxon ...
Hi Community Metrics team,
This is your automatic monthly Phabricator statistics mail.
Accounts created in (2024-05): 329
Active Maniphest users (any activity) in (2024-05): 1196
Task authors in (2024-05): 601
Users who have closed tasks in (2024-05): 322
Projects which had at least one task moved from one column to another on
their workboard in (2024-05): 325
Tasks created in (2024-05): 2517
Tasks closed in (2024-05): 2732
Open and stalled tasks in total: 53662
* Only open tasks in total: 52658
* Only stalled tasks in total: 1004
Median age in days of open tasks by priority:
Unbreak now: 3
Needs Triage: 1010
High: 1197
Normal: 2080
Low: 2606
Lowest: 3086
(How long tasks have been open, not how long they have had that priority)
To see the names of the most active task authors:
* Go to https://wikimedia.biterg.io/
* Choose "Phabricator > Overview" from the top bar
* Adjust the time frame in the upper right corner to your needs
* See the author names in the "Submitters" panel
TODO: Numbers which refer to closed tasks might not be correct, as
described in https://phabricator.wikimedia.org/T1003 .
Yours sincerely,
Fab Rick Aytor
(via community_metrics.sh on phab1004 at Sat 01 Jun 2024 12:00:37 AM UTC)
As described on Phabricator, a bug [1] surfaced whereby the "pages-articles"
XML dumps on https://dumps.wikimedia.org/ contain incomplete records.
A possible fix has been identified, and it involves bumping the dump schema
version from version 0.10 to version 0.11 [2], which could be a breaking
change for some.
MORE DETAILS:
Due to the bug that surfaced, a nontrivial number of <text> nodes
representing article text appear empty, like so:
<text bytes="123456789" />
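A dump can be checked for this shape mechanically. The following is an editor's rough sketch, not an official tool; the file name is a placeholder for your own decompressed dump. It counts self-closing <text .../> nodes, which indicate article text missing from the dump:

```shell
# Rough consistency check (editor's sketch): count self-closing
# <text .../> nodes, i.e. revisions whose article text is absent.
# Replace pages-articles.xml with your actual decompressed dump file.
grep -c '<text [^>]*/>' pages-articles.xml
```

A nonzero count on a pre-fix dump would be consistent with the bug described above.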
A potential fix in T365155 [3] has been identified. Assuming further
testing looks good, XML dumps will be kicked off again starting next week
in order to restore the missing records as soon as possible. It will take a
while for new dumps to be generated as it is a compute intensive operation.
More progress will be reported at T365155 and new dumps will eventually
show up on dumps.wikimedia.org .
Although a number of pipelines may not notice the change associated with
the schema bump, if your dump ingestion tooling or use of Special:Export
relies on the specific shape of the XML at version 0.10 (e.g., because of
code generation tools), please examine the differences between version 0.10
and version 0.11. One notable change in version 0.11 is the addition of MCR
[4] fields.
Thank you for your patience while this issue is resolved.
-Adam
[1]
https://phabricator.wikimedia.org/T365501
[2]
https://www.mediawiki.org/xml/export-0.10.xsd
and
https://www.mediawiki.org/xml/export-0.11.xsd
Schema version 0.11 has existed in MediaWiki for over 6 years, but
Wikimedia wikis have been using version 0.10.
[3]
https://phabricator.wikimedia.org/T365155#9851025
and
https://phabricator.wikimedia.org/T365155#9851160
[4]
https://www.mediawiki.org/wiki/Multi-Content_Revisions
Hello all,
The next language community meeting is scheduled in a few weeks, on May 31st
at 16:00 UTC. If you're interested, you can sign up on this wiki page: <
https://www.mediawiki.org/w/index.php?title=Wikimedia_Language_engineering/…
>.
This is a participant-driven meeting, where we share language-specific
updates related to various projects, collectively discuss technical issues
related to language wikis, and work together to find possible solutions.
For example, in the last meeting, the topics included the machine
translation service (MinT) and the languages and models it currently
supports, localization efforts from the Kiwix team, and technical
challenges with numerical sorting in files used on Bengali Wikisource.
Do you have any technical updates related to your project that you would
like to share? Any problems that you would like to bring up for discussion
during the meeting? Do you need interpretation support from English to
another language? Please reach out to me at ssethi(a)wikimedia.org and add
agenda items to the document here: <
https://etherpad.wikimedia.org/p/language-community-meeting-may-2024>.
We look forward to your participation!
Cheers,
Jon, Mary, Oscar, Amir and Srishti
*Srishti Sethi*
Senior Developer Advocate
Wikimedia Foundation <https://wikimediafoundation.org/>
TLDR: fresh-node now defaults to Node.js 20, and this release introduces the experimental "fresh-npm" security feature.
Get started:
https://gerrit.wikimedia.org/g/fresh#fresh-environment
Changelog: https://gerrit.wikimedia.org/g/fresh/+/HEAD/CHANGELOG.md
Commits: https://gerrit.wikimedia.org/r/q/project:fresh+is:merged
Hi all,
Fresh 24.05 is upon us!
*What's new?*
The fresh-node22 command has been introduced by James Forrester, and is now open for early testing. This uses the "releng/node22-test-browser" Docker image that is also available to Jenkins jobs in WMF CI. Standalone libraries and tools are welcome to opt in and switch their CI jobs in Zuul config if they pass under node22.
The default fresh-node command was updated from Node.js 18 to Node.js 20, similarly re-using the same Docker images that we use in WMF CI. These feature the same Debian Linux version, same pre-installed packages, and versions thereof. This makes it as easy as possible to reproduce CI failures locally. Vice versa, if you use Fresh in local development, you're unlikely to encounter failures in CI. You can continue to develop on older versions via the fresh-node18 and fresh-node16 commands. The fresh-node14 command has been removed (unsupported since last year <https://github.com/nodejs/Release#end-of-life-releases>).
This release includes the first contribution to Fresh by Marius Hoch (WMDE), who fixed a bug <https://gerrit.wikimedia.org/r/c/fresh/+/1034847> affecting projects with a space in their working directory name. Thanks Marius!
Finally, this release introduces the experimental "fresh-npm" feature. You can opt-in by cloning the repo and running `bin/fresh-install --secure-npm`. This will shadow the npm command in the shell on your main workstation, and avoids accidentally running potentially insecure scripts outside Fresh. Other npm commands are unaffected. It can be bypassed as-needed by specifying the full path to npm, which is also printed at the end of any fresh-npm help or error message. I previously maintained this under the name "secpm" in a local patch <https://gerrit.wikimedia.org/r/c/fresh/+/675346> since 2021. It has served myself and a handful of others well. I hope it can be useful to others!
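Conceptually, the shadowing works like a shell wrapper placed in front of the real binary. The following is an editor's sketch of that idea only; the actual fresh-npm implementation installed by bin/fresh-install may differ, and FRESH_CONTAINER is a hypothetical marker variable used purely for illustration:

```shell
# Editor's sketch of an npm "shadow" wrapper (illustration only; the real
# fresh-npm implementation may differ). FRESH_CONTAINER is a hypothetical
# marker that would be set inside a Fresh environment.
npm() {
  if [ -z "${FRESH_CONTAINER:-}" ]; then
    # Outside Fresh: refuse to run, so scripts are never executed
    # accidentally on the host workstation.
    echo "npm is shadowed outside Fresh; use the full path to bypass." >&2
    return 1
  fi
  # Inside Fresh: fall through to the real npm binary.
  command npm "$@"
}
```

Inside a container (where the marker is set) the wrapper falls through to the real npm; outside, it refuses to run, and specifying the full path to the binary bypasses it, matching the behavior described above.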
To report issues or browse tasks, find us on Phabricator at https://phabricator.wikimedia.org/tag/fresh/.
*What is Fresh?*
Fresh is a fast way to launch isolated environments from your terminal. These can be used to work more securely and responsibly <https://timotijhof.net/posts/2019/protect-yourself-from-npm/> with Node.js-based developer tools, especially those installed from npm such as ESLint, QUnit, Grunt, Webdriver, and more. Example guide: https://www.mediawiki.org/wiki/Manual:JavaScript_unit_testing. Get started: https://gerrit.wikimedia.org/g/fresh#fresh-environment
--
Timo Tijhof,
Principal Engineer,
Wikimedia Foundation.
I've been trying to clean up TitleKey a bit (hackathon project).
One issue was the deprecation of PrefixSearchBackend, the associated hook,
and the class TitlePrefixSearch in 1.41. Apparently this was already meant
to be deprecated since 1.27, but the deprecation was never properly carried
out.
However, core's SearchEngine itself still uses all of this, and the only
proper way to override it is by reimplementing the completionSearchBackend
method of your own SearchEngine backend, which means that you have to
provide all other search functionality via that alternative backend as
well. This is also how CirrusSearch does it.
However, for TitleKey, we essentially want to bolt this on top of an
existing backend and that's not a simple job any longer. I'm now ending up
with this:
https://gerrit.wikimedia.org/r/c/mediawiki/extensions/TitleKey/+/1036312
Three new subclasses for the three core search engines that the system admin
installing TitleKey has to choose from. Not really convenient.
Issues I see with the deprecation:
- There is no alternative hook to modify prefix search results.
- Does it even make sense to deprecate the class TitlePrefixSearch
(without replacement) if it is implementing the same logic for all core
search engine backends right now via the abstract SearchEngine class?
- Another nice option might be to make TitlePrefixSearch a service,
similar to TitleMatcher? That would make the code easier to replace.
- Should core replace TitlePrefixSearch with a search based on a near match
(fuzzy match) using the TitleMatcher service, perhaps?
Additionally:
- There is no concept of near-match title matching in core search
engines (CirrusSearch can do this, and it does so via the completion
search, but that seems a bit of a hack).
- There is also StringPrefixSearch, deprecated de facto since 1.27 but
only officially since 1.41, which is used exclusively by
Extension:MassEditRegex and seems to be in the same boat.
- There are more questions about the future and implementation of the
TitleKey extension itself, but I think those are better left for another
time.
So my question is: where do we want to take prefix/completion search in
core? Do we want to address this, and if so, what are your suggestions?
DJ
Hello everyone,
Wikimedia is gearing up to apply as a mentoring organization for Google
Summer of Code 2024 <
https://www.mediawiki.org/wiki/Google_Summer_of_Code/2024>[1] and Outreachy
Round 28 <https://www.mediawiki.org/wiki/Outreachy/Round_28> [2].
Currently, we're crafting a list of exciting project ideas for the
application. If you have any suggestions for projects, whether coding or
non-coding (design, documentation, translation, outreach, research), please
share them by February 5th via this Phabricator task: <
https://phabricator.wikimedia.org/T354734> [3]. Note that for non-coding
projects eligible for Outreachy, slots are limited and will be allocated to
mentors on a first-come, first-served basis.
Timeline
In your role as a mentor, your involvement spans the application period for
both programs, taking place from March to April. During this time, you'll
guide candidates in making small contributions to your project and address
any project-related queries they may have. As the application period
concludes, you'll further intensify your collaboration with accepted
candidates throughout the coding period, which extends from May to August.
Your support and guidance are crucial to their success in the program.
Guidelines for Crafting Project Proposals:
-
Follow this task description template when you propose a project in
Phabricator: <
https://phabricator.wikimedia.org/tag/outreach-programs-projects> [4].
You can also use this workboard to pick an idea if you don't have one
already. Add the #Google-Summer-of-Code (2024) or #Outreachy (Round 28) tag.
-
A project should take an experienced developer ~15 days and a newcomer
~3 months to complete.
-
Each project should have at least two mentors, including one with a
technical background.
-
Ideally, the project has no tight deadlines, a moderate learning curve,
and fewer dependencies on Wikimedia's core infrastructure. Projects
addressing the needs of a language community are most welcome.
*Learn more about the roles and responsibilities of Mentors for both
programs:*
-
Outreachy: <https://www.mediawiki.org/wiki/Outreachy/Mentors> [5]
-
Google Summer of Code: <
https://www.mediawiki.org/wiki/Google_Summer_of_Code/Mentors> [6]
Thank you,
Links:
[1] https://www.mediawiki.org/wiki/Google_Summer_of_Code/2024
[2] https://www.mediawiki.org/wiki/Outreachy/Round_28
[3] https://phabricator.wikimedia.org/T354734
[4] https://phabricator.wikimedia.org/tag/outreach-programs-projects
[5] https://www.mediawiki.org/wiki/Outreachy/Mentors
[6] https://www.mediawiki.org/wiki/Google_Summer_of_Code/Mentors
--
*Onyinyechi Onifade *
Technical Community Program Manager
Wikimedia Foundation <https://wikimediafoundation.org/>
Hello everyone,
I am Pranjal Rajput (PR4NJ41), an undergraduate pursuing a Bachelor of
Technology (B.Tech) at the Indian Institute of Technology (BHU) Varanasi. I
will be contributing to improving the course copying feature across servers
on the Wikimedia Dashboard (
https://summerofcode.withgoogle.com/programs/2024/projects/t8CtR8IP) under
GSoC '24. I will be working under the guidance of my mentors Sage
Ross (ragesoss) and Shashwat (TheTrio). I am looking forward to contributing,
and to learning and growing with the Wikimedia community.
Regards,
Pranjal