Dagbani Wikipedia will be our first wiki for Wikifunctions integration

As we wrap up this quarter’s work and begin planning for the next, I want to discuss the progress of our biggest initiative this fiscal year: integrating Wikifunctions with Wikipedia articles. Our focus languages for the project are Bangla, Igbo, Hausa, Dagbani, and Malayalam. We have chosen Dagbani as the first Wikipedia for this integration.

This Quarter, we have focused on building a design prototype, consulting with internal teams to make key decisions, and drawing valuable insights from Wikidata’s initial integration experience. Now, we're ready to apply these learnings to Dagbani Wikipedia, creating a design that not only fits seamlessly with this wiki but will also scale effectively to larger ones in the future.

Involving user groups in shaping a new venture like this is essential for gaining insight into users' needs and pain points. Your feedback ensures that the product meets real expectations, enhancing usability and relevance. Early input helps identify issues before launch, reducing costly redesigns. Ultimately, involving users fosters ownership, driving product adoption and success.

We want to adopt this product philosophy in our integration work. We will be reaching out to our Dagbani Wikipedia community in the coming weeks to form a working group that can help us deliver this project to the Dagbani Wikipedia in a meaningful way. We want to form a diverse team of new editors, experienced editors, and passionate readers. We are aiming for a small group of 3–5 people. Our idea is to meet with this group regularly, involving them in design prototype reviews, local demo reviews, and exchanges of product ideas and vision, in turn building our confidence in the usefulness and readiness of our solutions.

Site reliability issues

As many of you will know, we've been having some stability challenges with the site for the past few days, caused in part by a surge of Web crawler traffic overloading the servers set aside for running Wikifunctions. This has taken several forms, including the whole site appearing to be down (T374318) and pages that normally work breaking intermittently (T374305 and T374241). We have put a few mitigations in place to try to reduce the load generated by non-human users. This has included temporarily banning Anthropic's ClaudeBot via robots.txt, and replacing the standard, Wikimedia-wide site reliability monitoring suite with a custom one that is simpler, more relevant, and lighter on the servers (T374442). However, these have so far had limited effect, and we continue to review and try to improve the situation. Our apologies for the disruptions.
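For those curious, the robots.txt change is roughly of the following shape (a simplified sketch, not the exact rules deployed on Wikifunctions.org):

```
# Simplified sketch: ask Anthropic's crawler not to fetch any pages on the site.
User-agent: ClaudeBot
Disallow: /
```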

At the same time, we are also noticing a novel issue with validation, and are currently simplifying the validation workflows. This might lead to problems, as some objects might go unvalidated. Please let us know if you see strange new errors, particularly error messages missing where they should appear.

Recent changes in the software

Disconnected from the above site issues, we were alerted to a bug in last week's code that meant you couldn't select instances of Types in the selector; we made a quick fix for this, with a test to avoid future regressions, and back-ported it into production on Monday (T374199). We are thankful to GrounderUK and the other community members who noticed this, and sorry for the disruption.

In a further breakage, all of our end-to-end API tests that use the Beta Cluster unfortunately stopped working last week, so we have temporarily disabled them and are relying on manual testing alone for now (T374242).

One of the big parts of our Quarterly work is preparing for the "Wikipedia integration", in which you will be able to embed Wikifunctions call results in wikitext (T261472). We landed some improvements there, in particular changes to separate the concerns between the 'client' code, which runs on the Wikipedias, and the 'repo' code, which runs on Wikifunctions.org. More of this work should land soon, including a demonstration.

Another part of our Quarterly work is preparing to let Function calls reference Wikidata items (T282926). We've made some changes to our conceptual model of references, which we expect to be a temporary work-around for the next few months, so that Wikidata references can be used ahead of any wider Type calculus reforms. Concretely, this means that our code for checking whether something is a reference will, at least for now, stop recognising the forms "Q1234" or "L1234", and recognise only "Z1234" or "Z1234K1" (T373859). The back-end code to access these and formalise them into Types (T370072) continues, and we hope to demonstrate it soon.
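Purely as an illustration of the behaviour described above (this is not the actual Wikifunctions code, and the names are made up), the new check amounts to something like:

```python
import re

# Sketch of the T373859 behaviour: only ZObject references ("Z1234") and
# key references ("Z1234K1") are treated as references; Wikidata-style IDs
# ("Q1234", "L1234") no longer are.
REFERENCE_PATTERN = re.compile(r'^Z[1-9][0-9]*(K[1-9][0-9]*)?$')

def looks_like_reference(value: str) -> bool:
    return REFERENCE_PATTERN.match(value) is not None

assert looks_like_reference('Z1234')        # ZObject reference
assert looks_like_reference('Z1234K1')      # key reference
assert not looks_like_reference('Q1234')    # Wikidata item ID: not recognised
assert not looks_like_reference('L1234')    # Wikidata lexeme ID: not recognised
```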

We made a handful of UX improvements that go out this week. When making changes via the About control (rather than the whole-page editing flow), we fixed the "Cancel" button in the publish dialog so that it returns you to the editor, rather than throwing away all your changes – sorry for that (T360062). When a Z6/String value is very long, we now ask your browser to wrap the text rather than have it overflow (T373987). We fixed the width of the Function editor on narrow screens (below 500px), such as mobile phones (T366675). We updated the object selector to be smarter about restricting which Functions it searches for in contexts where we know the expected 'shape' (T372995).

We also made some general technical improvements. We have replaced our old temporary Tooltip component with Codex's proper one, now that it exists (T298040). This nearly completes our replacement of ad hoc components with upstream ones; the only one left is the Table used on Function pages to list Implementations and Test cases (T373197). We're hugely thankful to the Design System team for their work developing the Codex library to the point where our ad hoc versions are no longer needed. As part of our long-running migration from strings to references for Z61/Programming language objects (T287153), we have now dropped support for the string form in the UX layer. All existing content on Wikifunctions.org was migrated back in May/June, so this should not have any disruptive effect.

We, along with all Wikimedia-deployed code, are now using the latest version of the Codex UX library, v1.12.0, as of this week. We found one change that broke how we were using the "lookup" component, which we have worked around (T374248) ahead of an upstream fix (T374246); we believe that there should be no further user-visible changes on Wikifunctions, so please comment on the Project chat or file a Phabricator task if you spot an issue.

Function of the Week: count substrings

Recently, LLMs briefly made the news because they failed at the question “how often does the letter ‘r’ appear in ‘strawberry’?” (You can easily find coverage of this on various fora and news sites.)

Wikifunctions has no such problem: with the function “count substrings” (Z14450) we can easily ask how often the substring ‘r’ appears in the string ‘strawberry’, and, unsurprisingly, it returns 3.

The function has two implementations, one in JavaScript and one in Python:

The JavaScript implementation is a good example of seemingly simple functionality with a surprisingly complex implementation. And yet, the suggested implementation is susceptible to errors: since the second argument gets turned into a regular expression, some symbols (regular-expression metacharacters such as ‘.’) mess up the search. I added a test for that case, the second to last in the list below. Fortunately, only the Python implementation is connected, so the error in the JavaScript implementation is not actually used – the approval system worked as intended.
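To illustrate the pitfall (in Python for brevity; this is a sketch of the issue, not the on-wiki implementations themselves), compare a plain count with one that treats the second argument as a regular expression:

```python
import re

def count_plain(haystack: str, needle: str) -> int:
    # Counts non-overlapping occurrences without regular expressions.
    return haystack.count(needle)

def count_via_regex(haystack: str, needle: str) -> int:
    # Mirrors the problem described above: the needle is used directly as a
    # regular expression, so metacharacters change its meaning.
    return len(re.findall(needle, haystack))

print(count_plain('strawberry', 'r'))         # 3
print(count_via_regex('strawberry', 'r'))     # 3: same result, no metacharacters

print(count_plain('a.c abc a.c', 'a.c'))      # 2: the literal substring appears twice
print(count_via_regex('a.c abc a.c', 'a.c'))  # 3: '.' matches any character, so 'abc' also counts
# Escaping the needle first (e.g. with re.escape) would fix the regex-based version.
```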

The function offers six tests:

Since I added the last two tests while writing this entry, they are not connected yet.

The third test here is particularly interesting, showing how such a seemingly simple function can have very different interpretations: one could argue that “aaaaa” contains the string “aa” four times, at positions 1, 2, 3, and 4, but the function counts non-overlapping occurrences, in a so-called greedy way: “aa” only fits twice into “aaaaa”. This is why tests are so important: they let us show and agree on the exact meaning of the function.
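As a small Python sketch of the two interpretations (again just an illustration, not the on-wiki code):

```python
# Non-overlapping counting, as the function and its tests expect:
print('aaaaa'.count('aa'))  # 2

# The alternative, overlapping interpretation counts every starting position:
print(sum('aaaaa'.startswith('aa', i) for i in range(len('aaaaa'))))  # 4
```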

It would be great to see more tests with other scripts, such as Arabic or Chinese. It is nice to see Cyrillic represented in one of the tests.

It is not surprising that current LLMs struggle with this question, due to the way they work. We firmly believe that a good future architecture for a question-answering system doesn’t only use the model itself, but also a large document store, such as the Web, a knowledge base, such as Wikidata, and a repository of functions, such as Wikifunctions. Each of these drastically expands the kinds of questions the system will be able to answer with high accuracy and confidence.