Howdy, here's the weekly update from the Search Platform team.
As always, feedback and questions welcome.
== Discussions ==
=== Search ===
* After lots of talk about stemmers getting committed and plugins getting deployed, the Slovak-language wikis have finally been *reindexed*, and stemming [0] is now happening on the Slovak wikis! [1]
=== Search: Time Machine Edition ===
A few things from May that got missed:
* Trey wrote up some potential applications of natural language processing (NLP) to on-wiki search [2]. We're still going through them to pick out a couple that we'll turn into projects, probably next quarter. Right now, spelling correction and entity extraction are high on the list, but more questions, comments, and suggestions are welcome.
* Erik pulled 90 days' worth of regular expression (regex) searches across all wikis, and Trey did a quick survey of the most common patterns. [3] There are a lot more regex searches than we thought (5.6 million in 90 days!), and three apparently automated processes (bots, apps, or tools of some kind) are responsible for more than 90% of them.
[0] https://en.wikipedia.org/wiki/Stemming
[1] https://phabricator.wikimedia.org/T190815
[2] https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Potential_Application...
[3] https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Survey_of_Regular_Exp...
---
Subscribe to receive on-wiki (or opt-in email) notifications of the Discovery weekly update.
https://www.mediawiki.org/wiki/Newsletter:Discovery_Weekly
The archive of all past updates can be found on MediaWiki.org:
https://www.mediawiki.org/wiki/Discovery/Status_updates
Interested in getting involved? See tasks marked as "Easy" or "Volunteer needed" in Phabricator.
[1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R
[2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R
Yours,
Chris Koerner
Community Liaison
Wikimedia Foundation