Hello,
Here are this past week's updates from the Discovery department.
== Highlights ==
* Finalized the second BM25 testing analysis and linked to the PDF here. [0]
== Search ==
* Migrated Phan for CirrusSearch to Jenkins. (technical debt) [1] [2]
* Finished writing up, summarizing, and recommending extensive changes to TextCat for language identification. [3] The F0.5 score improved by a mean of just under 5% across the corpora from nine Wikipedias. The two worst-performing corpora, from enwiki and nlwiki, each went up around 10%! All nine now have an F0.5 score above 90%. The next step is to deploy the recommended changes. [4]
* Completed (a round of) refactoring and cleanup of the Special:Search code. [5] [6]
----
The archive of all past updates can be found on MediaWiki.org:
Interested in getting involved? See tasks marked as "Easy" or "Volunteer needed" in Phabricator.
Yours,
Chris Koerner
Community Liaison - Discovery
Wikimedia Foundation