Erik, Trey, David, Kevin, and I met this morning to discuss how we're going to handle data collection for the upcoming TextCat test [1]. A big problem in this particular case is that the system wasn't designed/engineered in a way that's conducive for cross-wiki logging / session tracking. And recently we even lost the ability to use the referrer info to see which page the user came from when visiting another wiki page when going between wikis. (I was told this was done for user privacy reasons.)
Erik said he had recently implemented a click event in the TestSearchSatisfaction2 schema that we might be able to hook into to measure clickthrough rate for users who are eligible for TextCat language detection & get shown results in the language their non-English query probably is written in. Whether we use this and how much we rely on this particular method of measuring whether TextCat is successful (beyond just measuring how it impacts the zero results rate) depends on the validation [2] of the click events and how they compare to page visit events (which cannot be fired in an interwiki context).
We also discussed an alternative approach which uses web requests with the caveat being that if a user is selected for the test once, they'll be selected every time. So if a particular IP+UA combination is part of the test and performs 2 million searches (as is sometimes the case), then we'll have to do some very careful filtering which will also exclude some completely valid use cases (a computer lab in a school or a country with only 2 public IP addresses). But we're shooting for being able to use TestSearchSatisfaction2 :)
[1] https://phabricator.wikimedia.org/T121542 [2] https://phabricator.wikimedia.org/T132706