[discovery] Ideas around a public release of an ML training set for search