We had to evaluate data formats for event streaming systems as part of WMF's Modern Event Platform program. The event streaming world mostly uses Avro, but there is plenty of Protobuf around too.
We ultimately decided to use JSON with JSONSchema as our transport format. While it lacks some advantages of the binary options, JSON is more ubiquitous and easier to work with in a distributed, open-source-focused developer community. (You don't need the schema to read the data.)
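To make the "you don't need the schema to read the data" point concrete, here is a minimal sketch in Python using only the standard library. The event, its field names, and the check() helper are all hypothetical and are not taken from our actual schemas or tooling. check() only understands the "type", "required", and "properties" keywords, so treat it as a toy stand-in for a real JSON Schema validator.

```python
import json

# Hypothetical event as it might appear on the wire; field names are illustrative.
raw = '{"meta": {"stream": "page.view", "dt": "2019-01-01T00:00:00Z"}, "page_title": "Main_Page"}'

# Reading needs no schema and no generated classes: the field names travel with the data.
event = json.loads(raw)
print(event["page_title"])

# A hypothetical schema for that event, written with JSON Schema keywords.
schema = {
    "type": "object",
    "required": ["meta", "page_title"],
    "properties": {
        "meta": {
            "type": "object",
            "required": ["stream", "dt"],
            "properties": {
                "stream": {"type": "string"},
                "dt": {"type": "string"},
            },
        },
        "page_title": {"type": "string"},
    },
}

# Only the two JSON Schema types this sketch needs.
_TYPES = {"object": dict, "string": str}


def check(instance, schema):
    """Toy validator: handles only 'type', 'required' and 'properties'.

    Returns a list of error strings; an empty list means the instance passed.
    """
    expected = schema.get("type")
    if expected and not isinstance(instance, _TYPES[expected]):
        return [f"expected {expected}, got {type(instance).__name__}"]
    errors = []
    if isinstance(instance, dict):
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"missing required field: {name}")
        for name, subschema in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(f"{name}: {e}" for e in check(instance[name], subschema))
    return errors


# The schema is used where it matters: checking events against the contract.
print(check(event, schema))
print(check({"page_title": 5}, schema))
```

The consumer side works with json.loads alone; the schema only comes into play when you want to enforce the contract, for example before producing an event.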
Our choice of JSONSchema and JSON is mostly about canonical data schemas for in-flight data transport and protocols. For data at rest, it might make more sense to serialize into something completely different (we use Parquet in Hadoop for most data there). You can read some WIP documentation about how we use JSONSchema here.