Does this throughput rate match what you see on the database?
It's hard to tell, because I can't infer how many events per day are generated based on what you're showing me. It depends on the granularity of the curve's rendering. Even if I knew what the numbers going up to 6.2 on the left represent, it still wouldn't tell me how many per hour/per day are recorded there.
Is there a larger number of events that might not be validating for other wikis versus China?
Are you talking about syntactically incorrect events that don't pass the schema check?
Wouldn't this happen if your product is "more used" in China than elsewhere?
Sure, but that doesn't really matter, because global trends would evolve slowly. The point of tracking the network performance statistics worldwide, despite their very uneven distribution, is to immediately spot when something very wrong happens. We also track the same stats per wiki, and per country (albeit not on a timeline at this point for the per-country ones). The bottom line is that the per-country share of recorded EL events shouldn't vary wildly like that.
Logs are at: stat1002:/a/eventlogging/
gilles@stat1003:~/20150120$ cat all-events.log-20150120 | grep "\"clientValidated\": true" | grep -c "\"schema\": \"MultimediaViewerNetworkPerformance\""
330061
And in the DB, across the tables that have "active" versions, I get a total of 31 + 327867 = 327898 events recorded.
That's a difference of 2000ish events, which might not be a real difference, since the date cutoff of the log file probably differs from the one implied by the timestamps contained in its entries.
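To take the cutoff question out of the comparison, the log entries could be bucketed by their own timestamp instead of by the file they landed in. A rough sketch of that idea follows. It assumes each log line is a single JSON object and that the event time is stored in a "timestamp" field as Unix epoch seconds; that field name and format are assumptions on my part (only "schema" and "clientValidated" are known from the grep commands), and count_events_by_day is a hypothetical helper, not something that exists in EventLogging.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def count_events_by_day(lines, schema):
    """Count client-validated events of one schema, grouped by UTC day.

    Assumption: every line is one JSON object, and the event time lives in
    a "timestamp" field holding Unix epoch seconds.
    """
    counts = Counter()
    for line in lines:
        try:
            event = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict):
            continue
        if event.get("schema") != schema:
            continue
        if event.get("clientValidated") is not True:
            continue
        ts = event.get("timestamp")  # assumed field name and unit
        if not isinstance(ts, (int, float)):
            continue
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        counts[day] += 1
    return counts


# Possible usage against one of the daily files:
# with open("all-events.log-20150120") as handle:
#     print(count_events_by_day(handle, "MultimediaViewerNetworkPerformance"))
```

The per-day totals from this could then be compared with a DB query filtered on the same UTC day, so both sides use the same cutoff.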
Now, looking at an earlier date, when things seemed to be stable in terms of country-based sampling:
gilles@stat1003:~/20150103$ cat all-events.log-20150103 | grep "\"clientValidated\": true" | grep -c "\"schema\": \"MultimediaViewerNetworkPerformance\""
287712
And in the DB, a total of 5 + 289461 = 289466
So far it does look like everything is making it to the DB. I'll keep investigating tomorrow.
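For reference, the arithmetic behind the two comparisons above can be written out as a small script. The counts are the ones quoted in this comment; relative_gap is just a hypothetical helper name for illustration.

```python
def relative_gap(log_count, db_count):
    """Difference between log and DB counts, as a fraction of the log count."""
    return (log_count - db_count) / log_count


# (log file date, count from grep, total from the DB tables)
comparisons = [
    ("20150120", 330061, 31 + 327867),
    ("20150103", 287712, 5 + 289461),
]

for day, log_count, db_count in comparisons:
    gap = log_count - db_count
    print(day, gap, f"{relative_gap(log_count, db_count):.2%}")
```

Both gaps come out below 1% of the daily total, and they point in opposite directions (the log has more on the 20th, the DB has more on the 3rd), which fits the date-cutoff explanation better than a systematic loss of events.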