Hi,
We're currently in the process of upgrading the MediaWiki servers to Debian Buster and expect a performance regression to come with it.
The cause appears to be better Spectre[1] mitigations in the Buster 4.19 kernel, which we can't disable. Most of the effect is seen in code that ends up making syscalls, for example through PHP functions like filemtime, file_get_contents, etc.
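For anyone who wants a rough feel for per-call syscall overhead on a given host, here is a minimal sketch, not what was used in the investigation. It is written in Python rather than PHP, and the function names (time_per_call, read_file, run_benchmark) are made up for illustration. It times a stat-style call and an open/read/close cycle on a temporary file, roughly analogous to filemtime and file_get_contents.

```python
import os
import tempfile
import time


def time_per_call(func, iterations=10000):
    """Return the mean wall-clock time of func() in microseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        func()
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1e6


def read_file(path):
    """Open, read and close a file (roughly what file_get_contents does)."""
    with open(path, "rb") as handle:
        return handle.read()


def run_benchmark(iterations=10000):
    """Time a stat call and a full file read against a 1 KiB temp file."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 1024)
        path = tmp.name
    try:
        return {
            # os.path.getmtime issues a stat syscall, like PHP's filemtime.
            "stat": time_per_call(lambda: os.path.getmtime(path), iterations),
            "open_read": time_per_call(lambda: read_file(path), iterations),
        }
    finally:
        os.unlink(path)


if __name__ == "__main__":
    for name, micros in run_benchmark().items():
        print(f"{name}: {micros:.2f} us per call")
```

Running the same script on a Stretch host and a Buster host and comparing the numbers would give only a crude indication; it ignores PHP's own stat cache and real request patterns, so the flamegraphs remain the better guide.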
I posted some numbers and charts on the Phabricator investigation ticket[2]. For normal requests it looks like ~5% worse for p50/p75 and ~13% worse for p95/p99. API requests look much worse, at 10% for p50 and 22% for p75.
What now? We're going to continue with the upgrade as planned, but we also need help to try and make some performance improvements to reduce the impact of the regression.
The PHP profiling flamegraphs[3] are a great tool for identifying potentially slow spots. We now also have flamegraphs that contain only Buster requests. I created a set of differential flamegraphs[4] that compare Stretch vs Buster so you can see which specific areas slowed down.
You can also use WikimediaDebug/XHGui[5] to profile a specific request. mwdebug1001/mwdebug1002 are Stretch and mwdebug1003 is Buster.
If you have questions or suggestions, please let us know. Thanks to everyone who helped with the investigation and those who've started working on improvements already.
[1] https://en.wikipedia.org/wiki/Spectre_(security_vulnerability)
[2] https://phabricator.wikimedia.org/T273312#6802330
[3] https://performance.wikimedia.org/php-profiling/
[4] https://people.wikimedia.org/~legoktm/T273312/data/clean/images/flamegraphs/
[5] https://wikitech.wikimedia.org/wiki/WikimediaDebug#Request_profiling
-- Kunal