Yet it shouldn't be too hard to notice a 20% slowdown with small usability tests/focus groups. It could be interesting to test a couple of existing skins and a couple of big interface changes in the works (such as Special:RecentChanges and Special:Search) to see if there is any such big gap anywhere.
For the case of Recent Changes, a before/after comparison does not seem to suggest that the changes involved going flat. In the previous state, the filtering UI was a box with a flat list of links and text, while the new UI uses contrast and grouping to help users identify the different elements.
If there is any particular aspect related to flatness that anyone thinks we need to pay special attention to, feel free to share it and we can incorporate it in future research. We have been doing different rounds of research to test initial concepts, iterated ideas and the version available on beta. The results suggest that, with the new approach, users can more clearly identify the current state of the filters and how to manipulate them.
In general, I think that labels such as "flat design" combine several different aspects, which makes it hard to make broad statements like flat design being good or bad for all contexts. Talking about the impact of specific choices on the clarity of affordances, contrast of elements, layout approaches, etc. makes more sense to me. For example, the Nielsen/Norman article criticizes both skeuomorphism (for resulting in "clunky interfaces") and flat design (for the loss of clickability signifiers), but recommends what they call "flat design 2.0" for incorporating signifiers based on our intuition of physics, as Google's Material Design does:
Early pseudo-3D GUIs and Steve-Jobs-esque skeuomorphism often produced heavy, clunky interfaces. Scaling back from those excesses is good for usability. But removing visual distinctions to produce fully flat designs with no signifiers can be an equally bad extreme. Flat 2.0 provides an opportunity for compromise — visual simplicity without sacrificing signifiers.