It's more about getting the configuration (CommonSettings, InitialiseSettings, database configuration, etc.) working correctly in beta labs, so that we discover any glitches before making these updates in production. test2wiki is of particular concern because it is a peer node on the production cluster: it shares configuration with every other node on the Wikipedia cluster. A mistake in test2wiki can have serious consequences, so it is better to make any mistake in beta labs first.
Beyond that, I'd really like the ability to set up and tear down multiple Flow pages with interesting content for testing, rather than relying on just http://en.wikipedia.beta.wmflabs.org/wiki/Talk:Flow_QA. (Doing so should also surface any issues with creating and removing Flow pages.)