I think the problem is that the current setup is too slow at rendering low zooms, with some single metatiles taking over a minute to render. That is clearly not feasible to render on the fly. Normally that would be tolerable, because there are far fewer low zoom tiles than high zoom tiles, but with the 200+ styles of cassini the ratio of low zoom tiles to high zoom tiles is skewed in the wrong direction. Once you get down to zoom level 12 or so, it starts to be just about fast enough to render on the fly.
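To put a rough number on the low zoom load, here is a back-of-the-envelope sketch in Python. It assumes 8x8 metatiles (the usual mod_tile/renderd default, but I have not checked what cassini uses), treats zooms 0 to 11 as "low", and takes 200 as the style count. The function names are made up for illustration.

```python
# Rough count of metatiles that would have to be pre-rendered at low zooms.
# Assumptions (not confirmed for cassini): 8x8 metatiles, zooms 0..11, 200 styles.

METATILE_SIZE = 8  # tiles per metatile edge (assumed)


def metatiles_at_zoom(zoom, size=METATILE_SIZE):
    """Number of metatiles covering the whole world at one zoom level."""
    tiles_per_edge = 2 ** zoom
    # Below zoom 3 the whole world fits into a single metatile.
    metatiles_per_edge = max(1, tiles_per_edge // size)
    return metatiles_per_edge * metatiles_per_edge


def total_metatiles(zooms, styles=1):
    """Total metatiles over a range of zoom levels, multiplied by style count."""
    return styles * sum(metatiles_at_zoom(z) for z in zooms)


if __name__ == "__main__":
    low_zooms = range(0, 12)  # zoom 0 through 11
    print("per style: ", total_metatiles(low_zooms))
    print("200 styles:", total_metatiles(low_zooms, styles=200))
```

Most of that total sits at zooms 10 and 11, which are probably not the minute-long ones, so this is an upper bound on the count rather than an estimate of render time.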
Maybe adding an index on place and capital would already help. How are the indexes set up on the main OSM tile server?