I have done some testing of the performance of Lua-based templates as
deployed on enwiki. The analysis is summarized at:
https://en.wikipedia.org/wiki/User:Dragons_flight/Lua_performance

The bottom line is that Lua is fast, and often much faster than the
template coding it replaces.

For the important case of citation templates, one can anticipate
seeing about an 80% reduction in render time once Module:Citation/CS1
is deployed. In practice, 300 citations can then be
processed in about 3.5 seconds rather than 18 seconds. Such an
improvement should make a meaningful difference for many of
Wikipedia's complex pages.
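
As a quick sanity check on those numbers, here is a small Python sketch
that works out the per-citation cost and the percentage reduction. The
only inputs are the figures quoted above (300 citations, about 18 s
before, about 3.5 s after); the function and variable names are my own
and are purely illustrative.

```python
# Figures quoted in the post: 300 citations, ~18 s with the old
# template code and ~3.5 s with the Lua module.
CITATIONS = 300
OLD_TOTAL_S = 18.0
LUA_TOTAL_S = 3.5


def per_citation_ms(total_s, count):
    """Average render time per citation, in milliseconds."""
    return total_s * 1000.0 / count


def reduction_percent(old_s, new_s):
    """Relative reduction in render time, as a percentage."""
    return (old_s - new_s) / old_s * 100.0


old_ms = per_citation_ms(OLD_TOTAL_S, CITATIONS)
lua_ms = per_citation_ms(LUA_TOTAL_S, CITATIONS)
reduction = reduction_percent(OLD_TOTAL_S, LUA_TOTAL_S)

print(f"old templates: {old_ms:.1f} ms per citation")
print(f"Lua module:    {lua_ms:.1f} ms per citation")
print(f"reduction:     {reduction:.1f}%")
```

That works out to roughly 60 ms per citation before and a bit under
12 ms after, which is where the "about 80%" figure comes from.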
One unexpected detail that came out of my testing is that the overhead
per #invoke call is about 4.5 milliseconds, which is fairly large once
a single page has several hundred calls.
For the citation module, this overhead is about 40% of
the run time. For some of the simpler number formatting and string
manipulation Lua modules, the overhead can be 75-90% of the run time.
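
To show how the overhead percentages relate to the 4.5 ms figure, here
is a rough Python sketch. It assumes the overhead is a fixed cost per
#invoke call and uses only the numbers given in this message; the
per-call totals for the simpler modules are back-calculated from the
75-90% range, not measured separately, and the helper names are
hypothetical.

```python
INVOKE_OVERHEAD_MS = 4.5  # per-call #invoke overhead reported in the post


def overhead_share(total_ms_per_call, overhead_ms=INVOKE_OVERHEAD_MS):
    """Fraction of one call's run time spent on #invoke overhead."""
    return overhead_ms / total_ms_per_call


def implied_total_ms(share, overhead_ms=INVOKE_OVERHEAD_MS):
    """Total per-call time implied by a given overhead fraction."""
    return overhead_ms / share


# Citation module: 300 citations in about 3.5 s (figures from the post).
citation_ms = 3.5 * 1000.0 / 300
citation_share = overhead_share(citation_ms)

# Cumulative overhead for a page with 300 #invoke calls, in seconds.
page_overhead_s = 300 * INVOKE_OVERHEAD_MS / 1000.0

# Simple modules: per-call totals implied by the 75-90% overhead range.
simple_fast_ms = implied_total_ms(0.90)
simple_slow_ms = implied_total_ms(0.75)

print(f"citation module overhead share: {citation_share:.0%}")
print(f"overhead for 300 calls: {page_overhead_s:.2f} s")
print(f"simple modules: {simple_fast_ms:.1f} to {simple_slow_ms:.1f} ms per call")
```

In other words, under that assumption the simpler modules would be
spending only around 0.5 to 1.5 ms per call on actual work, with the
rest going to launching the #invoke instance.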
I don't know if it is possible, but it may be worth looking to see if
there are ways to use caching or other techniques to reduce the
overhead associated with launching each #invoke instance.
-Robert Rohde