On 21/09/12 21:37, Alex Brollo wrote:
> I too use sometimes "large" switches (some hundred) and I'm far from happy about. For larger switches, I use nested switches, but I find very difficult to compare performance of nested switches (i.e.: a 1000 elements switch can be nested in three switches of 10 elements) against single global switches. I imagine that there's a "performance function" changing the number of switch level and number of switch elements, but I presume that it would be difficult to calculate; can someone explore the matter by tests?
I suppose a nested switch like:
{{#switch: {{{1}}} | 0 = {{#switch: {{{2}}} | 0 = zero | 1 = one }} | 1 = {{#switch: {{{2}}} | 0 = two | 1 = three }} }}
might give you a performance advantage over one of the form:
{{#switch: {{{1}}}{{{2}}} | 00 = zero | 01 = one | 10 = two | 11 = three }}
But it has no significant impact on memory usage, which was the subject of my initial post, and the performance advantage would have to compete with the overhead of using padleft etc. to split the input arguments.
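As a rough illustration of the comparison being asked about, here is a minimal Python sketch that counts case comparisons for the flat and nested forms. It assumes a switch is evaluated as a linear scan over its cases, which is a simplification and not a description of the actual parser; the function names (switch_lookup, build_flat, build_nested, nested_lookup) are hypothetical. It ignores the padleft-style overhead of splitting the key, and says nothing about memory.

```python
def switch_lookup(cases, key):
    """Scan (case, value) pairs in order; return (value, comparisons made)."""
    comparisons = 0
    for case, value in cases:
        comparisons += 1
        if case == key:
            return value, comparisons
    return None, comparisons


def build_flat(digits=3):
    # One switch with 10**digits cases keyed "000" .. "999".
    return [(f"{i:0{digits}d}", f"value-{i}") for i in range(10 ** digits)]


def build_nested(digits=3, prefix=""):
    # Ten cases per level; every non-leaf value is itself a ten-case switch.
    if digits == 0:
        return f"value-{int(prefix)}"
    return [(str(d), build_nested(digits - 1, prefix + str(d))) for d in range(10)]


def nested_lookup(cases, key):
    """Walk one switch level per character of key; return (value, comparisons)."""
    total = 0
    node = cases
    for ch in key:
        if node is None:  # an earlier level had no matching case
            break
        node, n = switch_lookup(node, ch)
        total += n
    return node, total


flat = build_flat()
nested = build_nested()
print(switch_lookup(flat, "999"))    # worst case for the flat form
print(nested_lookup(nested, "999"))  # worst case for the nested form
```

Under that assumption the worst case is 1000 comparisons for the flat switch and 30 for three nested levels of ten, so the "performance function" is roughly levels times cases per level, plus whatever it costs to split the key at each level.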
To get a memory usage advantage, you have to split the templates up into smaller data items, like what is done for the English Wikipedia country data, e.g.:
https://en.wikipedia.org/w/index.php?title=Template:Flag&action=edit
http://en.wikipedia.org/w/index.php?title=Template:Country_data_Canada
But it is a time/memory tradeoff. We saw short Olympics articles with rendering times in the tens of seconds due to heavy use of these flag templates. There is a time overhead to loading each template from the database.
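The tradeoff can be sketched with a toy model in Python. The class names and the figures (200 data items, 50 of them used on a page) are hypothetical, and the model assumes each item is fetched at most once per page render. It only counts loads and entries held in memory, not the real cost of a database fetch.

```python
class MonolithicData:
    """All data items live in one template: a single load brings in everything."""

    def __init__(self, source):
        self.source = source
        self.loads = 0
        self.table = None

    def get(self, key):
        if self.table is None:
            self.loads += 1                 # one fetch for the whole table
            self.table = dict(self.source)
        return self.table[key]

    def entries_in_memory(self):
        return len(self.table or {})


class SplitData:
    """One template per data item: each distinct key costs its own load."""

    def __init__(self, source):
        self.source = source
        self.loads = 0
        self.cache = {}

    def get(self, key):
        if key not in self.cache:
            self.loads += 1                 # one fetch per distinct item
            self.cache[key] = self.source[key]
        return self.cache[key]

    def entries_in_memory(self):
        return len(self.cache)


# Hypothetical workload: 200 data items, a page that uses 50 of them twice each.
source = {f"country-{i}": f"flag-{i}" for i in range(200)}
page = [f"country-{i}" for i in range(50)] * 2

mono, split = MonolithicData(source), SplitData(source)
for key in page:
    mono.get(key)
    split.get(key)

print(mono.loads, mono.entries_in_memory())
print(split.loads, split.entries_in_memory())
```

In this model the monolithic form does 1 load and holds 200 entries, while the split form does 50 loads and holds 50 entries: less memory, but the per-load overhead is paid many more times.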
> Another way would be, to implement a .split() function to transform a string into a list, at least; much better, to implement a JSON parsing of a JSON string, to get lists and dictionaries from strings saved into pages. I guess a dramatic improvement of performance; but I'm far from sure about.
I'm not sure how that would help. It sounds like you are describing the existing #switch except with a different syntax. Once you're finished parsing the JSON, you presumably have to store the lists and dictionaries in memory for use by the calling templates, and then you would have a similar memory usage to the Lua solution I discussed.
One extraordinary thing about these enormous data templates on the French Wikipedia is that they are not especially slow. The existing optimisations within the wikitext parser seem to work pretty well. We convert the #switch to XML and cache it in memcached, then for subsequent parse operations, it's a fast native XML parse operation followed by a tree traversal. We're seeing hundreds of megabytes of memory usage in 5-10 seconds of rendering time. If the templates were "nested" as you suggest, it would be even faster.
-- Tim Starling