On 21.09.2012, 11:47 Strainu wrote:
2012/9/21 Tim Starling tstarling@wikimedia.org:
On 21/09/12 16:06, Strainu wrote:
I'm just curious: would Lua improve memory usage in this use case?
Yes, it's an interesting question.
I tried converting that template with 37000 switch cases to a Lua array. Lua used 6.5MB for the chunk and then another 2.4MB to execute it, so 8.9MB in total compared to 47MB for wikitext. So it's an improvement, but we limit Lua memory to 50MB and you would hit that limit long before you loaded 15 such arrays.
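For readers who haven't seen such a template: the wikitext version is a giant #switch with one case per commune, and the Lua version replaces it with a single lookup table. A minimal sketch of that shape (in Python for brevity; the real module would be Lua, and the keys and population figures here are invented):

```python
# Illustrative only: the 37,000-case #switch becomes one big lookup
# table plus a tiny accessor. Entries and values are made up.
CENSUS = {
    "Paris": 2249975,
    "Lyon": 491268,
    # ... ~37,000 more entries in the real module ...
}

def population(commune, default=None):
    """Replaces the #switch: a single hash lookup instead of the
    parser scanning thousands of wikitext cases."""
    return CENSUS.get(commune, default)

print(population("Lyon"))  # 491268
```

The memory win Tim measured comes from storing the data once as a compiled chunk rather than as parsed wikitext nodes.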
I'm not sure how the Lua code would look, but perhaps you can tweak the loading of Lua templates so that the same code isn't loaded more than once? I'm not familiar with how MediaWiki (or is it PHP?) is linked to Lua right now, but I'm thinking along the lines of a C program that loads a library once and then uses it many times over.
And what if a page is related to France, Germany, and other European countries at once? Loading this information just once isn't enough - it needs to load only what is actually needed, otherwise smart Wikipedians will keep inventing creative ways to push the boundaries. :)
With such an approach, you would still use 6.5 + 15 * 2.4 = 42.5 MB of memory (assuming memory cannot be reused between calls).
It's still an O(N) solution. What we really want is to avoid loading the entire French census into memory every time someone wants to read an article about France.
Well, you said something about Wikidata. But even if the client wiki does not need to load the full census, can that be avoided on Wikidata?
(Mumbles something about databases that don't store all information in one row and don't always read all the rows at once)
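Tim's mumble points at the real fix for the O(N) problem: keep the census in storage that supports keyed reads, so rendering an article touches one row instead of materializing all 37,000 entries. A sketch with an in-memory SQLite table (table layout and population figures are invented for illustration):

```python
# Sketch of per-row access: read only the commune you need,
# instead of loading the entire census into memory.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE census (commune TEXT PRIMARY KEY, population INTEGER)"
)
conn.executemany(
    "INSERT INTO census VALUES (?, ?)",
    [("Paris", 2249975), ("Lyon", 491268), ("Marseille", 850636)],
)

# An article about Lyon touches one indexed row, not the whole table.
(population,) = conn.execute(
    "SELECT population FROM census WHERE commune = ?", ("Lyon",)
).fetchone()
print(population)  # 491268
```

Memory per page render is then proportional to the rows the article actually uses, which is the property the Wikidata approach is meant to provide.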