On Tue, Dec 8, 2015 at 9:35 AM, David Causse <dcausse@wikimedia.org> wrote:
• I think it would make sense to set this up as something that can keep the models in memory. I don't know enough about our PHP architecture to know if you can init a plugin and then keep it in memory for the duration. Seems plausible though. A service of some sort (doesn't have to be Perl-based) would also work. We need to think through the architectural bits.
Yes, that was the purpose of my question concerning init time overhead. The LM files seem to be already ordered, so it would be 15 TSV files of ~3 KB each to read on every query. If we rewrite it in PHP we could maybe write the profiles as a PHP script directly (it should be pretty small compared to the 500 KB of InitialiseSettings.php). But I'm no expert here.
Exporting a static PHP array (php.net/var_export) into a .php file will be our best bet for performance. HHVM has an optimization that takes advantage of copy-on-write semantics: it creates a single read-only instance in memory, shared between all execution threads, as long as the array is completely static (after constant folding and such). This basically means there is no parsing or loading; the array just exists in memory, ready to use.
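To make that concrete, here is a minimal sketch of the export step. The profile contents, the `lm_profiles.php` file name, and the temp-dir location are all made up for illustration; the real data would come from the existing LM files, and nothing here is the actual plugin code.

```php
<?php
// Hypothetical n-gram profiles keyed by language code (n-gram => rank).
// Placeholder values only; real profiles would be built from the LM files.
$profiles = [
    'en' => ['_th' => 0, 'the' => 1, 'he_' => 2],
    'fr' => ['_de' => 0, 'es_' => 1, 'de_' => 2],
];

// Build step: write a .php file whose only job is to return a static
// array literal. var_export() with true returns the literal as a string.
$file = sys_get_temp_dir() . '/lm_profiles.php'; // assumed location
$code = "<?php\nreturn " . var_export($profiles, true) . ";\n";
file_put_contents($file, $code, LOCK_EX);

// Query time: a plain require hands back the array, with no TSV parsing.
$loaded = require $file;
```

The generated file contains nothing but literals, which is the "completely static" condition mentioned above. Whether HHVM actually shares it across threads in our setup is something we would still need to verify.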