Hashar noticed that for simple string literals, single quotes are faster than double quotes, if you count both parse time and execution time. This is reasonably easy to understand: single quotes have only two special characters, \ and ', so the loop body to process them would be simpler. Some people went on to assert that by extension $foo.' '.$bar is faster than "$foo $bar", without doing any benchmarks. I was never convinced of this.
I did a benchmark where I had one file which looked like this:
<?php $foo = 'foo'; $bar = 'bar'; echo " $foo $bar $foo $bar $foo $bar $foo $bar $foo $bar $foo $bar $foo $bar ... ?>
And one file that looked like this:
<?php $foo = 'foo'; $bar = 'bar'; echo $foo.' '.$bar.' '.$foo.' '.$bar.' '.$foo.' '.$bar.' '.$foo.' '.$bar. ... ?>
There were 10000 "foo bar" strings output by each one. On my laptop, with no bytecode caching, string interpolation was 12 times faster than concatenation.
Document Path:          /test/concatenation.php
Document Length:        80001 bytes

Concurrency Level:      1
Time taken for tests:   7.420670 seconds
Complete requests:      10
Failed requests:        0
Write errors:           0
Total transferred:      801670 bytes
HTML transferred:       800010 bytes
Requests per second:    1.35 [#/sec] (mean)
Time per request:       742.067 [ms] (mean)
Time per request:       742.067 [ms] (mean, across all concurrent requests)
Transfer rate:          105.38 [Kbytes/sec] received
Document Path:          /test/string_interpolation.php
Document Length:        82242 bytes

Concurrency Level:      1
Time taken for tests:   0.610879 seconds
Complete requests:      10
Failed requests:        0
Write errors:           0
Total transferred:      824080 bytes
HTML transferred:       822420 bytes
Requests per second:    16.37 [#/sec] (mean)
Time per request:       61.088 [ms] (mean)
Time per request:       61.088 [ms] (mean, across all concurrent requests)
Transfer rate:          1316.14 [Kbytes/sec] received
The slightly longer filesize in the string_interpolation.php case was because I used line breaks embedded in the string to break up powers of 10, whereas in concatenation.php the line breaks were not in the strings.
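For anyone wanting to repeat a comparison of this shape without hand-writing the two files, here is a minimal sketch. It is not the script used for the numbers above: buildSources() and timeEval() are hypothetical helpers, eval() stands in for requesting a file with no bytecode cache (so each run pays for both compilation and execution), and the repeat count is set to 1000 rather than 10000 to keep the concatenation chain short.

```php
<?php
// Hypothetical helper: builds PHP source for both variants.
// Each variant prints "foo bar " exactly $n times.
function buildSources(int $n): array
{
    $setup = '$foo = \'foo\'; $bar = \'bar\'; ';
    // One long double-quoted literal: echo "$foo $bar $foo $bar ... ";
    $interp = $setup . 'echo "' . str_repeat('$foo $bar ', $n) . '";';
    // One long chain: echo $foo.' '.$bar.' '.$foo.' '.$bar.' ' ... ;
    $chain = implode('.', array_fill(0, $n, '$foo.\' \'.$bar.\' \''));
    $concat = $setup . 'echo ' . $chain . ';';
    return ['interpolation' => $interp, 'concatenation' => $concat];
}

// Hypothetical helper: compiles and runs $source with eval(),
// capturing the output. Returns [output, elapsed seconds].
function timeEval(string $source): array
{
    ob_start();
    $start = microtime(true);
    eval($source);
    $elapsed = microtime(true) - $start;
    return [ob_get_clean(), $elapsed];
}

// Assumption: 1000 repeats for the demo; the benchmark above used 10000.
foreach (buildSources(1000) as $name => $source) {
    [$output, $seconds] = timeEval($source);
    printf("%-14s %6d bytes of output, %.3f ms\n", $name, strlen($output), $seconds * 1000);
}
```

The absolute timings will depend on the PHP version and machine, so treat them as a way to compare the two variants on one setup and nothing more.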
The main reason I'm doing this is because I think $foo.' '.$bar looks ugly compared to "$foo $bar", and it also takes up more screen space. I doubt the performance gain would be significant either way, concatenation takes only 19us. I just don't like seeing the MediaWiki codebase uglied up for specious reasons.
-- Tim Starling
On Saturday, 29 July 2006 12:29, Tim Starling wrote:
> There were 10000 "foo bar" strings output by each one. On my laptop, with no bytecode caching, string interpolation was 12 times faster than concatenation.
I believe this is because you are concatenating a large number of variables together into one string, which is quite unrealistic for a real-world application. The more frequent case is where you have only a few variables:
$foo = 'foo';
$bar = 'bar';
$start = microtime(TRUE);

for($j = 0; $j < 100; $j++)
    for($i = 0; $i < 100000; $i++)
        $s = $foo.' '.$bar;
        // swap in the next line to time interpolation instead
        //$s = "$foo $bar";

$stop = microtime(TRUE);
echo $stop-$start;
On my laptop this example needs about 14 secs with interpolation, but only 9-10 secs with concatenation.
Marc Schütz wrote:
> On Saturday, 29 July 2006 12:29, Tim Starling wrote:
> > There were 10000 "foo bar" strings output by each one. On my laptop, with no bytecode caching, string interpolation was 12 times faster than concatenation.
> I believe this is because you are concatenating a large number of variables together into one string, which is quite unrealistic for a real-world application. The more frequent case is where you have only a few variables:
> $foo = 'foo'; $bar = 'bar'; $start = microtime(TRUE);
> for($j = 0; $j < 100; $j++) for($i = 0; $i < 100000; $i++) $s = $foo.' '.$bar; //$s = "$foo $bar";
> $stop = microtime(TRUE); echo $stop-$start;
> On my laptop this example needs about 14 secs with interpolation, but only 9-10 secs with concatenation.
That test measures execution time, not parse time. You're measuring 1us per iteration for concatenation and 1.4us per iteration for interpolation, for execution only. I'm measuring 74us per iteration for concatenation and 6us for interpolation, for parse and execution combined. Hashar's justification for replacing double quotes with single quotes throughout the MediaWiki codebase was to speed up execution for environments with no oparray caching, since the measured difference for oparray cache hits was said to be negligible.
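The per-iteration figures above follow from dividing each total time by the number of strings or loop iterations. A small sketch of that arithmetic, using only the numbers already quoted in this thread (microsPerIteration() is a hypothetical helper name, and 10 seconds is taken as a round value for the reported 9-10 seconds):

```php
<?php
// Per-iteration cost in microseconds = total seconds * 1e6 / iterations.
function microsPerIteration(float $totalSeconds, int $iterations): float
{
    return $totalSeconds * 1e6 / $iterations;
}

// ab results: mean time per request, with 10000 "foo bar" strings per request.
printf("concatenation, parse + execute: %.1f us\n", microsPerIteration(0.742067, 10000)); // 74.2
printf("interpolation, parse + execute: %.1f us\n", microsPerIteration(0.061088, 10000)); // 6.1

// Loop test: 100 * 100000 iterations, execution only.
printf("interpolation, execute only:    %.1f us\n", microsPerIteration(14.0, 100 * 100000)); // 1.4
printf("concatenation, execute only:    %.1f us\n", microsPerIteration(10.0, 100 * 100000)); // 1.0
```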
Which brings me to this:
Jay R. Ashworth wrote:
> "If a programmer can simulate a construct more efficiently than the compiler can implement it, then the compiler writer has blown it *badly*". --Guy L Steele, in Harbison & Steele.
It's not so easy to simultaneously optimise compile speed and execution speed. Improving one often means trading off the other. An intermediate representation, even one hand-written in the source language, can escape that tradeoff.
-- Tim Starling
On Sat, Jul 29, 2006 at 07:41:38PM -0400, Tim Starling wrote:
> Which brings me to this:
> Jay R. Ashworth wrote:
> > "If a programmer can simulate a construct more efficiently than the compiler can implement it, then the compiler writer has blown it *badly*". --Guy L Steele, in Harbison & Steele.
> It's not so easy to simultaneously optimise compile speed and execution speed. Improving one often means trading off the other. An intermediate representation, even one hand-written in the source language, can escape that tradeoff.
I believe Steele's implication (from the original context, which I don't think I could find easily) was WRT final execution speed, rather than how long the code took to compile, Tim. But I see your point, on further reflection.
Cheers, -- jra
Tim Starling wrote:
> The main reason I'm doing this is because I think $foo.' '.$bar looks ugly compared to "$foo $bar", and it also takes up more screen space. I doubt the performance gain would be significant either way, concatenation takes only 19us. I just don't like seeing the MediaWiki codebase uglied up for specious reasons.
While I agree that if interpolation looks nicer then concatenation, you should definitely use it, I don't think the test results mean very much. String interpolation used to be quite slow, but the PHP team managed to speed it up quite a bit from 4.2 to 4.3 to 5 (Schlossnagle 2004).
I think it's highly dependent on server configuration: you should run the test on a production Wikimedia server and see how it fares there.
On Sat, Jul 29, 2006 at 11:08:51AM -0400, Edward Z. Yang wrote:
> Tim Starling wrote:
> > The main reason I'm doing this is because I think $foo.' '.$bar looks ugly compared to "$foo $bar", and it also takes up more screen space. I doubt the performance gain would be significant either way, concatenation takes only 19us. I just don't like seeing the MediaWiki codebase uglied up for specious reasons.
> While I agree that if interpolation looks nicer then concatenation, you should definitely use it, I don't think the test results mean very much. String interpolation used to be quite slow, but the PHP team managed to speed it up quite a bit from 4.2 to 4.3 to 5 (Schlossnagle 2004).
"If a programmer can simulate a construct more efficiently than the compiler can implement it, then the compiler writer has blown it *badly*". --Guy L Steele, in Harbison & Steele.
"Premature optimization is the root of all evil." --Hoare, via Knuth
Cheers, -- jra
Edward Z. Yang wrote:
> While I agree that if interpolation looks nicer then concatenation, you should definitely use it, I don't think the test results mean very much. String interpolation used to be quite slow, but the PHP team managed to speed it up quite a bit from 4.2 to 4.3 to 5 (Schlossnagle 2004).
Argh, that was phrased poorly. What I meant to say is:
I agree that interpolation looks nicer than concatenation, and you should use it whenever it improves readability.
However, I don't think the test results mean very much. (insert rest of message).
On Sat, Jul 29, 2006 at 08:29:31PM +1000, Tim Starling wrote:
> Hashar noticed that for simple string literals, single quotes are faster than double quotes, if you count both parse time and execution time. This is reasonably easy to understand: single quotes have only two special characters, \ and ', so the loop body to process them would be simpler. Some people went on to assert that by extension $foo.' '.$bar is faster than "$foo $bar", without doing any benchmarks. I was never convinced of this.
I'd never bothered to consider whether concatenation or string interpolation was slower -- I just knew that both would slow things down. Nice to know someone checked on it, at least for academic purposes.
> I did a benchmark where I had one file which looked like this:
[ snip a bunch of data and explanation ]
> The slightly longer filesize in the string_interpolation.php case was because I used line breaks embedded in the string to break up powers of 10, whereas in concatenation.php the line breaks were not in the strings.
> The main reason I'm doing this is because I think $foo.' '.$bar looks ugly compared to "$foo $bar", and it also takes up more screen space. I doubt the performance gain would be significant either way, concatenation takes only 19us. I just don't like seeing the MediaWiki codebase uglied up for specious reasons.
Excellent point.
However . . .
This is almost certainly the fastest option, when using echo (which is faster than print, if I'm not mistaken, though it doesn't work as part of a more complex enclosing expression):
echo $foo, ' ', $bar;
One can feed multiple strings to echo for printing via the comma without having to wait through the concatenation operation, similarly to the way the print function works in Perl. Of course, I'm guessing: for all I know, PHP might be so obtuse as to treat multiple arguments to echo like a loop, and restart an expensive operation for every argument.
Considering I don't really use PHP when speed is a concern, though, it's unsurprising I haven't bothered to benchmark it.
On Sat, Jul 29, 2006 at 12:19:03PM -0600, Chad Perrin wrote:
> This is almost certainly the fastest option, when using echo (which is faster than print, if I'm not mistaken, though it doesn't work as part of a more complex enclosing expression):
> echo $foo, ' ', $bar;
> One can feed multiple strings to echo for printing via the comma without having to wait through the concatenation operation, similarly to the way the print function works in Perl. Of course, I'm guessing: for all I know, PHP might be so obtuse as to treat multiple arguments to echo like a loop, and restart an expensive operation for every argument.
> Considering I don't really use PHP when speed is a concern, though, it's unsurprising I haven't bothered to benchmark it.
Curiosity got the better of me, so I did a quick, unscientific benchmark using the unix time utility. Results, with script output cut out, on a script that prints "foo bar" 1499 times, once with comma separation, once with string concatenation, and once with a single interpolated string:
$ time php -f comma.php
real    0m0.546s
user    0m0.151s
sys     0m0.070s

$ time php -f concat.php
real    0m0.351s
user    0m0.173s
sys     0m0.025s

$ time php -f interp.php
real    0m0.286s
user    0m0.108s
sys     0m0.025s
So, apparently, I got the opposite of what I would have expected. I think the PHP interpreter is actually running a separate echo routine for every comma-separated string. Interpolation is fastest.
I didn't add any linebreaks or spaces for code readability, so this is about as direct a performance comparison as I'm likely to get from the time utility, and this data set, I suppose.
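The three variants compared here can also be timed inside a single process, which removes interpreter startup from the picture. The following is only a sketch under stated assumptions: timeVariant() and the closure names are hypothetical, output is captured with output buffering instead of being cut out by hand, and the loop measures execution only, so it is not equivalent to the time-based runs above, which also include startup and parsing of each script.

```php
<?php
// Hypothetical helper: runs $variant $repeats times with output captured.
// Returns [captured output, elapsed seconds].
function timeVariant(callable $variant, int $repeats): array
{
    ob_start();
    $start = microtime(true);
    for ($i = 0; $i < $repeats; $i++) {
        $variant('foo', 'bar');
    }
    $elapsed = microtime(true) - $start;
    return [ob_get_clean(), $elapsed];
}

// The three ways of printing "foo bar" discussed in this thread.
$variants = [
    'comma'  => function ($foo, $bar) { echo $foo, ' ', $bar; },
    'concat' => function ($foo, $bar) { echo $foo . ' ' . $bar; },
    'interp' => function ($foo, $bar) { echo "$foo $bar"; },
];

// Assumption: 1499 repeats, matching the count mentioned above.
foreach ($variants as $name => $variant) {
    [$output, $seconds] = timeVariant($variant, 1499);
    printf("%-6s %5d bytes of output, %.3f ms\n", $name, strlen($output), $seconds * 1000);
}
```

With runs this short the differences are likely to sit within measurement noise, so repeating the loop several times and comparing the spread is advisable before drawing conclusions.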
On Sat, Jul 29, 2006 at 01:04:53PM -0600, Chad Perrin wrote:
> Curiosity got the better of me, so I did a quick, unscientific benchmark using the unix time utility. Results, with script output cut out, on a script that prints "foo bar" 1499 times, once with comma separation, once with string concatenation, and once with a single interpolated string:
> $ time php -f comma.php
> real    0m0.546s
> user    0m0.151s
> sys     0m0.070s
> $ time php -f concat.php
> real    0m0.351s
> user    0m0.173s
> sys     0m0.025s
> $ time php -f interp.php
> real    0m0.286s
> user    0m0.108s
> sys     0m0.025s
. . . and just for the sake of completeness, I decided to run the interpolation test with print instead of echo to double-check my belief that echo is faster. The two actually seem to be roughly equivalent (execution times returned by the time utility vary by more than the difference between interp.php and print.php). Print, of course, doesn't allow the comma syntax, but since the comma syntax is the slowest option I don't think that's really a problem -- though comma-separated output can be formatted in a slightly less obnoxious-looking manner than concatenation. Very slightly.
$ time php -f print.php
real    0m0.285s
user    0m0.101s
sys     0m0.032s
Chad,
> $ time php -f comma.php
I'm sure you understand that most of our string operations are not happening in print/echo ;-)
BR,
On Sat, Jul 29, 2006 at 10:24:26PM +0300, Domas Mituzas wrote:
> Chad,
> > $ time php -f comma.php
> I'm sure you understand that most of our string operations are not happening in print/echo ;-)
Yes . . .
. . . and?
Concatenation occurs before print or echo operates on the string. I wasn't sure whether echo operated on each argument in a comma-separated list individually or as an aggregate whole, though judging by the slowness of it I'm inclined to believe the former now. Interpolation occurs outside of print or echo, and the resulting interpolated string output is operated upon by print or echo.
I'm not sure what your point was.
On Sat, Jul 29, 2006 at 10:36:07PM +0200, Platonides wrote:
> "Chad Perrin" wrote:
> > I'm not sure what your point was.
> The point was that even if echo with commas were 200% faster, MediaWiki couldn't use it, as internal functions don't use echo/print, but return data to an upper level, which passes it to another function, which calls $wgOut....
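To make the return-instead-of-echo pattern concrete, here is a minimal sketch. None of it is MediaWiki code: PageOutput is a hypothetical stand-in for an output object such as $wgOut, and formatName() and renderGreeting() are invented names used only for illustration.

```php
<?php
// Hypothetical stand-in for an output object such as $wgOut.
// This is NOT the real MediaWiki class or its API.
class PageOutput
{
    private string $html = '';

    public function addHTML(string $fragment): void
    {
        $this->html .= $fragment;
    }

    public function getHTML(): string
    {
        return $this->html;
    }
}

// Inner function: builds a string and returns it, never echoes.
// The comma form of echo has no equivalent here, so the choice is
// between interpolation and concatenation.
function formatName(string $foo, string $bar): string
{
    return "$foo $bar";
}

// Upper-level function: passes the returned string on to the output object.
function renderGreeting(PageOutput $out, string $foo, string $bar): void
{
    $out->addHTML('<p>' . formatName($foo, $bar) . '</p>');
}

$out = new PageOutput();
renderGreeting($out, 'foo', 'bar');
echo $out->getHTML(), "\n"; // the only echo happens at the very top level
```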
Ahhh, that makes sense. Okay, duly noted.
Meanwhile, I got carried away and did more pseudo-benchmarking. Such fun.
wikitech-l@lists.wikimedia.org