On Wed, 29 May 2013 18:33:03 -0700, Tyler Romeo <tylerromeo(a)gmail.com>
wrote:
On Wed, May 29, 2013 at 9:26 PM, Tim Starling
<tstarling(a)wikimedia.org> wrote:
37% for the larger replacement array in Html::expandAttributes(), or
for the smaller one in Html::element()? And what was the test case
size: how many replaced bytes compared to non-replaced bytes?
If it was the strtr() in Html::element(), which is the only one which
gives a size reduction, perhaps you should compare it against
htmlspecialchars($s, ENT_NOQUOTES), which should use the same
algorithm as plain htmlspecialchars() but with the same size reduction
as strtr().
Ran another test. I tested on the string
'<&<&<&herllowodsiojgd<&sd<^<6&&"""' repeated 50 times, and I ran the
replacement function 500,000 times. The results were:
htmlspecialchars with ENT_NOQUOTES: 14.025s
htmlspecialchars without ENT_NOQUOTES: 13.457s
strtr: 24.842s
str_replace: 13.184s
Of course, these numbers tend to vary +/- 0.25s every time, so take them
with a grain of salt.
--
Tyler Romeo
Stevens Institute of Technology, Class of 2016
Major in Computer Science
www.whizkidztech.com | tylerromeo(a)gmail.com
These stats look a little less like "htmlspecialchars is the most
efficient, so we should use it" and a little more like "strtr is
implemented inefficiently, so we should try one of the other methods of
string replacement".
Reading up online:
* http://stackoverflow.com/questions/8177296/when-to-use-strtr-vs-str-replace
* http://micro-optimization.com/strtr-vs-str_replace
* http://comments.gmane.org/gmane.comp.php.devel/77397
I get the impression that:
* strtr walks the string once and replaces matches as it goes, while
str_replace applies each search/replace pair in order over the whole
string, as if you had called str_replace once per pair
* strtr can safely do an `a -> b, b -> a` replacement where 'abb' becomes
'baa' while str_replace cannot
* strtr's algorithm may be even slower when the strings to be replaced are
of varying sizes
* strtr is going to be faster in PHP 5.4 as they've changed the algorithm
it uses
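
To make the second point concrete, here is a minimal sketch of the swap case. The variable names are just for illustration:

```php
<?php
// strtr() with an array does not rescan text it has already replaced,
// so a two-way swap behaves as expected.
$swapped = strtr('abb', array('a' => 'b', 'b' => 'a'));

// str_replace() applies the pairs one after another over the whole
// string: first every 'a' becomes 'b' ('bbb'), then every 'b' becomes
// 'a', including the ones just produced.
$clobbered = str_replace(array('a', 'b'), array('b', 'a'), 'abb');

echo $swapped, "\n";   // baa
echo $clobbered, "\n"; // aaa
```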
We aren't doing any replacements that need strtr's guarantee. As long as
our & -> &amp; replacement is the first replacement in str_replace's array
then it should work exactly as we need it.
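
A quick sketch of why the ordering matters, assuming a two-pair replacement of & and < (the sample text and variable names are made up for illustration):

```php
<?php
$text = 'a < b & c';

// '&' first: the ampersands introduced by later pairs are never
// revisited, so the result matches what strtr() would give.
$ampFirst = str_replace(array('&', '<'), array('&amp;', '&lt;'), $text);

// '&' last: the '&' in the freshly inserted '&lt;' gets escaped again.
$ampLast = str_replace(array('<', '&'), array('&lt;', '&amp;'), $text);

// strtr() for comparison; the order of its pairs does not matter here.
$viaStrtr = strtr($text, array('<' => '&lt;', '&' => '&amp;'));

echo $ampFirst, "\n"; // a &lt; b &amp; c
echo $ampLast, "\n";  // a &amp;lt; b &amp; c
echo $viaStrtr, "\n"; // a &lt; b &amp; c
```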
So it looks like we should just be replacing most of our strtr uses with
str_replace instead.
Also, I'd be interested to see those benchmarks re-run on PHP 5.4 now that
we know that they changed the algorithm.
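
Something along these lines could be used for the re-run. This is only a sketch, not the script Tyler actually used: the replacement pairs are my assumption (the thread doesn't state them), and the iteration count is lowered from the 500,000 quoted above so that it finishes quickly.

```php
<?php
// Test string from the earlier message, repeated 50 times.
$input = str_repeat('<&<&<&herllowodsiojgd<&sd<^<6&&"""', 50);

// Assumption: lowered from the 500,000 iterations quoted above.
$iterations = 5000;

// Assumption: illustrative pairs; the original test's pairs are not stated.
$search  = array('&', '<');
$replace = array('&amp;', '&lt;');
$pairs   = array('&' => '&amp;', '<' => '&lt;');

$candidates = array(
    'htmlspecialchars with ENT_NOQUOTES' => function ($s) {
        return htmlspecialchars($s, ENT_NOQUOTES);
    },
    'htmlspecialchars without ENT_NOQUOTES' => function ($s) {
        return htmlspecialchars($s);
    },
    'strtr' => function ($s) use ($pairs) {
        return strtr($s, $pairs);
    },
    'str_replace' => function ($s) use ($search, $replace) {
        return str_replace($search, $replace, $s);
    },
);

// Time each candidate; every one pays the same closure-call overhead.
$results = array();
foreach ($candidates as $label => $fn) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    $results[$label] = microtime(true) - $start;
}

foreach ($results as $label => $seconds) {
    printf("%s: %.3fs (PHP %s)\n", $label, $seconds, PHP_VERSION);
}
```

Note that the four candidates don't produce identical output (htmlspecialchars also escapes > and, without ENT_NOQUOTES, double quotes), so this only compares speed, not results.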
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]