On Tue, Aug 3, 2010 at 8:12 PM, soxred93 <soxred93(a)gmail.com> wrote:
> (just remember that it's 1.5 to 5 times slower,
> like I said earlier.
> Whether or not that's an issue will have to be decided by higher powers)
This is not a question that has to be decided by specially-appointed
performance gurus -- just do some quick testing, like so:
$ echo '<?php $str = str_repeat( "aאπ", 200000000 );
$start = microtime( true );
mb_strlen( $str );
var_dump( microtime( true ) - $start );' | php
float(1.1920928955078E-5)
Note that this string is one *billion* bytes long (five bytes per
"aאπ" repetition times 200,000,000), and the mb_strlen() still takes
only about 10 *microseconds*. If you look at our own mb_strlen()
implementation, the only non-O(1) part is count_chars(), and for that
we find:
$ echo '<?php $str = str_repeat( "aאπ", 200000000 );
$start = microtime( true );
count_chars( $str );
var_dump( microtime( true ) - $start );' | php
float(1.8740479946136)
I.e., less than two seconds for a one-billion-byte string. This is
about 100,000 times worse than native mb_strlen(), and about 200,000
times worse than strlen(), but on a sub-megabyte article, it's still
only a millisecond or so in absolute terms.
In the future, remember that you can run this kind of
order-of-magnitude performance assessment yourself very easily. You
*have* to, in order to write code that performs decently -- you can't
just push all performance considerations off to reviewers.
Thankfully, this kind of performance question is easy to answer.
Things that involve nontrivial scalability, like database operations,
are considerably harder -- there you do need specific expertise to
estimate performance easily -- but that's not the case here.