Speaking of AntiSpoof, there is a freshly-opened bug that could use attention: https://bugzilla.wikimedia.org/show_bug.cgi?id=19273
Thanks, -Mike
On Sat, 2009-06-20 at 10:39 +0100, Neil Harris wrote:
Andrew Dunbar wrote:
2009/6/20 Jaska Zedlik jz53zc@gmail.com:
Hello, On Fri, Jun 19, 2009 at 20:31, Rolf Lampa rolf.lampa@rilnet.com wrote:
Jaska Zedlik skrev: <...>
The code of the override function is the following:
function stripForSearch( $string ) {
    $s = $string;
    // Replace U+2019 (right single quotation mark) with an ASCII apostrophe.
    $s = preg_replace( '/\xe2\x80\x99/', '\'', $s );
    return parent::stripForSearch( $s );
}
I'm not a PHP programmer, but why use the extra assignment to $s instead of passing $string directly to preg_replace, like so:
function stripForSearch( $string ) {
    $s = preg_replace( '/\xe2\x80\x99/', '\'', $string );
    return parent::stripForSearch( $s );
}
You are right: in the real function all these redundant assignments should be stripped for performance. I just used the framework from the Japanese language class, which does some Japanese-specific reduction, but I agree with your remark.
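For reference, the reduction discussed above can be taken one step further by dropping $s altogether. The following is only a sketch: BaseLanguage and ApostropheLanguage are hypothetical stand-ins (the real parent::stripForSearch() does more than lowercase), and str_replace() is swapped in for preg_replace() because the pattern is a fixed byte sequence.

```php
<?php
// Hypothetical stand-in for the real parent language class.
class BaseLanguage {
    function stripForSearch( $string ) {
        return strtolower( $string );
    }
}

// Hypothetical subclass showing the override with no temporary variable.
class ApostropheLanguage extends BaseLanguage {
    function stripForSearch( $string ) {
        // Replace U+2019 (UTF-8 bytes E2 80 99) with an ASCII apostrophe,
        // then hand the result straight to the parent.
        return parent::stripForSearch(
            str_replace( "\xe2\x80\x99", "'", $string )
        );
    }
}

$lang = new ApostropheLanguage();
echo $lang->stripForSearch( "Rock\xe2\x80\x99n\xe2\x80\x99Roll" ), "\n";
```

Whether the saved assignment matters in practice is doubtful; the shorter form is mainly easier to read.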
The username anti-spoofing code already knows about a lot of "looks similar" characters which may be of some help.
Andrew Dunbar (hippietrail)
On its own, the username anti-spoofing code table -- which I originally wrote -- is rather too thorough for this purpose. It deliberately errs on the side of mapping even vaguely similar-looking characters to one another, regardless of character type and script system. Combined with case-folding and transitivity, this leads to some apparently bizarre mappings that are of no practical use for any other application.
If you're interested, I can take a look at producing a more limited punctuation-only version.
-- Neil
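As a rough illustration of what a punctuation-only version might look like, here is a minimal sketch. The table entries, $punctuationMap, and foldPunctuation() are all hypothetical and are not taken from the AntiSpoof table; the point is only that a small one-step map avoids the case-folding and transitivity effects described above.

```php
<?php
// Hypothetical punctuation-only table: each key is the UTF-8 encoding of a
// look-alike character, each value the ASCII character it is folded to.
// These entries are illustrative, not copied from AntiSpoof.
$punctuationMap = array(
    "\xe2\x80\x98" => "'",  // U+2018 LEFT SINGLE QUOTATION MARK
    "\xe2\x80\x99" => "'",  // U+2019 RIGHT SINGLE QUOTATION MARK
    "\xe2\x80\x9c" => '"',  // U+201C LEFT DOUBLE QUOTATION MARK
    "\xe2\x80\x9d" => '"',  // U+201D RIGHT DOUBLE QUOTATION MARK
    "\xe2\x80\x90" => '-',  // U+2010 HYPHEN
    "\xe2\x80\x93" => '-',  // U+2013 EN DASH
);

function foldPunctuation( $string, $map ) {
    // strtr() with an array never rescans text it has already replaced,
    // so each character is mapped at most once (no transitive chains).
    return strtr( $string, $map );
}

echo foldPunctuation( "\xe2\x80\x9cdon\xe2\x80\x99t\xe2\x80\x9d", $punctuationMap ), "\n";
```

Letters and case are left untouched, so a function like this could run before the existing stripForSearch() logic without affecting it.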