Last question (I believe): I've implemented something similar to Php72ToUpper in WPCleaner, and it seems to work fine for removing false positives. I have only one left on frwiki: ⅷ https://fr.wikipedia.org/w/index.php?title=%E2%85%B7&redirect=no. My code still converts it to uppercase, but on frwiki there is one page for the lowercase letter and one page for the uppercase letter, so this letter is not converted to uppercase by the current MediaWiki version. Is an entry for it missing from Php72ToUpper, to prevent it from being converted with PHP 7.2?
Nico
On Mon, Aug 5, 2019 at 8:45 AM Nicolas Vervelle nvervelle@gmail.com wrote:
Thanks Giuseppe !
I've subscribed to T219279 to know when the pages are properly converted, and when I can remove the hack in my code.
Nico
On Mon, Aug 5, 2019 at 7:03 AM Giuseppe Lavagetto <glavagetto@wikimedia.org> wrote:
On Sun, Aug 4, 2019 at 11:34 AM Nicolas Vervelle nvervelle@gmail.com wrote:
Thanks Brian,
Thanks for the link to Php72ToUpper.php! I think I understand it now: for example, the first line says 'ƀ' => 'ƀ', which should mean that this letter shouldn't be converted to uppercase by MW? That's one of the letters I found that wasn't converted to uppercase and that was generating a false positive in my code, so it's because specific MW code is preventing the conversion :-)
Hi!
No, that file is a temporary measure during a transition between two versions of php.
In HHVM and PHP 5.x, calling mb_strtoupper("ƀ") would give the erroneous result "ƀ".
In PHP 7.x, the result is the correct capitalization.
The issue is that the titles of wiki articles get normalized, so under php7 we would have
ƀar => Ƀar
which would prevent you from being able to reach the page.
Once we're done with the transition and we go through the process of converting the (several hundred) pages/users that have the wrong title normalization, we will remove that table, and obtain the correct behaviour.
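To illustrate the mechanism described above, here is a minimal sketch in Python (not the actual MediaWiki code). It assumes the override table maps a character to the result the old runtime produced, and that the table is consulted before the runtime's own uppercase function. The names LEGACY_UPPER_OVERRIDES and ucfirst_title are hypothetical, and the table holds only the one entry quoted in this thread.

```python
# Hypothetical override table in the spirit of Php72ToUpper.php.
# Each key maps to the value the OLD runtime (HHVM / PHP 5.x) returned,
# so titles stored under the old normalization stay reachable.
LEGACY_UPPER_OVERRIDES = {
    "ƀ": "ƀ",  # the entry quoted earlier in this thread
}


def ucfirst_title(title, overrides=LEGACY_UPPER_OVERRIDES):
    """Uppercase the first character of a title, honouring the overrides.

    Python's str.upper() stands in here for a runtime with the correct
    Unicode mapping (the PHP 7.x behaviour described above).
    """
    if not title:
        return title
    first, rest = title[0], title[1:]
    # Use the legacy result when one is listed, else the runtime's mapping.
    return overrides.get(first, first.upper()) + rest


if __name__ == "__main__":
    # With the table in place the title is left as it was stored.
    print(ucfirst_title("ƀar"))
    # With an empty table (after the planned cleanup) the title is capitalized.
    print(ucfirst_title("ƀar", overrides={}))
```

Passing an empty table corresponds to the state after the transition, when the table is removed and the affected pages have been renamed.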
You just need to subscribe to https://phabricator.wikimedia.org/T219279 and wait for its resolution, I think - most Unicode horrors are fixed in recent versions of PHP, including the one you were citing.
Cheers,
Giuseppe
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation