Thanks Yuri,
I know about the normalization done through the API, but it doesn't work for my use case: it's a dump analysis, and I want it to work offline...
Nico
On Sun, Aug 4, 2019 at 2:12 AM Yuri Astrakhan yuriastrakhan@gmail.com wrote:
Hi Nico, if possible, can your tool actually use the MW API to normalize titles? It's a very quick API call, and you can do multiple titles at once, but it will save you a lot of grief over incompatibilities. --Yuri
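For reference, a minimal sketch of what such a batched call could look like, assuming the standard `action=query` module, which reports title changes in a `normalized` list of `from`/`to` pairs. The frwiki endpoint and the helper names `build_normalize_url` and `extract_normalized` are illustrative assumptions, and the HTTP request itself is left out:

```python
from urllib.parse import urlencode

# Assumed endpoint for frwiki; other wikis expose their own api.php.
API_URL = "https://fr.wikipedia.org/w/api.php"


def build_normalize_url(titles, api_url=API_URL):
    """Build one query URL asking the API about several titles at once."""
    params = {
        "action": "query",
        "format": "json",
        # Multiple titles are separated by a pipe character.
        "titles": "|".join(titles),
    }
    return api_url + "?" + urlencode(params)


def extract_normalized(response):
    """Map each original title to its normalized form.

    `response` is the decoded JSON body; titles the API left unchanged
    do not appear in the "normalized" list and are therefore absent here.
    """
    entries = response.get("query", {}).get("normalized", [])
    return {entry["from"]: entry["to"] for entry in entries}
```

The URL can be fetched with any HTTP client; only titles that the wiki actually changed come back in the mapping.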
On Sat, Aug 3, 2019 at 10:57 AM Nicolas Vervelle nvervelle@gmail.com wrote:
Hello,
On most wikis, MediaWiki is configured to convert the first letter of a title to uppercase, but apparently it does not convert every Unicode character: for example, on frwiki ɽ https://fr.wikipedia.org/w/index.php?title=%C9%BD&redirect=no is a different article than Ɽ https://fr.wikipedia.org/wiki/%E2%B1%A4, even though the second character is the uppercase version of the first one in Unicode.
So, which characters are actually converted to uppercase by the title normalization?
I need to know this information to stop reporting some false positives in WPCleaner https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:WPCleaner.
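The mismatch described above can be reproduced offline with a short sketch. It decodes the two titles from the URLs and applies a naive "uppercase the first character" rule based on Python's built-in Unicode case mapping. The helper `naive_ucfirst` is hypothetical and is not MediaWiki's actual rule; it only illustrates why a tool relying on full Unicode case mapping would treat the two titles as the same page:

```python
from urllib.parse import unquote

# The two titles, decoded from the percent-encoded URLs quoted above.
lower_title = unquote("%C9%BD")      # U+027D, small letter r with tail
upper_title = unquote("%E2%B1%A4")   # U+2C64, capital letter R with tail


def naive_ucfirst(title):
    """Uppercase the first character using Python's Unicode case mapping.

    Illustrative only: frwiki keeps these two titles distinct, so the
    wiki's own normalization evidently does not apply this mapping here.
    """
    return title[:1].upper() + title[1:]


# Full Unicode case mapping folds the two titles together ...
same_under_unicode = naive_ucfirst(lower_title) == upper_title
# ... even though they are different strings, and different articles on frwiki.
distinct_as_written = lower_title != upper_title
```

Here both `same_under_unicode` and `distinct_as_written` are true, which is exactly the situation that produces the false positives.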
Thanks,
Nico
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l