Ok, B it is. I'll add another entry to updaters.inc when I get home, and start by converting all uses of getText in User.php to getDBkey. Once the actual title work is in place, we can track down all the places that use a displayable version of the name and make them use the display name instead.
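For the updaters.inc side, something along these lines is what I have in mind. This is just a sketch under my own assumptions; the function name and exact SQL are illustrative, not the final patch:

<?php
// Hypothetical updater for the change discussed above: rewrite user_name
// into the DB-key form (underscores instead of spaces) so it matches how
// titles are stored. Illustrative only, not the actual patch.
function do_user_name_underscore_update() {
	$dbw = wfGetDB( DB_MASTER );
	$user = $dbw->tableName( 'user' );
	$dbw->query(
		"UPDATE $user SET user_name = REPLACE(user_name, ' ', '_')",
		__METHOD__
	);
	echo "...user_name converted to underscore form.\n";
}

User.php would then consistently compare against the underscore form, and the underscore-to-space conversion for display would live in one User method, as Simetrical suggests below.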
On another note, and I suppose this is my official statement on this part: I intend to create a new class for the normalization of titles, the TitleNormalizer class.
It acts as an instance; its primary purpose is its normalize function. It's constructed with a default set of sequence groups and sequence passes. A few notes on that:
- Because it goes through things sequentially, it has a nicely defined order; to add another sequence inside of an area, a new group can even be inserted to group sequences of another type.
- The reason the normalizer is used as an instance, and not statically, is extensibility. There may be cases where just defining an extra sequence or two, or removing some, won't be enough to make the change you want. To facilitate larger alterations to normalization, someone can subclass TitleNormalizer with a new class which includes their major normalizations, and use a hook (probably 'TitleNormalizerClass' or 'TitleNormalizerClassname') to have MediaWiki instantiate a different class.
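To make that shape concrete, here is a minimal sketch of what I mean; all class, method, pass, and hook names here are illustrative placeholders, not a final design:

<?php
// Minimal sketch only; pass names, grouping, and the hook name are
// placeholders, not the final design.
class TitleNormalizer {
	/** @var array Ordered groups of normalization passes (callables). */
	protected $sequences;

	public function __construct() {
		// Default sequence groups; subclasses or callers could insert a new
		// group between existing ones to keep related passes together.
		$this->sequences = array(
			'prefix' => array( array( $this, 'underscoresToSpaces' ) ),
			'title'  => array( array( $this, 'trimWhitespace' ) ),
		);
	}

	public function normalize( $text ) {
		foreach ( $this->sequences as $group ) {
			foreach ( $group as $pass ) {
				$text = call_user_func( $pass, $text );
			}
		}
		return $text;
	}

	public function underscoresToSpaces( $text ) {
		return strtr( $text, '_', ' ' );
	}

	public function trimWhitespace( $text ) {
		return trim( $text );
	}
}

// A subclass making larger changes could be selected through a hook such as
// 'TitleNormalizerClass' (name undecided), so MediaWiki instantiates it
// instead of the base class:
//   $class = 'TitleNormalizer';
//   wfRunHooks( 'TitleNormalizerClass', array( &$class ) );
//   $normalizer = new $class();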
Another important note: currently secureAndSplit includes the trimming of whitespace as part of its task before splitting the interwiki and namespace out. For various reasons I will be changing that order. Nothing will be trimmed from the title before those are split out; the prefix splitter will be responsible for temporarily trimming whitespace and other stuff out of the split text while trying to work out what the prefix is. The actual trimming of whitespace will only happen after that, and also only after the fragment is extracted too, when we know we are working on the title portion only.

The current set of passes is actually quite hacky: it basically trims whitespace, splits the interwiki, re-trims whitespace, splits the fragment, then re-trims whitespace again just to make sure the actual title gets its whitespace trimmed. All three of those trims are meant for the title, not the prefix or fragment, because as far as I know the regex used to grab the prefix is specifically coded to ignore extra whitespace in the namespace/interwiki in the first place. Actually, on that note, it doesn't look like there is much reason to use the regex at all. So to cut down on that, I'm going to try using normal string functions to pull out the prefixes and trim them off; a strpos, substr, and trim together are much quicker than a full-blown regex pattern match.
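For illustration, the sort of thing I mean looks roughly like the following (a sketch, not the code I'll actually commit):

<?php
// Rough sketch of pulling a prefix off with plain string functions instead
// of a regex. Only the candidate prefix is trimmed here; the remaining text
// is left alone so the title-only trimming can happen later.
function splitPrefix( $text ) {
	$pos = strpos( $text, ':' );
	if ( $pos === false ) {
		return array( null, $text ); // no interwiki/namespace prefix at all
	}
	$prefix = trim( substr( $text, 0, $pos ), " _\t\n" );
	$rest   = substr( $text, $pos + 1 );
	return array( $prefix, $rest );
}

list( $prefix, $rest ) = splitPrefix( ' Talk _: Foo bar ' );
// $prefix === 'Talk', $rest === ' Foo bar ' (still untrimmed at this stage)

The real splitter would of course check the trimmed candidate against the namespace and interwiki tables before committing to the split; if it doesn't match, the colon just belongs to the title.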
~Daniel Friesen (Dantman) of:
-The Gaiapedia (http://gaia.wikia.com)
-Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
-and Wiki-Tools.com (http://wiki-tools.com)
Simetrical wrote:
> On Thu, Mar 6, 2008 at 2:43 AM, DanTMan dan_the_man@telus.net wrote:
>> So, we have two options: A) Hack up User.php to use getDBkey and replace _'s with spaces instead of getText.
>
> In particular, of course, using some nice User method that hides the ugly conversion in one place.
>
>> B) Make use of getDBkey for identification of the user and have the update script refactor the users table to use underscores like it should instead of spaces.
>
> The idea of having separate normalized/display names makes as much sense for users as for titles, certainly. This seems like the more logical option. It's not like we aren't going to have to be rebuilding and repopulating the page table to do this anyway, so why not the user table too?