Most people have probably forgotten about it by now, but back at the start of March[1] there was a discussion about a partial rewrite of the Title system (though I use "rewrite" too heavily when I'm talking about a big backend project, even when 90% of the system stays the same). It was branched off the "[Wikitech-l] Case insensitive links (not just titles)." thread.
The branch failed twice back when I was working on it in March. Primarily I ran into the twilight zone known as branching and merging using SVN on Windows... eugh, and the branch always got messed up and was unmaintained.
Now, I've got my own laptop running Ubuntu, and branching has gotten a fair bit easier.
I've restarted the titlerewrite branch again. Anyone interested can check it out and discuss the project as well. Just to recap for those who didn't see the old discussions, as I remember there were 3 goals:
A) Make MediaWiki understand the concept of titles like "iPod". By "understand", I mean [[IPod]] and [[iPod]] are still the same page, and they still show up as IPod in the URL and database. However, MediaWiki also has "iPod" stored so it knows what the actual styled format of the title is. Basically this is handled by adding a column to the database. The difference between this and {{DISPLAYTITLE:...}} is that DISPLAYTITLE is a parser hack, while this is an actual integrated part of the Title backend. If you rename IPod to iPod on Wikipedia with this rewrite, then Special:Allpages and everywhere else will display "iPod" instead of IPod like it currently does.
B) Replace Title::secureAndSplit with an extensible Normalizer. There are a lot of requests in Bugzilla and the lists here for features like full case-insensitive titles, per-namespace case sensitivity, changing MW's _ handling to use - instead, and so on... Some of these can be handled fine by adding new configuration to MediaWiki. However, there is a point where it's unreasonable to add new configuration for every minor change to titles that people might want. The idea behind the extensible Normalizer is that extensions or other things would be able to modify the normalization process for titles.
C) Repurpose and migrate the DISPLAYTITLE hack into its own system. Currently the DISPLAYTITLE parser function is merely used to change things like IPod into iPod. While the introduction of A) would make that use of DISPLAYTITLE no longer necessary, people do make use of DISPLAYTITLE with restrictions turned off, or the old {{Title}} hack, to further alter the title. Sometimes a wiki has a valid reason for wanting things like italics or superscript inside a small part of the title.
Repurposing DISPLAYTITLE would mean changing it from something meant to tweak the page title and the HTML header, to just the HTML header with different levels of restriction. In extension to this, the plan was to move DISPLAYTITLE partially out of its limited area inside of the parser cache, and make it a separate extensible system. While it is nice to allow people to do things like changing "Lisp (programming language)" to "LISP" when restrictions are lowered, and using things like "NR-N99 /Persuader/-class droid enforcer" (note the italics), it's even better if we can give extensions the ability to tweak the display title that way. Think of cases like a MiniWiki extension running on a wiki that hosts multiple small wikis in it, like Wikia's Scratchpad wiki. It would be possible for the extension to modify the title, and perhaps even link in the root of the mini wiki.
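To make goal B a little more concrete, here is a minimal sketch of what an extensible normalizer could look like. It is written in Python purely for illustration, it is not the code in the branch, and every name in it (TitleNormalizer, register, to_key, the two default steps) is hypothetical. The assumption is simply that normalization is a chain of small steps and that an extension may add its own.

```python
from typing import Callable, List

# Hypothetical names throughout: this is an illustration, not MediaWiki's API.
NormalizerStep = Callable[[str], str]


def spaces_to_underscores(text: str) -> str:
    """Trim the text and collapse runs of spaces/underscores into one underscore."""
    return "_".join(text.replace("_", " ").split())


def uppercase_first(text: str) -> str:
    """Uppercase only the first character, leaving the rest untouched."""
    return text[:1].upper() + text[1:]


class TitleNormalizer:
    """Turns user-entered title text into the key form via a chain of steps."""

    def __init__(self) -> None:
        # Assumed defaults, roughly matching the behaviour described above.
        self.steps: List[NormalizerStep] = [spaces_to_underscores, uppercase_first]

    def register(self, step: NormalizerStep) -> None:
        """Let an extension append its own normalization step."""
        self.steps.append(step)

    def to_key(self, text: str) -> str:
        """Run every registered step, in order, over the text."""
        for step in self.steps:
            text = step(text)
        return text


default = TitleNormalizer()
print(default.to_key("iPod"))           # IPod
print(default.to_key("  foo   bar "))   # Foo_bar

# An "extension" asking for fully case-insensitive titles.
case_insensitive = TitleNormalizer()
case_insensitive.register(str.lower)
print(case_insensitive.to_key("IPOD nano"))  # ipod_nano
```

The point of the design is that a wish like "fully case-insensitive titles" becomes one registered step rather than another configuration variable in core.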
For anyone who's taken a look at the old branch, there is one thing to note. In the old schema the plan was to use a page_title_ui column, but this time around I'm going for a rev_title_ui column. This, among other things, will allow versioning of that title, and will also make deletion and undeletion a little cleaner.
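As a rough illustration of why a per-revision column gives versioning for free, here is a small Python sketch. Only the rev_title_ui column name comes from the branch; the Revision class, the page_key field and the helper functions are hypothetical, and the fallback rule (an empty column means "key with underscores turned into spaces") is an assumption about what default normalization would produce.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Revision:
    rev_id: int
    page_key: str                 # normalized key, e.g. "IPod"
    rev_title_ui: Optional[str]   # None models a NULL rev_title_ui column


def default_ui_text(page_key: str) -> str:
    """Assumed default: show the key with underscores as spaces."""
    return page_key.replace("_", " ")


def ui_title(rev: Revision) -> str:
    """Title to display for this particular revision."""
    if rev.rev_title_ui is None:
        return default_ui_text(rev.page_key)
    return rev.rev_title_ui


# The page was created as "IPod" and later restyled to "iPod".
history = [
    Revision(rev_id=1, page_key="IPod", rev_title_ui=None),
    Revision(rev_id=2, page_key="IPod", rev_title_ui="iPod"),
]

for rev in history:
    print(rev.rev_id, ui_title(rev))
# 1 IPod
# 2 iPod
```

Because the styled title travels with the revision, viewing an old revision shows the title as it was styled at that time, and rows that were never touched need no populating.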
As for the current branch, here's a screenshot or two: http://d.imagehost.org/view/0511/Screenshottitle_Development_Environment_tit... http://d.imagehost.org/view/0768/Screenshot-All_pages_Development_Environmen...
You can notice the title in the first, that is not done with Displaytitle. And in the latter you can see the difference between [[Title]] which has a rev_title_ui of "_title_" and [[Displaytitle]] which uses {{DISPLAYTITLE:_displaytitle_}}.
I didn't pull up any more recent screenshots, but User: pages are now affecting how a user's username is displayed around the site (including history and rc pages after a little tweak, and in the skin as well), and viewing old revisions of pages respects what the ui title was during that revision.
[1] http://lists.wikimedia.org/pipermail/wikitech-l/2008-March/thread.html
That sounds very nice! How far away is this from being ready for the MediaWiki trunk?
It's likely going to be a longterm project that won't be stable enough for general use right away.
^_^ If you're interested in the long story read on... I have a bad habit of writing too much, and the first 4 paragraphs and the first 2 sentences, and 3 words of the following text were written before I realized I should probably have just written a short reply... for completeness I continued to write after this, rotfl... ----
When I originally planned it I knew it would be a more long term project. There is a huge pile of code in MediaWiki that depends on the assumption that _ and space always map to each other, and code does some fairly ugly things as a result.
Back when I originally started the project, I ran into what would have become an insanely problematic bug if I hadn't caught it. In MediaWiki a user's username is stored inside the database using spaces instead of underscores like the titles are. Unfortunately, when you change getText's definition from "dbkey converted to spaces" to "the actual format of the title to display", you run into a bit of a problem. A fair bit of core hinges on the comparison $title->getText() == $user->getName(). So all of a sudden, if you go and change "User:Dantman" to "User:dantman", then "dantman" no longer equals "Dantman", which is what's in the db, and voila, I do not /exist/ and cannot log in. One of the resulting things that I had to do when /rewriting/ the title backend was to also /rewrite/ the user backend. Because I'm using a rev_title_ui, I no longer need to worry about populating the page_title_ui with data; now I just default to NULL and use default normalization. However, I still need to solve the username issue. Basically this means I've got to change the backend storage of usernames from using spaces to using underscores. Quite simply, I need to run a REPLACE over every row in the 6 different columns in the database that store the username.
So obviously this will also entail a schema change that'll hold it back a bit longer, as Wikimedia will need to think of some way to apply it. Now that I think about it, I might add in a compatibility option, one where the User class converts spaces in the username it extracts into underscores on the PHP side, and the job queue slowly does the migration in the backend. Then large wikis could enable that option and let the change happen slowly in the background, and when it's complete (don't know how that'll be calculated yet) they can disable the option. Of course, that option won't be something enabled by default, as the assumption it makes would interfere with any extension to the normalizer.
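The compatibility option could be pictured roughly like this. It is a Python sketch under stated assumptions, not anything in the branch: the function names, the dict-shaped rows and the single user_name field are all hypothetical, and the completion check is just one naive possibility, since how completion would really be calculated is still undecided.

```python
from typing import Dict, List

Row = Dict[str, str]  # hypothetical stand-in for one database row


def username_key(stored: str, compat: bool = True) -> str:
    """Key form of a stored username.

    With the compatibility option on, legacy rows that still contain
    spaces are converted on read, so callers always see underscores.
    """
    return stored.replace(" ", "_") if compat else stored


def migrate_batch(rows: List[Row], batch_size: int) -> int:
    """Rewrite up to batch_size legacy rows in place; return how many changed."""
    changed = 0
    for row in rows:
        if changed >= batch_size:
            break
        if " " in row["user_name"]:
            row["user_name"] = row["user_name"].replace(" ", "_")
            changed += 1
    return changed


def migration_complete(rows: List[Row]) -> bool:
    """Naive completion check (an assumption): no stored name has a space left."""
    return not any(" " in row["user_name"] for row in rows)


rows = [
    {"user_name": "Remember the dot"},
    {"user_name": "Dantman"},
    {"user_name": "Some User"},
]

# Reads behave the same before and after a row has been migrated.
print(username_key(rows[0]["user_name"]))  # Remember_the_dot

# Each job handles a small batch, so the table is never rewritten in one go.
while not migration_complete(rows):
    migrate_batch(rows, batch_size=1)

print([row["user_name"] for row in rows])
```

Once the check passes, the wiki can switch the option off and read the stored value directly.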
Beyond that, there is a fair bit of UI code that needs to be fixed, even simply because of the username change. User::getName() has been used all over the MediaWiki codebase; now that its meaning has changed, I need to migrate some calls from User::getName() to what for now I've dubbed User::getNameText(). I opted for using User::getName() as the key form and making it the same as User::getTitleKey(), because that's probably better for all the code lying around that uses it for comparing equality of users. Of course, there are a fair number of places where a $title->getText() == $user->getName() needs to be changed to use $title->getDBkey().
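The comparison problem and its fix can be shown with a toy model. This is a Python sketch, not MediaWiki code: the classes only mimic the method names mentioned above (get_text, get_db_key, get_name, get_name_text), and is_own_user_page is a hypothetical helper.

```python
from typing import Optional


def default_text(key: str) -> str:
    """Assumed default display form: underscores shown as spaces."""
    return key.replace("_", " ")


class Title:
    def __init__(self, db_key: str, ui_text: Optional[str] = None) -> None:
        self.db_key = db_key
        self.ui_text = ui_text

    def get_db_key(self) -> str:
        return self.db_key

    def get_text(self) -> str:
        # After the rewrite this is the styled title, not "key with spaces".
        return self.ui_text if self.ui_text is not None else default_text(self.db_key)


class User:
    def __init__(self, name_key: str, ui_text: Optional[str] = None) -> None:
        self.name_key = name_key
        self.ui_text = ui_text

    def get_name(self) -> str:
        """Key form: stable, and safe for equality checks."""
        return self.name_key

    def get_name_text(self) -> str:
        """Display form: for UI output only."""
        return self.ui_text if self.ui_text is not None else default_text(self.name_key)


def is_own_user_page(title: Title, user: User) -> bool:
    """Compare key against key, never display text against key."""
    return title.get_db_key() == user.get_name()


# "User:Dantman" has been restyled to "User:dantman".
title = Title("Dantman", ui_text="dantman")
user = User("Dantman", ui_text="dantman")

print(title.get_text() == user.get_name())  # False: the old style of check breaks
print(is_own_user_page(title, user))        # True
print(user.get_name_text())                 # dantman
```

Keeping get_name as the key form means existing equality checks between users keep working, and only display code has to move to the new text accessor.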
That being said, I think I already hunted down and fixed all those cases inside of core. However, I ran into a bit of an issue. Apparently log pages, or rather history and other pagers, use the user_text fields verbatim; as a result I had to tweak some of the stuff in the linker to display proper usernames on those pages. It is quite likely that there is still residual stuff lying around the codebase where usernames won't display properly. I actually grazed over the blocking interface code when I was tracking down getName calls with regexxer, and I have a feeling that the blocking interface is displaying usernames a little incorrectly. I can't remember clearly, but I think I at least fixed the potential issue of renaming "User:Vandal" to "User:vandal" suddenly bypassing every block you could possibly make on them. ^_^ I don't think WP's admins would be very happy if suddenly thousands of vandals started roaming free because they found out they could unblock their other accounts by moving their userpage, rotfl... Though I have a feeling that at some point I'm going to need to profile this and see whether the extra calls for finding out the proper username to display in the linker are going to require some optimization.
^_^ I have a new static method Title::newFromUIText(); please heed the documentation on the function and NEVER USE IT! rotfl... Title::newFromUIText is only meant for use by something that does movepage-like things. It basically overrides the ui text, whereas Title::newFromText normalizes the input and queries for what the actual ui text is when you request it. Because of its purpose, Title::newFromUIText does not use the link cache, and isn't mass-title performance friendly like Title::newFromText. I did this because if it used the link cache, it would be possible to pollute that cache with bad ui texts.
I think I shall stop trying to think about more to write at this point... At least before I make a mail too long to actually submit to wikitech-l, heh...
~Daniel Friesen (Dantman, Nadir-Seen-Fire) ~Profile/Portfolio: http://nadir-seen-fire.com -The Nadir-Point Group (http://nadir-point.com) --It's Wiki-Tools subgroup (http://wiki-tools.com) --The ElectronicMe project (http://electronic-me.org) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) --Animepedia (http://anime.wikia.com) --Narutopedia (http://naruto.wikia.com)
Remember the dot wrote:
That sounds very nice! How far away is this from being ready for the MediaWiki trunk?