Case insensitive links (not just titles).


subscribe＠divog.com.ru

28 Feb 2008

12:30 p.m.

Sorry for my English :)

What I need is case-insensitive titles. My solution to that problem was to change the MySQL collation of the <page_title> field in the <page> table from <utf8_bin> to <utf8_general_ci>.

But a bigger problem with links persists. In my case, if there is an article <Frank Dreben>, the link [[Frank Dreben]] is treated as a link to an existing article (GoodLink), but the link [[frank dreben]] is treated as a link to a non-existent article, so it opens the edit page of the existing article <Frank Dreben>. What can be changed so that the link [[frank dreben]] is treated as a GoodLink?

I've spent some time in Parser.php, LinkCache.php, Title.php, Linker.php and LinkBatch.php but found nothing useful. The last thing I tried was calling strtoupper on the title every time the link cache array is filled, in LinkCache.php. I also tried calling strtoupper on the title every time data is fetched from the array.

I've tried to make the titles in the cache case-insensitive, but it didn't work out, and I'm not sure why: it seems that when links are constructed (parser, title, linker, etc.) only LinkCache methods are used.

Could anybody point me in a direction to dig in? :)
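
One possible reason an attempt like the strtoupper one has no visible effect is that the normalization has to be applied identically on every write to and every read from the cache; a single code path that skips it leaves the lookup case-sensitive. The following is a minimal Python sketch of that idea, not MediaWiki code: the `LinkCache` class is a toy stand-in, and `_key`, `add_good_link` and `get_good_link_id` are hypothetical names chosen for illustration.

```python
class LinkCache:
    """Toy stand-in for a link cache (not MediaWiki's LinkCache)."""

    def __init__(self):
        self._good = {}  # normalized title -> page id

    @staticmethod
    def _key(title):
        # Hypothetical normalization: spaces become underscores, then case-fold.
        return title.replace(" ", "_").casefold()

    def add_good_link(self, page_id, title):
        # Every write goes through _key() ...
        self._good[self._key(title)] = page_id

    def get_good_link_id(self, title):
        # ... and so does every read. 0 stands for "no such page" here.
        return self._good.get(self._key(title), 0)


cache = LinkCache()
cache.add_good_link(42, "Frank Dreben")
print(cache.get_good_link_id("frank dreben"))  # found despite the different case
```

Funnelling both paths through one private helper is the design choice that matters; any caller that builds its own key bypasses it.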


DanTMan

28 Feb

6:43 p.m.

From my understanding Title::secureAndSplit(); is the only place where anything to do with case-sensitivity of Titles is located.

^_^ If you poke me in the right way you could probably get me to hunt down everything and create a patch to MediaWiki trunk which would introduce two new features:

* Extend the global variable to allow for the options [full case-sensitivity/full case-insensitivity/first-letter-only case-insensitivity] while maintaining legacy support for previous configurations.
* Add a new hook, 'TitleCaseMods' or perhaps 'TitleSecureAndSplit' (^_^ or actually someone else should probably give me a good name for it, depending on where it is put), which would allow extensions to alter how titles are treated.

This would allow for the creation of extensions permitting things like the per-namespace case sensitivity which one group of wikis was asking for in Bugzilla at one point.
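
The two proposed features can be sketched roughly as follows. This is an illustrative Python model under stated assumptions, not a patch: `CaseMode`, `mode_from_setting`, `title_hooks` and `normalize_title` are hypothetical names, the legacy setting is assumed to be a boolean, and namespace 100 is an arbitrary example number.

```python
from enum import Enum


class CaseMode(Enum):
    SENSITIVE = "sensitive"
    FIRST_LETTER = "first-letter"
    INSENSITIVE = "insensitive"


def mode_from_setting(value):
    # Assumption: the legacy setting is a boolean and keeps its old meaning.
    if value is True:
        return CaseMode.FIRST_LETTER
    if value is False:
        return CaseMode.SENSITIVE
    return CaseMode(value)  # new-style string value


# Hypothetical hook registry: each hook takes (namespace, text) and returns
# the rewritten text, or None to leave the decision to the next handler.
title_hooks = []


def normalize_title(text, namespace=0, mode=CaseMode.FIRST_LETTER):
    for hook in title_hooks:
        result = hook(namespace, text)
        if result is not None:
            return result
    if mode is CaseMode.INSENSITIVE:
        return text.casefold()
    if mode is CaseMode.FIRST_LETTER:
        return text[:1].upper() + text[1:]
    return text


def glossary_is_case_insensitive(namespace, text):
    # Example extension: only namespace 100 is fully case-insensitive.
    return text.casefold() if namespace == 100 else None


title_hooks.append(glossary_is_case_insensitive)
print(normalize_title("frank Dreben"))
print(normalize_title("Frank Dreben", namespace=100))
```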

~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)


Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Simetrical

7:22 p.m.

On Thu, Feb 28, 2008 at 7:43 PM, DanTMan dan_the_man@telus.net wrote:

...

From my understanding Title::secureAndSplit(); is the only place where anything to do with case-sensitivity of Titles is located.

Explicitly, yeah, but any associative array using title strings as keys will automatically be case-sensitive, just because array lookups (and string comparisons generally) are case-sensitive. I have no idea how many of those there are scattered about.

I really want some robust and generic normalization mechanism. Instead of distinguishing between display titles and DB keys (which is pointless: as though we can't store spaces in the database?), distinguish between display titles and normalized titles. Normalized titles would be stored separately in the database and used for lookup and uniqueness checking, as well as in URLs, and are formed by applying a canonical function to the display title. Then titles can have underscores in them, for instance, in the default configuration (just they'd be normalized to underscores), and someone who wanted to muck around a bit could use all sorts of weird conventions if they liked just by changing the normalization function and rebuilding the page table.
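
A rough Python sketch of the scheme described above, under stated assumptions: `PageTable`, `default_normalize` and `case_insensitive_normalize` are hypothetical names, and an in-memory dict stands in for the database. The display title is stored as entered, while the normalized title is the only thing used for lookup and uniqueness checking.

```python
class PageTable:
    """In-memory stand-in for a page table keyed by normalized title."""

    def __init__(self, normalize):
        self._normalize = normalize
        self._rows = {}  # normalized title -> display title

    def create(self, display_title):
        key = self._normalize(display_title)
        if key in self._rows:
            raise ValueError(
                f"{display_title!r} collides with {self._rows[key]!r}"
            )
        self._rows[key] = display_title
        return key

    def lookup(self, title):
        # Returns the stored display title, or None if there is no such page.
        return self._rows.get(self._normalize(title))

    def rebuild(self, normalize):
        # Swap the normalization function and recompute every key.
        rebuilt = PageTable(normalize)
        for display_title in self._rows.values():
            rebuilt.create(display_title)
        return rebuilt


def default_normalize(title):
    # Runs of spaces/underscores collapse to one underscore; first letter upper.
    key = "_".join(title.replace("_", " ").split())
    return key[:1].upper() + key[1:]


def case_insensitive_normalize(title):
    return default_normalize(title).casefold()


table = PageTable(default_normalize)
table.create("Frank Dreben")
print(table.lookup("frank dreben"))      # not found with the default function
relaxed = table.rebuild(case_insensitive_normalize)
print(relaxed.lookup("frank dreben"))    # found after the rebuild
```

Note that a rebuild can fail if two existing pages collapse to the same key under the new function, which is one reason changing the function is not free.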

DanTMan

29 Feb

6:06 p.m.

^_^ Complete Title backend rewrite!? An Image backend rewrite is being worked on, so why not start one for the Title class as a separate project? We could compile a list of useful features people want in the Title system that we currently don't have, and come up with the best way to deal with titles.

However, I'm not a fan of storing both a normalized underscore version of the title and an un-normalized space version of the title. I'm thinking display title for display, and normalized title for all the handling and other things. I think having the {{DISPLAYTITLE:}} function store the display title inside the page table would be best. And if we made the normalized version depend on the display title, then it wouldn't be possible for someone to remove the requirement that the display title needs to normalize to the actual title. Some wikis would like to have that requirement gone, and have a subtitle added when the two don't match.

So DISPLAYTITLE and PAGETITLE stored in the database, I would think. Or we could actually do a triple. We could decide what would be best after considering all the possible features people might want to add to the title system, and consider various hooks which would allow people to create Title-modifying extensions without hacking core.


Simetrical

1 Mar

6:47 p.m.

On Fri, Feb 29, 2008 at 7:06 PM, DanTMan dan_the_man@telus.net wrote:

...

However, I'm not a fan of storing both a normalized underscore version of the title, and a un-normalized space version of the title. I'm thinking display title for display, and normalized title for all the handling and other things. I think having the {{DISPLAYTITLE:}} function store the display title inside of the page table would be best. And if we made the normalized version depend on the display title then it wouldn't be possible for someone to remove the requirement that the displaytitle needs to normalize to the actual title. Some wiki would like to have that not there, and have a subtitle added when they don't match.

First of all, DISPLAYTITLE is a hack that should be removed in favor of just using the move function, if this gets implemented and that becomes possible. (Thanks to Rob, it's a much better hack than what we used to have, but it's still a hack.) The interface for adding it makes no sense -- to change the title you should move the page. Having your perfectly sensible new page name be mangled in terms of capitalization and '_' => ' ' is unintuitive, and DISPLAYTITLE is not discoverable as a mechanism for evading it. It should Just Work when you create a page with an underscore in its name.

Its implementation is also horribly incomplete. *Everything* in the user interface should know about the display title, and use it. Because it's currently stored in the page text, nothing knows about it except when the page itself is actually being displayed. The display title *has* to be stored in its own normalized database field for arbitrary parts of code to have access to it.

As for wikis that want the normalized title displayed in a subtitle or something, that's something an extension can implement using hooks as an entirely separate mechanism. It's not relevant to this discussion, IMO, especially if no one has any examples.

On Sat, Mar 1, 2008 at 5:42 AM, subscribe@divog.com.ru wrote:

...

Is there many of them - such things? The only one I found was LinkCache class. Parser, Linker, Title use only methods of LinkCache, when it's about Good|BadLinks. Maybe there are no other cases of use title string as keys of associative array?

It could be. But the general principle is, everyone's assumed titles are case-sensitive until now, so you're probably going to find lots of random places where that assumption is built in in various ways. Hopefully not an unmanageably large number, but probably more than just one or two.

DanTMan

9:49 p.m.

Well, if you've checked any number of active wikis, you're likely to run into the {{Title}} hack. Last I checked, wikis like Wookieepedia and Uncyclopedia, which are second only to the Wikimedia wikis in size, have been using it for ages. And there are a few Bugzilla entries asking for the functionality too. So it's not something void of examples, use, or demand:

http://starwars.wikia.com/wiki/Star_Wars_Episode_III:_Revenge_of_the_Sith
http://starwars.wikia.com/wiki/NR-N99_Persuader-class_droid_enforcer
http://starwars.wikia.com/wiki/Acclamator_I-class_assault_ship
http://uncyclopedia.org/wiki/Communism
http://uncyclopedia.org/wiki/Game:Zork/knife
http://uncyclopedia.org/wiki/Death
https://bugzilla.wikimedia.org/show_bug.cgi?id=12998

I can go for allowing MediaWiki to handle case, space/underscore, and extra padding issues natively in a title rewrite (extra padding as in titles like _Summer, which have valid uses: http://en.wikipedia.org/wiki/Underbar_Summer), and having an extension handle the extra cases like WikiMarkup in titles (italics, bolding, and class/styling of titles), stripping ()'s, allowing # for display, and other off uses which would require the use of a subtitle. However, to reduce the complaints and negative comments, perhaps we should actually build that extension alongside a proper title rewrite as a Proof of Point that it can be done without making it an absolute hack like it is now. Also, it would let us compile a full list of all the possible and already desired features for Titles, and then dictate which ones MediaWiki should support natively, and which ones should only be allowed with an installed extension. Keep the code clean, but give the public the features they want.

Btw, DISPLAYTITLE did previously allow for off titles and did add the subtitle. Some wikis were actually making use of that as a feature a while back and complained when it was /Fixed/ to never allow that whatsoever, without even letting people allow it using a config variable.

On a similar note, there's another feature which is used in some cases: http://www.mediawiki.org/wiki/Extension:Ascii_Translit That idea of allowing extensions to change the normalization process would void out the use of that extension, and allow for that kind of functionality without making it a hack, or needing to use redirects or double pages.


Simetrical

10:35 p.m.

On Sat, Mar 1, 2008 at 10:49 PM, DanTMan dan_the_man@telus.net wrote:

...

Well, if you've checked any number of active wiki, you're likely to run into the {{Title}} hack. Last I checked wiki like Wookiepedia and Uncyclopedia which are only second to the Wikimedia wiki in size have been using it for ages.

What is that, a JavaScript hack? Looks to be. This won't interfere with it.

...

However, to reduce the complaints and negative comments. Perhaps we should actually build that extension along-side a proper title rewrite as a Proof of Point, that it can be done without making it an absolute hack like it is.

I want to improve a certain class of functionality in certain ways. You want to improve it even more. That's fine, but it's not what I'm focusing on right now. I'm not as interested in the further improvements you propose, and I don't see why you would think they should be a requirement for implementing the smaller set of improvements I suggested.

...

On a similar note, there's another feature which is used in some cases: http://www.mediawiki.org/wiki/Extension:Ascii_Translit That idea of allowing extensions to change the normalization process would void out the use of that extension, and allow for that kind of functionality without making it a hack, or needing to use redirects or double pages.

That would be an immediate application for a custom normalization function, yes, in the setup I envision. Not that I think anyone will do it anytime soon.

DanTMan

10:51 p.m.

The smaller set of improvements you propose will likely require a large amount of change to the MediaWiki code. Which is more sane?

* Editing a large amount of code to make small changes, and then finding out that further improvements can't be made without hacks and needing to edit a large amount of code again.
* One group editing a large amount of code to make small changes, at the same time that another group decides to do something similar yet incompatible with the other that extends the functionality in another way.
* Or one group editing a large amount of code to make small changes at the same time as opening up the ability to improve that further without the use of hacks.

I noted the Title hack because as it is, both the css and js versions are complete hacks, the DISPLAYTITLE function was created to try and stop people from using those hacks by giving functionality for it inside of MediaWiki itself. However as you see, people are still using the Title hack and haven't stopped using it despite the fact that DISPLAYTITLE exists, that shows that there is something left to be desired in the current implementation before people are going to stop using ugly hacks on common wiki.


DanTMan

11:40 p.m.

What I'm saying is this:

Titles supporting various bits of text and other such stuff is not a small thing; it's a feature and issue which has been around for a while and is something that a large number of people are involved in. What one developer thinks is a way to solve the issue may not be what others think, and it may not even be the best way.

What I'm saying is, that with something with as large an involvement as this, rather than one dev making a small change to how things worked, we should get input from many of those who are involved on what is needed, and what is the best way to go about it all.

And I'm not saying that adding the extension functionality is something for you to do in addition. I'm saying that this could be best done as multiple people working on different parts at the same time, and making sure that the different parts are compatible with each other and work cleanly instead of someone making a big hack later. (Isn't changing a small bit of functionality at one point and a hack needing to be created later the whole reason we got into this whole big DISPLAYTITLE mess in the first place? Repeating the past isn't good.) I'm even fine with being the one to do the extension stuff, while working with you to make sure both our changes work together rather than breaking each other, or locking the other's features out and forcing people to pick between them.

Next, I'm not saying that both things coincide. In fact, we've been talking in the notion that there are two types of titles, while ignoring what's really there. There are three types of titles.

* Title key - keeps the complete normalized form. Used for uniqueness checking, finding things, and such.
* Real title - keeps information on what the real padding, case, and characters are actually inside of the title. Used in clean display of the title, and this is what is normalized to create the title key.
* Display title - this is what we actually display to the user. Rather than a bunch of technical limitations, the point is to make the display suit the reader's eyes and deliver a name in an understandable way. This may or may not be completely unique, and if it doesn't normalize to the title key like the real title does, then some notification should be added to make sure that bad links aren't created. In fact, rather than just "Link with: Foo", we could output something like "Link with: [[Foo|'''F'''oo]]" which considers limited parts of the displaytitle (only italic and bold should be considered if markup is allowed) as well as the real title to create proper links that can actually be used in the best manner.

The key and display title I've been talking about is the key and display to the user, what you've been talking about is actually the key and the real title of the article. We should be considering key (backend use), real (inline display), and display (title header display) rather than just two of the three.
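
The three-way split can be modelled roughly as below. This is only an illustrative Python sketch: `PageTitle`, `make_key` and `link_hint` are hypothetical names, and the normalization shown (unify spaces and underscores, fold case) is an assumption made for the example, not something the thread settled on.

```python
from dataclasses import dataclass
from typing import Optional


def make_key(real_title):
    # Hypothetical normalization: unify spaces/underscores and fold case.
    return "_".join(real_title.replace("_", " ").split()).casefold()


@dataclass(frozen=True)
class PageTitle:
    real: str                      # real title: exact case, padding, characters
    display: Optional[str] = None  # optional display title for the page header

    @property
    def key(self):
        # Title key: derived from the real title, used by the backend.
        return make_key(self.real)

    @property
    def shown(self):
        return self.display if self.display is not None else self.real

    def display_matches_key(self):
        return make_key(self.shown) == self.key

    def link_hint(self):
        # When the display title would not resolve to this page,
        # suggest a piped link built from the real title instead.
        if self.display_matches_key():
            return f"[[{self.shown}]]"
        return f"[[{self.real}|{self.shown}]]"


print(PageTitle("Foo", "'''F'''oo").link_hint())
print(PageTitle("iPod").key)
```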


Simetrical

2 Mar

9 a.m.

On Sat, Mar 1, 2008 at 11:51 PM, DanTMan dan_the_man@telus.net wrote:

...

Which is more sane?

* Editing a large amount of code to make small changes. And then ending up finding out that further improvements can't be made without hacks and needing to edit a large amount of code again.
* One group editing a large amount of code to make small changes, at the same time that another group decides to do something similar yet incompatible with the other that extends the functionality in another way.
* Or one group editing a large amount of code to make small changes at the same time as opening up the ability to improve that further without the use of hacks.

The third, which is why I never said the mechanism shouldn't be perfectly extensible. It should be.

On Sun, Mar 2, 2008 at 12:40 AM, DanTMan dan_the_man@telus.net wrote:

...

And I'm not saying that adding the extension functionality is something for you to do in addition. I'm saying that this could be best done as multiple people working on different parts at the same time, and making sure that the different parts are compatible with each other and work cleanly instead of someone making a big hack later (Isn't changing a small bit of functionality at one point and a hack needing to be created later the whole reason we got into this whole big DISPLAYTITLE mess in the first place? Repeating the past isn't good). I'm even fine with being the one to do the extension stuff, while working with you to make sure both our changes work together rather than breaking each other, or locking the others features out and limiting people to pick between.

I think you're assuming I'm actually going to do this. I doubt I am, for the foreseeable future. I don't have the time to do much serious hacking. I was just expressing a fond wish.

...

Next, I'm not saying that both things coincide. In fact, we've been talking in the notion that there are two types of titles, while ignoring what's really there. There are three types of titles.

* Title key - keeps the complete normalized form. Used for uniqueness checking, finding things, and such.
* Real title - keeps information on what the real padding, case, and characters are actually inside of the title. Used in clean display of the title and this is what is normalized to create the title key.
* Display title - this is what we actually display to the user, rather than a bunch of technical limitations, the point is to make the display suit the reader's eyes and deliver a name in an understandable means. This may or may not be completely unique, and if it doesn't normalize to the title key like the real title does, then some notification should be added to make sure that bad links aren't created. In fact, rather than just "Link with: Foo", we could output something like "Link with: [[Foo|'''F'''oo]]" which considers limited parts of the displaytitle (only italic and bold should be considered if markup is allowed) as well as the real title to create proper links that can actually be used in the best manner.

The second and third titles you name may or may not be required to coincide. Permitting them not to (i.e., allowing the display title not to normalize to the title key, and/or permitting odd things like HTML in the display title) raises its own set of difficulties that will require a lot more thought than the initial proposal, and go a lot further. And I don't think they should be in core.

But I think this discussion has gotten to the point where it may as well stop, unless someone says they're willing to write the code. Further argument over implementation details is probably not very productive without anyone seriously considering an implementation.

David Gerard

9:06 a.m.

On 02/03/2008, Simetrical Simetrical+wikilist@gmail.com wrote:

...

But I think this discussion has gotten to the point where it may as well stop, unless someone says they're willing to write the code. Further argument over implementation details is probably not very productive without anyone seriously considering an implementation.

If someone could write this thread up for mediawiki.org, that would be most helpful for others in the future. (When I get silly requests for our work wiki, I look on mediawiki.org first.)

- d.

DanTMan

8:28 p.m.

:/ oh, now you just poked the coding bone... http://uploads.screenshot-program.com/upl9489088627.png First note, NO there is no {{DISPLAYTITLE:}} inside that page.

Though this is just a partial change. Only some parts are done, and others missing.

Now onto the actual notes:

- It may be a compatibility break for some extensions which are doing things they aren't supposed to do and using $title->mTextform instead of $title->getText(), because when a title is now initialized with a DB key the Textform is left as a null string and is grabbed on the fly in getText when we need it. (This avoids excessive database queries.) (Yes, there is internal use of getText instead of mTextform to avoid using null titles where they shouldn't be used.)
- makeTitle now accepts a third optional parameter, the realtitle of a title. This way we can initialize both when we have them and it'll be there for when it's needed.
- equals now accepts a second optional parameter similar to the $valid parameter we use in User:: stuff. It defaults to 'key', but if you pass 'real' to it, it'll compare real titles instead. This is for use in page move interfaces so that we can move titles from, say, [[Main Page]] to [[Main_Page]].
- I do have the update and table SQL and other stuff already in to add the needed page_real field. (Hope no-one minds I used an AFTER page_title in the SQL patch to keep reading of the tables clean.)
- However, it's not yet added for use. The stuff you see in the demo is actually done by doing some mugging of mDbkey to initialize mTextform when getText is called.
- I don't have the normalization stuff inside yet. A lot of str_replaces are going to be replaced with the extendable normalization, and others removed because they're trying to back-convert where they shouldn't.
- Also, as you can see, the functions for subpage names and basepage names will need some tweaking to differentiate between the _ and _E forms, which should actually use the realtitle and titlekey forms respectively rather than just the textform.
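
The first three notes can be illustrated with a small Python model. It is a sketch only, not the PHP patch: the namespace argument is left out, `fetch_real` is a hypothetical callback standing in for the database query, and the fallback of turning underscores into spaces is an assumption about the interim behaviour described above.

```python
class Title:
    """Toy model; names only loosely follow the ones mentioned in the post."""

    def __init__(self, db_key, real_title=None, fetch_real=None):
        self.db_key = db_key
        self._real = real_title        # stays None until first needed
        self._fetch_real = fetch_real  # hypothetical stand-in for a DB query

    @classmethod
    def make_title(cls, db_key, real_title=None, fetch_real=None):
        # The optional real title lets callers fill in both forms up front.
        return cls(db_key, real_title, fetch_real)

    def get_text(self):
        # Lazily resolve the real title on first use, then cache it.
        if self._real is None:
            if self._fetch_real is not None:
                self._real = self._fetch_real(self.db_key)
            else:
                self._real = self.db_key.replace("_", " ")
        return self._real

    def equals(self, other, by="key"):
        if by == "key":
            return self.db_key == other.db_key
        if by == "real":
            return self.get_text() == other.get_text()
        raise ValueError(f"unknown comparison mode: {by!r}")


spaced = Title.make_title("Main_Page", "Main Page")
underscored = Title.make_title("Main_Page", "Main_Page")
print(spaced.equals(underscored))          # same key
print(spaced.equals(underscored, "real"))  # different real titles
```

Code that reads the private field directly instead of calling `get_text()` would see `None` here, which is the compatibility break the first note warns about.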

A note on DISPLAYTITLE: Yes, DISPLAYTITLE is a hack; however, it's widely used already. So I won't be dropping support in the rewrite, otherwise current uses will break. I'm going to come up with a maintenance script to populate the page_real fields, and another which will hunt down every page with a DISPLAYTITLE in it, move it to a proper title, and if possible try to remove the DISPLAYTITLE from it if the script is told to. (Though something like this can never be made to not leave cruft behind, so I'd suggest Wikimedia wikis do moves by hand rather than trying this automatically, especially since they use things like {{Lowercase}} rather than hardcoded displaytitles.)

Because DISPLAYTITLE already exists, rather than marking it as deprecated or to be removed I'll try to make it a little less hacky, and instead turn it into a function meant for extension of title displays into a third type of title only meant for display when viewing the page, not for other interface elements. Note that extension of title displays means a few things:

* Rather than DISPLAYTITLE doing everything, it's actually merely going to call another set of stuff meant for displaytitle stuff (meaning that extensions can change the displaytitle in the background without needing DISPLAYTITLE everywhere).
* The Displaytitle, unlike how it is currently done, will never show up inside the Pagetitle; the realtitle is what will show up in the pagetitle (so wikis will want to move current titles using DISPLAYTITLE to actual realtitles to have the current stuff inside the title show up).
* The purpose of a Displaytitle will not be for minor title things like iPod or _Summer, but will actually be meant for things like Foo #1, Lisp instead of Lisp (Programming language), Foo/Bar/, and Miniwiki or use of MediaWiki in an alternative way where they actually use a special title format and then modify how it looks, perhaps by using a directory > like > structure and linking previous portions of the title.

The current implementation does not allow for extensions to extend what a DISPLAYTITLE actually is. I'll make a proof of point extension or two for common use to test it out and satisfy a few people who are complaining about the new restrictions to DISPLAYTITLE.

Oh, off topic but... No-one probably noticed it because it isn't used anywhere inside of the code. But on Line 321 of includes/Title.php the definition for the Title::nameOf function is missing the "public static" that should be there. It's not used, but someone's going to get a big shock when they try and use the function that says it's static but they need an arbitrary instance to use it.

Well on topic... Could Brion or someone the like give me SVN Commit access and create a /branches/titlerewrite for this to be worked on in?

~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)

David Gerard wrote:

...

On 02/03/2008, Simetrical Simetrical+wikilist@gmail.com wrote:

...
But I think this discussion has gotten to the point where it may as well stop, unless someone says they're willing to write the code. Further argument over implementation details is probably not very productive without anyone seriously considering an implementation.

If someone could write this thread up for mediawiki.org, that would be most helpful for others in the future. (When I get silly requests for our work wiki, I look on mediawiki.org first.)

d.

Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Simetrical

9:04 p.m.

On Sun, Mar 2, 2008 at 9:28 PM, DanTMan dan_the_man@telus.net wrote:

...

Oh, off topic but... No-one probably noticed it because it isn't used anywhere inside of the code. But on Line 321 of includes/Title.php the definition for the Title::nameOf function is missing the "public static" that should be there. It's not used, but someone's going to get a big shock when they try and use the function that says it's static but they need an arbitrary instance to use it.

Well, actually PHP will just give an E_STRICT notice when you try to use a non-static method statically. :)

DanTMan

6 Mar 6 Mar

1:43 a.m.

:/ And I think I found out something even worse than E_STRICT...

I have no clue who came up with the dumb idea, but all of User.php is using getText() instead of getDBkey(). That is insanely stupid, because getText is supposed to output text for display, while getDBkey is supposed to output the version of the text used for unique identification. Unfortunately, instead of relying on what each function is for, all of User.php relies on the assumption that the display version of the text will always be as static as the actual unique identifying key.

Practical point? If you move [[User:Username]] to [[User:username]], getText now outputs "username" instead of "Username", so Username can no longer log in to the wiki.

So, we have two options:
A) Hack up User.php to use getDBkey, replacing underscores with spaces, instead of getText.
B) Make use of getDBkey for identification of the user and have the update script refactor the users table to use underscores, like it should, instead of spaces.

I'm strongly in favor of B. Wherever the aim is to display a user's name we can still make use of getText. This also has the impressive benefit that if you move User:Username to User:_username the software will display "_username" instead of "Username". So users who like a special form of their username will actually be able to make the interface display that instead of a normalized form with spaces.
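The login problem can be reproduced in miniature. This is a Python sketch rather than MediaWiki code: the `Title` class, its `real_text` field, and the lower-casing key rule are all assumptions made for illustration, standing in for the proposed split between a stored display title and a normalized key.

```python
class Title:
    """Hypothetical stand-in for a title with separate display and key forms."""

    def __init__(self, real_text):
        self.real_text = real_text  # display form; changes when the page is moved

    def get_text(self):
        return self.real_text

    def get_db_key(self):
        # Assumed normalization: underscores for spaces, case folded.
        return self.real_text.replace(" ", "_").lower()


title = Title("Username")
accounts_by_text = {title.get_text(): "account #1"}   # keyed on display text
accounts_by_key = {title.get_db_key(): "account #1"}  # keyed on the normalized key

# "Move" the user page so that only the display form changes.
title.real_text = "username"

lost = accounts_by_text.get(title.get_text())    # display text no longer matches
found = accounts_by_key.get(title.get_db_key())  # the key is unchanged
```

Under these assumptions the lookup keyed on display text comes back empty after the move, while the lookup keyed on the normalized form still finds the account, which is the argument for option B.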

~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)

T.W.A.Maaswinkel＠rn.rabobank.nl

7:05 a.m.

New subject: Ajaxsearch is casesensitive

Hey all, We just noticed that the current Ajax Search is case sensitive (for all characters, except the first). Does anyone have any idea how to make this case IN-sensitive? Greetz, Tom Maaswinkel Developer @ Rabobank the Netherlands

Simetrical

8:55 a.m.

New subject: Ajaxsearch is casesensitive

2008/3/6 T.W.A.Maaswinkel@rn.rabobank.nl:

...

We just noticed that the current Ajax Search is case sensitive (for all characters, except the first). Does anyone have any idea how to make this case IN-sensitive?

Case-insensitivity of titles (in some form or other) is something that's been on the table for a long time now, and is currently under discussion. It's probably possible to do somehow, but I wouldn't bet on any way at present that's both efficient and non-buggy, without a lot of work.

Thomas Bleher

12:13 p.m.

New subject: Ajaxsearch is casesensitive

* T.W.A.Maaswinkel@rn.rabobank.nl [2008-03-06 14:06]:

...

We just noticed that the current Ajax Search is case sensitive (for all characters, except the first). Does anyone have any idea how to make this case IN-sensitive?

Use the TitleKey extension[1]. It's easily hackable, so for my site[2] I changed it to also collate umlauts (ae is considered the same as ä) and made it so it finds any string matches (not just at the beginning). The last feature is of course more database intensive and only usable on smaller wikis.

Regards, Thomas

[1]: http://www.mediawiki.org/wiki/Extension:TitleKey [2]: http://spiele.j-crew.de , search e.g. for "Räuber" or "TiP" to see it in action.
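As a rough illustration of the kind of key such a search could match on, here is a Python sketch. It is not the TitleKey extension's code; the function names and the exact folding rules are assumptions chosen to mirror the description (case folded, umlauts collated with their two-letter spellings, optional match anywhere in the title).

```python
def title_key(title):
    """Fold case and collate German umlauts with their two-letter spellings."""
    folded = title.casefold()
    for umlaut, plain in (("ä", "ae"), ("ö", "oe"), ("ü", "ue")):
        folded = folded.replace(umlaut, plain)
    return folded


def search(titles, query, anywhere=False):
    """Prefix match by default; substring match when anywhere=True."""
    key = title_key(query)
    if anywhere:
        return [t for t in titles if key in title_key(t)]
    return [t for t in titles if title_key(t).startswith(key)]


titles = ["Räuber", "Raeuber Hotzenplotz", "TiP-Kick", "Der Räuberhauptmann"]
prefix_hits = search(titles, "RAEUB")
substring_hits = search(titles, "räuber", anywhere=True)
```

The substring variant has to look at every title, which matches the remark that it is only practical on smaller wikis; a prefix match can use an index on a stored key column.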

Simetrical

8:47 a.m.

On Thu, Mar 6, 2008 at 2:43 AM, DanTMan dan_the_man@telus.net wrote:

...

So, we have two options: A) Hack up User.php to use getDBkey, replacing underscores with spaces, instead of getText.

In particular, of course, using some nice User method that hides the ugly conversion in one place.

...

B) Make use of getDBkey for identification of the user and have the update script refactor the users table to use underscores like it should instead of spaces.

The idea of having separate normalized/display names makes as much sense for users as for titles, certainly. This seems like the more logical option. It's not like we aren't going to have to be rebuilding and repopulating the page table to do this anyway, so why not the user table too?

DanTMan

11:13 a.m.

Ok, B it is. I'll add another entry to updaters.inc when I get home and start by first converting all uses of getText in User.php to getDBkey. After the actual title stuff is built, we can track down all the places which use a displayable version of the name and make them use the displayname instead.

On another note, I guess this is my official statement on this part, but I intend to create a new class for the normalization of titles. The TitleNormalizer class.

It is used as an instance, and its primary purpose is its normalize function. It's constructed with a default set of sequence groups and sequence passes. A few notes on that:
- Because it goes through things sequentially it has a nicely defined order; to add another sequence inside an area, a new group can even be inserted to group sequences of another type.
- The reason the normalizer is used as an instance, and not statically, is for optimum extensibility. There may be cases where just defining an extra sequence or two, or removing some, won't be enough to make the change you want. To facilitate larger alterations to normalization, someone can subclass TitleNormalizer with a new class which includes their major normalizations, and use a hook (probably 'TitleNormalizerClass' or 'TitleNormalizerClassname') to have MediaWiki instantiate a different type of class.

Another important note: currently secureAndSplit trims whitespace as part of its task before splitting interwiki and namespace prefixes out. For various reasons I will be changing that order. Nothing will be trimmed from the title before those are split out; the prefix splitter will be responsible for temporarily trimming whitespace and other stuff out of the split text before trying to find out what the prefix is. The actual trimming of whitespace will only happen after that, and also only after the fragment is extracted too, when we know we are actually working on the title portion only.

The current set of passes is actually quite hacky, as it basically trims whitespace, splits interwiki, re-trims whitespace, splits fragment, then re-trims whitespace again just to make sure that the actual title gets its whitespace trimmed. Note that all three of those are meant for trimming the title, not the prefix or fragment, because I know, at least, that the regex used to grab the prefix is specifically coded to ignore extra whitespace in the namespace/interwiki in the first place.

Actually, on that note, it doesn't look like there is much reason to use the regex. So to cut down on that, I'm going to try using normal string functions to pull out the prefixes and trim them off. A strpos, substr, and trim set together is much quicker than a full blown regex pattern match.
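The two ideas in this message, ordered groups of passes and "split first, trim the title part last", can be sketched together. This is Python for illustration only; the class layout, `insert_group`, `split_title`, and the set of known prefixes are assumptions, not the planned MediaWiki API.

```python
class TitleNormalizer:
    """Hypothetical sketch: ordered groups of passes applied to the title part only."""

    def __init__(self):
        # Each group is (name, [passes]); the order of groups and passes matters.
        self.groups = [
            ("whitespace", [str.strip, lambda s: " ".join(s.split())]),
            ("key", [lambda s: s.replace(" ", "_")]),
        ]

    def insert_group(self, index, name, passes):
        """Let an extension slot a new group in at a defined position."""
        self.groups.insert(index, (name, list(passes)))

    def normalize(self, text):
        for _name, passes in self.groups:
            for apply_pass in passes:
                text = apply_pass(text)
        return text


def split_title(raw, known_prefixes):
    """Split prefix and fragment first; the title part is returned untrimmed."""
    prefix, rest = "", raw
    head, sep, tail = raw.partition(":")
    # The prefix is trimmed temporarily, only to look it up.
    if sep and head.strip().lower() in known_prefixes:
        prefix, rest = head.strip(), tail
    title, _, fragment = rest.partition("#")
    return prefix, title, fragment


normalizer = TitleNormalizer()
prefix, title, fragment = split_title(" User :  Frank   Dreben #Bio", {"user", "talk"})
key = normalizer.normalize(title)
```

An extension wanting full case-insensitivity could then call something like `normalizer.insert_group(1, "case", [str.lower])` without touching the splitting code, and a subclass could replace the default groups entirely.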

~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)

Platonides

8 Mar 8 Mar

8:56 a.m.

DanTMan wrote:

...

So to cut down on that, I'm going to try using normal string functions to pull out the prefixes and trim them off. A strpos, substr, and trim set together is much quicker than a full blown regex pattern match.

Not always. Remember that the PHP code surrounding those functions is interpreted, while the regex call runs as compiled code. I think some of the sysadmins remarked that the use of regex *improved* the performance. I'm not saying you shouldn't change it to traditional means, just that it should be timed to be sure it's not slower.

DanTMan

3:41 p.m.

Perhaps, however looking over the regex: /^(.+?)_*:_*(.*)$/S Checking it with my regex tool, I notice that when it encounters a _ that is not followed by a : it ends up doubling back over it. Not to mention that for each character it needs to check whether it's any character, or an underscore followed by any character, and whether that's followed by a : . I think that using a string function to find the first : in the string (CPUs are best at incrementation, so that's nothing), and then trimming, would be faster than using the regex.

Oh, yeah, there is also something to remember. With the new format of normalization the splitting should NOT trim whitespace as the current setup does. If that were done it would be eliminating whitespace from the title which someone's altered normalization may actually wish to keep. So an altered version of that regex to suit would be: /^(.+?):(.*)$/S which most definitely is nowhere near as efficient as a simple find : and split. I'll probably use list() and explode() actually.
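For anyone who wants to compare the two approaches before arguing about speed, here is a small Python harness. It is only an analogy: Python's `re` engine and string methods are not PHP's PCRE and `explode()`, so the timings say nothing about MediaWiki itself, and the function names are made up for the sketch. The pattern is the one quoted in this message, without PCRE's /S study flag.

```python
import re
import timeit

PREFIX_RE = re.compile(r"^(.+?)_*:_*(.*)$")


def split_regex(text):
    """Split on the first colon using the quoted pattern."""
    match = PREFIX_RE.match(text)
    return (match.group(1), match.group(2)) if match else None


def split_plain(text):
    """Find the first colon, then trim underscores around it."""
    head, sep, tail = text.partition(":")
    if not sep:
        return None
    return head.rstrip("_"), tail.lstrip("_")


samples = ["User:Foo", "User_:_Foo", "Talk:Foo:Bar", "NoPrefix"]
agree = all(split_regex(s) == split_plain(s) for s in samples)

# Time both on the same input; absolute numbers vary by machine and version.
regex_seconds = timeit.timeit(lambda: split_regex("User_:_Foo"), number=10000)
plain_seconds = timeit.timeit(lambda: split_plain("User_:_Foo"), number=10000)
```

The two functions agree on these ordinary inputs, but not on every edge case (a title starting with a colon, for example, is handled differently), so a real replacement would need tests for those before any timing comparison matters.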

On another note, I noticed something with the normalization. While : is the standard separator, abstracting the normalization process like this is actually loosening the definition of what is what in a title, while still keeping it stable. Honestly, if someone changed the methods used to prefix things, and altered the splitting sequence, they could probably change MediaWiki to use something like :: as the separator instead. With even more work, they could probably introduce a special type of namespace to MediaWiki which could use a different kind of prefix, or even be restricted to certain types of pages. (Basically, wikis like card game wikis could force their package redirects and card ids into special namespaces dedicated to them.) Actually, in light of that, I might add another hook or two, or clean up some of the title functions to properly abstract the prefixing to where it should be instead of mixing it up all over the place.

~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)

Simetrical

6:53 p.m.

On Sat, Mar 8, 2008 at 9:56 AM, Platonides Platonides@gmail.com wrote:

...

DanTMan wrote:

...
So to cut down on that, I'm going to try using normal string functions to pull out the prefixes and trim them off. A strpos, substr, and trim set together is much quicker than a full blown regex pattern match.

Not always. Remember that the PHP code surrounding those functions is interpreted, while the regex call runs as compiled code. I think some of the sysadmins remarked that the use of regex *improved* the performance. I'm not saying you shouldn't change it to traditional means, just that it should be timed to be sure it's not slower.

Or you should just ignore the difference and use whichever you think is easier to read. There's no point in micro-optimization like this, unless you have reason to believe that the particular functions are important to performance.

Platonides

9 Mar 9 Mar

9:01 a.m.

Simetrical wrote:

...

On Sat, Mar 8, 2008 at 9:56 AM, Platonides wrote:

...
DanTMan wrote:

...
So to cut down on that, I'm going to try using normal string functions to pull out the prefixes and trim them off. A strpos, substr, and trim set together is much quicker than a full blown regex pattern match.

Not always. Remember that the PHP code surrounding those functions is interpreted, while the regex call runs as compiled code. I think some of the sysadmins remarked that the use of regex *improved* the performance. I'm not saying you shouldn't change it to traditional means, just that it should be timed to be sure it's not slower.

Or you should just ignore the difference and use whichever you think is easier to read. There's no point in micro-optimization like this, unless you have reason to believe that the particular functions are important to performance.

Easiness to read could be a good reason to change that. I just warned against changing for optimization reasons.

DanTMan

8 Mar 8 Mar

8:34 p.m.

Ok, new issue...

I've changed usage inside of User.php from getText and getPrefixedText to getDBkey and getPrefixedDBkey.

However I notice an issue with the User functions themselves: We have a getName function, and additionally for awhile we've had a getTitleKey, but other than a single occurrence inside of Article.php, it's widely unused. That means that getName is used for both uniqueness testing, and display.

Obviously, because usernames are now stored in key form rather than text form we are going to need to separate the functional use of functions inside the User class for backend and display use.

We've got a few options here:
A) Create a new function getTitle which returns the title object which the User matches, and make use of its functions for the standard displays and other things. Of course this is basically the same as getUserPage.
B) Change getName's definition to be the display form of a user's name, and getTitleKey to be the key form of the user's name. And change the large number of comparison functions inside of MediaWiki to use getTitleKey instead of getName.
C) Create a new function getDisplayName, and have getName's definition changed to the key form of the user's name, and getDisplayName as the display form of the user's name. Deprecate getTitleKey because it's barely used and no longer needed (changing the single reference to it). And change the uses of getName as a display value inside of MediaWiki to use getDisplayName instead.
D) Create two new functions for the key and display forms of the user's name, and deprecate the old functions, slowly changing uses of them to the new functions in MW to keep compatibility.

I'm probably in favor of C: if getName's definition is changed to key form, all the backend testing and stuff will still work fine, and we will then only need to worry about changing the places where the username is used for display. It won't really be a problem if we miss anything, because if we miss one conversion, the system will simply display a semi-ugly form with underscores in it and none of the case stuff picked by the user. And it'll still be linkable and won't break anything in the backend.

Oh, ^_^ an interesting new ability due to switching to key form: SELECT user_editcount FROM `user`, `page` WHERE user_name=page_title AND page_namespace=2 AND page_id=3 In this example, page_id 3 is my user's userpage. What does it do? Well, if you were on the userpage of a user and just had the page ID, you could now easily grab any information from the database on that user himself, because page_title and user_name are stored in the same format instead of different formats. ^_^ Of course, this example isn't too useful, but I'm sure someone will find some use for the two being a match now.
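The effect of storing both columns in key form can be tried out against a throwaway database. The sketch below uses Python's sqlite3 module with heavily trimmed versions of the `user` and `page` tables (only the columns the query touches); the sample rows are invented, and one user is deliberately left in the old space-separated form to show what fails to join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "user" (user_id INTEGER PRIMARY KEY, user_name TEXT, user_editcount INTEGER);
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
    INSERT INTO "user" VALUES (1, 'Frank_Dreben', 42);  -- key form, as proposed
    INSERT INTO "user" VALUES (2, 'Jane Doe', 7);       -- old form with a space
    INSERT INTO page VALUES (3, 2, 'Frank_Dreben');
    INSERT INTO page VALUES (4, 2, 'Jane_Doe');
    """
)

QUERY = """
    SELECT user_editcount FROM "user", page
    WHERE user_name = page_title AND page_namespace = 2 AND page_id = ?
"""
matched = conn.execute(QUERY, (3,)).fetchone()    # user stored in key form
unmatched = conn.execute(QUERY, (4,)).fetchone()  # user stored with a space
```

The row stored in key form joins directly to its user page; the row stored with a space does not, which is the difference the conversion is meant to remove.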

Though it looks like I'm also going to have to add some more database stuff:
* archive is going to need an ar_real field so that deleting a page doesn't break its titling.
* rev_user_text is going to need the same conversion that user_name underwent, and I'll also need to do some stuff inside the backend to change how the name of the user is displayed.
* The same goes for img_user_text, ipb_by_text, oi_user_text, rc_user_text, and ar_user_text.

Oh, minor off topic. But what about putting a log_user_text in at some point? Honestly, I know of a few extensions which intended to allow certain things to be done by anons in addition to normal users, but which broke when anons were allowed to use them because anon users were not properly logged.

~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)

Simetrical

12 Mar 12 Mar

8:56 a.m.

On Sat, Mar 8, 2008 at 10:34 PM, DanTMan dan_the_man@telus.net wrote:

...

We've got a few options here: A) Create a new function getTitle which returns the title object which the User matches, and make use of its functions for the standard displays and other things. Of course this is basically the same as getUserPage.

I don't like this. There's no reason username normalization can't be stricter than title normalization, for instance. (It had better not be less strict, of course, if you don't want user page collisions.) In general I don't like mixing up titles with users. Of course in practice the backend may use title normalization for users as well, but that fact should all be hidden in a User method, not exposed to callers.

...

B) Change getName's definition to be the display form of a user's name, and getTitleKey to be the key form of the user's name. And change the large number of comparison functions inside of MediaWiki to use getTitleKey instead of getName.

getTitleKey sounds like a poor name to me.

...

C) Create a new function getDisplayName, and have getName's definition changed to the key form of the user's name, and getDisplayName as the display form of the user's name. Deprecate getTitleKey because it's barely used and no longer needed (changing the single reference to it). And change the uses of getName as a display value inside of MediaWiki to use getDisplayName instead.

getName is then ambiguous: which name are you talking about? getNormalizedName or getNameKey or something would be better.

...

D) Create two new functions for the key and display forms of the user's name, and deprecate the old functions, slowly changing uses of them to the new functions in MW to keep compatibility.

. . . so I would go for (D).

...

I'm probably in favor of C: if getName's definition is changed to key form, all the backend testing and stuff will still work fine, and we will then only need to worry about changing the places where the username is used for display. It won't really be a problem if we miss anything, because if we miss one conversion, the system will simply display a semi-ugly form with underscores in it and none of the case stuff picked by the user. And it'll still be linkable and won't break anything in the backend.

Your logic is good, and applies equally to (D): alias getName to getNormalizedName or whatever, rather than getDisplayName.

...

Oh, ^_^ an interesting new ability due to switching to key form: SELECT user_editcount FROM `user`, `page` WHERE user_name=page_title AND page_namespace=2 AND page_id=3 In this example, page_id 3 is my user's userpage. What does it do? Well, if you were on the userpage of a user and just had the page ID, you could now easily grab any information from the database on that user himself, because page_title and user_name are stored in the same format instead of different formats.

Interesting. Of course, my suggestion above that we don't rely on user and title normalization being the same would break this. I don't know if there's any good reason to go either way here.

...

Oh, minor off topic. But what about putting a log_user_text in at some point? Honestly, I know of a few extensions which intended to allow certain things to be done by anons in addition to normal users, but which broke when anons were allowed to use them because anon users were not properly logged.

Yes, please! That's been to-do for ages.

DanTMan

10:22 a.m.

Mkay, D it is... getName will be deprecated... To go with the whole key/real naming scheme I've been using in Title.php, a new getRealName function will get the name to use for interface display. And to match that, getKeyName will get the name for use in uniqueness checking and comparison, and getName will be aliased to it.
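The shape of that change, two explicit accessors plus the old name kept as a deprecated alias for the key form, looks roughly like this. It is a Python sketch with snake_case stand-ins for getKeyName, getRealName and getName; the constructor and the warning text are assumptions, and MediaWiki would use its own deprecation mechanism rather than Python's `warnings` module.

```python
import warnings


class User:
    """Hypothetical user with separate key and display names."""

    def __init__(self, key_name, real_name):
        self._key_name = key_name
        self._real_name = real_name

    def get_key_name(self):
        """Name for uniqueness checks and comparisons."""
        return self._key_name

    def get_real_name(self):
        """Name for display in the interface."""
        return self._real_name

    def get_name(self):
        """Deprecated alias; resolves to the key form so backend callers keep working."""
        warnings.warn(
            "get_name() is deprecated; use get_key_name() or get_real_name()",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.get_key_name()


user = User("Frank_Dreben", "frank dreben")
```

Aliasing the old accessor to the key form means a caller that was missed during conversion degrades to showing underscores rather than breaking a comparison.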

^_^ Actually, about your note on User and Title normalization not being the same: there is no real reason for them not to be (with the exception of the stuff that we stick in functions like isValidName). Why's that? A little bonus I already theorized but never mentioned (I'm good at grasping a lot of theory and wrapping my mind around how things work and are supposed to, so I get a lot of them). Because of the new extensible normalization, and because all the username stuff relies on getDBkey and directly uses getText for displaying the username, if you extend the normalization of Titles specifically for the User: namespace (remember that because of the way it's set up, you can now create per-namespace normalization), the normalization of usernames will be directly affected by it (which is kinda why I needed to alter User.php because of that login bug). So, if you make the User: namespace completely case-insensitive and leave other namespaces the way they are, usernames will suddenly all become completely case-insensitive to match, without altering any normalization code for usernames.

Btw: I have a function inside of the normalizer, TitleNormalizer::backconvert( $title ); basically it does the normal replacing of underscores with spaces. The point of it is that when we don't have a page_real stored in the database (i.e. a nonexistent page), backconvert will be used to create a temporary title to display while the page doesn't exist. Of course, there is a hook inside of it which lets extensions override it in case they do something like changing the ' ' to '_' normalization to ' ' to '-' for some reason.
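In outline, the fallback works like this. The sketch is Python, and everything except the underscore-to-space rule is an assumption: the hook list, the `display_title` helper, and the dictionary standing in for stored page_real values are only there to make the example self-contained.

```python
backconvert_hooks = []  # hypothetical hook registry; each hook may return None to pass


def backconvert(key):
    """Turn a key back into a displayable title when no real title is stored."""
    for hook in backconvert_hooks:
        result = hook(key)
        if result is not None:
            return result
    return key.replace("_", " ")


def display_title(key, page_real_by_key):
    """Prefer the stored real title; fall back to backconvert for missing pages."""
    return page_real_by_key.get(key) or backconvert(key)


stored = {"Ipod": "iPod"}
existing = display_title("Ipod", stored)          # stored real title wins
missing = display_title("Frank_Dreben", stored)   # no page yet, so fall back

# An extension that normalizes ' ' to '-' would register the inverse mapping.
backconvert_hooks.append(lambda key: key.replace("-", " "))
hyphenated = backconvert("has-space")
```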

Heh, I guess I'll take a look at that log stuff sometime later to see how easy it will be.

~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)

Simetrical

6:06 p.m.

On Wed, Mar 12, 2008 at 11:22 AM, DanTMan dan_the_man@telus.net wrote:

...

getName will be deprecated... To go with the whole key/real naming scheme I've been using in Title.php, a new getRealName function will get the name to use for interface display. And to match that, getKeyName will get the name for use in uniqueness checking and comparison, and getName will be aliased to it.

Are "RealName" and "KeyName" the best terms to use? We already use "Text" and "DBKey" for titles, but I recall that confused me somewhat for a while. I would probably have done "DisplayName" and "NormalizedName", but that might not be ideal either. We may as well think about this now instead of being stuck with weird names forever.

...

^_^ Actually about your note on User and Title normalization not being the same. There is no real reason for them not to be (With the exception of the stuff that we stick in functions like isValidName)... Why's that? A little bonus I already theorized but never mentioned (I'm good at grasping a lot of theory and wrapping my mind around how things work and are supposed to, so I get a lot of them): Because of the new extensible normalization, and how all the username stuff relies on getDBkey and directly uses getText for displaying the username, if you go and extend the normalization of Titles specifically for the User: namespace (remember that because of the way it's set up, you can now create per-namespace normalization), the normalization of Usernames will be directly affected by it (Which is kinda why I needed to alter User.php because of that login bug).

Oh, that's very neat. It preserves a one-to-one correspondence between usernames and User-namespace titles -- almost. Are you going to do stuff like ban '@' and other things not allowed in usernames from the namespace? That would make it a perfect bijection between User pages and user names-plus-IP addresses.
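As a rough illustration of per-namespace normalization feeding into usernames, here is a small Python sketch. The registry, the function names and the case-folding rule for the User namespace are all hypothetical; only the namespace number 2 for User comes from the thread.

```python
NS_MAIN, NS_USER = 0, 2

def default_normalize(text):
    # Baseline rule: spaces become underscores, first letter uppercased.
    key = text.strip().replace(" ", "_")
    return key[:1].upper() + key[1:]

# Hypothetical per-namespace overrides, standing in for what an extension
# might register. Here the User namespace ignores case after the first letter.
NORMALIZERS = {
    NS_USER: lambda text: default_normalize(text.lower()),
}

def normalize_title(namespace, text):
    # Fall back to the default rule when a namespace has no override.
    return NORMALIZERS.get(namespace, default_normalize)(text)

def normalize_username(name):
    # Usernames reuse the User-namespace rule, so the two cannot drift apart.
    return normalize_title(NS_USER, name)
```

Because normalize_username delegates to the title normalizer, changing the User-namespace rule changes username matching at the same time, which is the correspondence being praised here.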

...

Btw: I have a function inside of the normalizer. TitleNormalizer::backconvert( $title ); basically it does the normal replacing of underscores with spaces. The point of it is for when we don't have a page_real stored in the database (ie: nonexistent page), then backconvert will be used to create a temporary title for displaying while the page doesn't exist. Of course, there is a hook inside of it which lets extensions override it in case they do something like changing the ' ' to '_' normalization to ' ' to '-' for some reason.

Hmm, I see. When would this be a concern? Shouldn't the page_real be generated from the URL? I guess not exactly, if link targets are normalized. I'm thinking if the user types, I dunno, "str_repeat" into the search box, they should get links asking them to edit "str_repeat", not "str repeat" or any other variant. The same should apply to ordinary wikilinks, ideally -- but on the other hand, non-broken wikilinks should still point to prettified locations. So I guess this would require [[has space]] to translate to ?title=Has_space (or whatever normalized form) if it exists, but ?title=has%20space&action=edit if it doesn't. Which isn't perfect. But I don't see any other way to achieve the effect.
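The two-way behaviour described above can be sketched as follows. This is Python for illustration only; to_key and link_url are hypothetical names, and the set of existing keys stands in for a lookup against the page table.

```python
from urllib.parse import quote

def to_key(text):
    # Placeholder normalization: spaces to underscores, first letter uppercased.
    key = text.strip().replace(" ", "_")
    return key[:1].upper() + key[1:]

def link_url(link_text, existing_keys):
    """Build the query string for a wikilink such as [[has space]]."""
    key = to_key(link_text)
    if key in existing_keys:
        # Existing page: point at the prettified, normalized location.
        return "?title=" + quote(key)
    # Missing page: keep the text exactly as written so the edit form can
    # offer it as the title of the new page.
    return "?title=" + quote(link_text) + "&action=edit"
```

With a page stored under Has_space, link_url("has space", {"Has_space"}) gives ?title=Has_space, while the same link against an empty wiki gives ?title=has%20space&action=edit.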

DanTMan

11:11 p.m.

I'm not a fan of "Display". While it is used in normal interface it's not necessarily the actual title that will be displayed inside the title header. Remember that there is a third form if the wiki is supporting the upgraded DISPLAYTITLE, which by default will be enabled ^_^ in a semi-neutered state (lol) where it functions like the standard DISPLAYTITLE for backwards compatibility. But over top of that, there are many wiki which are likely to enable an option or an extension which allows less strict display titles, as well as some purely extension driven ones.

So, I think Display should be reserved for what goes inside of the header, rather than the un-normalized name of the title.

So potential words: (Normalized form for inside the database) Key, DB Key, Unique, Normalized (Non-Normalized form for interface use) Real, Text, Interface, Display, UI, UnNormalized (Format generated by extensions or other things for use in the title bar) Display, Generated

No, @ should never be banned from the namespace. It's ok to make it invalid for creation and use, however it's not ok to kill any User namespace title using it. Remember that some things like the idea of Transwiki imports appending the wiki they came from after an @. Those still link to names inside the user namespace. It would be best to keep @ valid so that those can be created with information or such on that user from the other wiki.

Hmmm... well, there are two things to actually consider with the format of the name.

Firstly, most of the stuff inside of MediaWiki dealing with titles, including much of what deals with broken links, is likely to create Title objects using factory functions which make use of the key format, NOT the real format. And most of them do that for good reasons which should not be altered. So, a high percentage of the time you have a Title object generated by MediaWiki itself, which will have no concept or input of what a real title is.

Secondly, there are many broken links which are created in special pages. Ones such as Wantedpages, Broken redirects, etc... None of these will ever have a real format inside the database, and there is no title attribute to attempt use of.

Thirdly has to do with the editing (Honestly I haven't been thinking much about that, more focused on [[Special:Movepage]]). There are a few issues with directly using the title made by the url.

* External sites may actually be using the format which search engines typically have used, and which may be in use in other systems as well. In this way, we may actually have titles such as [[Main+Page]], which will insert a literal + inside the title, which we don't actually want.
* While we can alter the way that links are generated for the current site, we cannot guarantee anything to do with another wiki. We do have the user case stuff for interwiki which allows for keeping the case of the first letter, but there's no guarantee that another wiki won't use a link like [[Wikipedia:Some Page]], and as a result will link to http://en.wikipedia.org/wiki/Some_Page and that view page would then have the _ inside of it, and a user who hits edit would then end up creating a page with an _ inside the title.
* Don't forget us users who don't use the search box, or preview link methods, but instead use the address bar altering method. It's highly likely there are editors who will edit the address bar and use a _ to get where they want to go for page editing. And it's not good for this to be reflected inside of the title. And there's no guarantee on what the browser will do when we type actual characters inside the address bar, and we shouldn't need to memorize percent encodings for characters.
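The URL hazards listed above are easy to reproduce with Python's standard urllib.parse. The helper names below are hypothetical, and the underscore-to-space step is only the simple backconversion mentioned earlier in the thread, not MediaWiki's actual code.

```python
from urllib.parse import unquote, unquote_plus

def naive_title(raw):
    # Taking the title straight from the URL path keeps '+' and '_' literally.
    return unquote(raw)

def form_style_title(raw):
    # Form-style decoding turns '+' into a space, which is what external
    # sites that write Main+Page probably meant.
    return unquote_plus(raw)

def backconverted_title(raw):
    # Decode percent-escapes, then replace underscores with spaces, so that
    # an address-bar URL like Some_Page does not leak '_' into the title.
    return unquote(raw).replace("_", " ")
```

Here naive_title("Main+Page") keeps the plus sign, form_style_title("Main+Page") yields "Main Page", and backconverted_title("Some_Page") yields "Some Page". None of the three can tell whether the user really wanted the literal character, which is why the thread goes on to propose an editable title input.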

Actually, I'm thinking the best method may actually be the addition of a new input to the edit page for that edit issue. Basically a Title input would appear above the textbox (Where the section title is currently located) when editing a new page. This would by default contain the back converted title (To avoid any possible issue, including a user using a title with no first case capital when they should be using it), and the user could edit it to reformat the title. We could make it so that the title there is required to normalize back to the normal title. But why not kill two birds with one stone? Instead of that, use that input as the actual title. This will even kill the normal confusion that a new editor encounters when they don't know how to create a new page. All you'll need to do is go to any new page with an &action=edit, even using something like http://en.wikipedia.org/w/?action=edit or use http://en.wikipedia.org/w/?action=create to clear it up, type in the title of the page into the input, and save it. Instead of mucking with other stuff, and extensions like the Inputbox. Of course, if that page already exists, then you'll simply get an error telling you it already exists, and you should either edit that page or find a new title.

Now as for the redlinks and getting a real form through the url. I'd propose a secondary parameter, something like &titlesuggest= or something. Which a redlink would append to keep the formatting of the current title. As well as extensions like Inputbox could do.

As for the current existing pages. We would probably leave that out and use Special:Movepage to do actual movement. It's possible that in the future someone could rewrite the edit system and move system to allow for a setup where you can both move, and edit the page text at the same time. However, that is a rewrite in an area which we probably should not attempt inside the scope of the current title rewrite. That can actually be done on its own later without requiring any work here.

----

I should also make a note on caching, and what effect that has on the title. I'm actually using the LinkCache to generate the Real name from the database. Why? Title.php never, ever uses a database query inside of a get function to get information on a title. The only database queries inside of there are part of things like the factory constructors dealing with page ids. The only other getter which needs database info is getArticleId. And that makes a query to the LinkCache, which does the database query there on its own, and returns that, as well as putting the link into the cache to avoid any new queries.

Because of that, I went for making use of the LinkCache for the getting of the real title. Of course there are a few issues. Firstly, because this is being stored in the cache, we should NEVER store a user inputted real title. For this reason, before the LinkCache stores the title in its cache, it actually resets the real title using that database query, and if it doesn't find it, then it backconverts it. So it's not a good idea to use user supplied titles here. So that has two side effects. Firstly, if we were to make it so that the LinkCache just didn't set the real title when it doesn't find one... We are likely to end up with a Wiki error resulting from getText not normalizing to getDBkey at some point. Or even worse, there could be a small possibility of an infinite loop where something which needs that real title may query again until it gets it, which would never happen. Another effect could be the fact that since it does not have a real title stored, the LinkCache would be queried for a new real title every time you call getText or something else which depends on it. This would be a heavy database burden resulting from LinkCache not setting a fallback title. Secondly, just as another result, but if anyone uses getArticleId on your title object, there is the possibility that the real title will be reset. (I've made a warning of this inside the setter, that setter is meant strictly for temporary use where you know what is happening to the title. Primarily it is only used by the LinkCache firstly to actually set the title (Since it isn't from the same object and it's bad practice to externally edit a member variable, that could change names, and also because it would generate PHP errors if we later re-factored things to use actual private variables.), and also will be used when moving a page to set what new real title to move to)

From my detailed look at Title.php there is little other way to do this, which does not have a serious effect on the database, or bad coding which will result in a lot of bugs.
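The caching behaviour described above, reduced to a Python sketch. The class is a hypothetical stand-in, not the real LinkCache: a plain dict plays the role of the page table, and the query counter exists only to show that the fallback avoids repeated lookups.

```python
def backconvert(key):
    # Simple backconversion: underscores become spaces.
    return key.replace("_", " ")

class LinkCache:
    """Hypothetical cache of real (display) titles keyed by normalized key."""

    def __init__(self, page_real_by_key):
        self._db = page_real_by_key   # stands in for the page table
        self._real = {}               # key -> cached real title
        self.db_queries = 0

    def real_title(self, key, user_supplied=None):
        # user_supplied is deliberately ignored: only the stored value or the
        # backconverted key is ever cached, never what the user typed.
        if key not in self._real:
            self.db_queries += 1
            stored = self._db.get(key)
            # Always cache something, so later calls need no further query.
            self._real[key] = stored if stored is not None else backconvert(key)
        return self._real[key]
```

Asking twice for a page that does not exist costs one query, not two, because the backconverted fallback is cached. That is the "heavy database burden" the fallback is meant to avoid.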

~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)

Simetrical

13 Mar

9:11 a.m.

On Thu, Mar 13, 2008 at 12:11 AM, DanTMan dan_the_man@telus.net wrote:

...

I'm not a fan of "Display". While it is used in normal interface it's not necessarily the actual title that will be displayed inside the title header. Remember that there is a third form if the wiki is supporting the upgraded DISPLAYTITLE, which by default will be enabled ^_^ in a semi-neutered state (lol) where it functions like the standard DISPLAYTITLE for backwards compatibility. But over top of that, there are many wiki which are likely to enable an option or an extension which allows less strict display titles, as well as some purely extension driven ones.

So, I think Display should be reserved for what goes inside of the header, rather than the un-normalized name of the title.

Okay, reasonable.

...

So potential words: (Normalized form for inside the database) Key, DB Key, Unique, Normalized (Non-Normalized form for interface use) Real, Text, Interface, Display, UI, UnNormalized (Format generated by extensions or other things for use in the title bar) Display, Generated

Hmm. Nothing leaps out at me as being the best.

...

No, @ should never be banned from the namespace. It's ok to make it invalid for creation and use, however it's not ok to kill any User namespace title using it. Remember that some things like the idea of Transwiki imports appending the wiki they came from after an @. Those still link to names inside the user namespace. It would be best to keep @ valid so that those can be created with information or such on that user from the other wiki.

I meant generically, that user pages should correspond exactly to usable usernames. This needs to be in a loose sense, I guess: anything that can show up in a user_text field should be valid. That may include cross-wiki usernames, for instance, which could contain @.

...

Hmmm... well, there are two things to actually consider with the format of the name. . . . Why not kill two birds with one stone? Instead of that, use that input as the actual title. This will even kill the normal confusion that a new editor encounters when they don't know how to create a new page. All you'll need to do is go to any new page with an &action=edit, even using something like http://en.wikipedia.org/w/?action=edit or use http://en.wikipedia.org/w/?action=create to clear it up, type in the title of the page into the input, and save it. Instead of mucking with other stuff, and extensions like the Inputbox. Of course, if that page already exists, then you'll simply get an error telling you it already exists, and you should either edit that page or find a new title.

Now that's an interesting idea. It would solve the problem neatly.

...

Now as for the redlinks and getting a real form through the url. I'd propose a secondary parameter, something like &titlesuggest= or something. Which a redlink would append to keep the formatting of the current title. As well as extensions like Inputbox could do.

Well, if it's a new action=create, then you can just use the title parameter, since it will be unused.
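A short sketch of what such a creation link might look like if the title parameter carries the suggestion. action=create is only a proposal in this thread, the /w/ path is taken from the example URLs above, and create_url is a hypothetical helper.

```python
from urllib.parse import quote, urlencode

def create_url(link_text):
    """Build a creation URL that passes the link text through unchanged."""
    params = {"action": "create", "title": link_text}
    # quote_via=quote encodes spaces as %20 rather than '+', sidestepping the
    # literal-plus ambiguity raised earlier in the thread.
    return "/w/?" + urlencode(params, quote_via=quote)
```

For the link [[frank dreben]] this produces /w/?action=create&title=frank%20dreben, so the form can prefill the title exactly as the author wrote it.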

...

As for the current existing pages. We would probably leave that out and use Special:Movepage to do actual movement. It's possible that in the future someone could rewrite the edit system and move system to allow for a setup where you can both move, and edit the page text at the same time. However, that is a rewrite in an area which we probably should not attempt inside the scope of the current title rewrite. That can actually be done on its own later without requiring any work here.

Of course. I don't know if such a thing would be desirable.

...

Because of that, I went for making use of the LinkCache for the getting of the real title. Of course there are a few issues. Firstly, because this is being stored in the cache, we should NEVER store a user inputted real title. For this reason, before the LinkCache stores the title in its cache, it actually resets the real title using that database query, and if it doesn't find it, then it backconverts it. So it's not a good idea to use user supplied titles here. So that has two side effects. Firstly, if we were to make it so that the LinkCache just didn't set the real title when it doesn't find one... We are likely to end up with a Wiki error resulting from getText not normalizing to getDBkey at some point. Or even worse, there could be a small possibility of an infinite loop where something which needs that real title may query again until it gets it, which would never happen. Another effect could be the fact that since it does not have a real title stored, the LinkCache would be queried for a new real title every time you call getText or something else which depends on it. This would be a heavy database burden resulting from LinkCache not setting a fallback title. Secondly, just as another result, but if anyone uses getArticleId on your title object, there is the possibility that the real title will be reset. (I've made a warning of this inside the setter, that setter is meant strictly for temporary use where you know what is happening to the title. Primarily it is only used by the LinkCache firstly to actually set the title (Since it isn't from the same object and it's bad practice to externally edit a member variable, that could change names, and also because it would generate PHP errors if we later re-factored things to use actual private variables.), and also will be used when moving a page to set what new real title to move to)

From my detailed look at Title.php there is little other way to do this, which does not have a serious effect on the database, or bad coding which will result in a lot of bugs.

I'm not sure I get what you're saying. I'll have to look this over in more detail later to offer suggestions.

DanTMan

9:13 p.m.

Hmmmm, now that I think about it... While the term "Real" is good for titles, it doesn't work for usernames. Because we already have a "Real Name" for the user, which is something completely different. (I don't believe we've even used it in any features yet O_o... One of these days I've gotta alter the system to display the real name field instead of the user's username inside of the personal links).

Hmmm... though, redlinks actually use editredlink rather than edit. Ok then, &action=create will be an overriding action. Rather than being like a simple alias of edit which is meant for creation, it'll use an edit form, but it'll never throw an error if the suggested title already exists when you go to the page (Of course it will if you try to save over top of a page, and we'll probably give the user notification that a title they are using is already in use... Probably on each submit to the form (ie: Preview, etc...) and through some AJAX for those who have it)

I think Tim Starling was the one who did the editredlink change. Could we get some input from him on if he minds changing the dedicated editredlink to something like a &muffleerrors=1 which would be used in the form &action=edit&muffleerrors=1 to allow &action=create&muffleerrors=1 to be used. Or something... Cause "editredlink" doesn't fit right when &action=create may come from many other sources. And that includes redlinks, inputboxes, and it's likely also going to be used when the user hits the "edit" tab (Actually if no-one minds, it would be a beautiful opportunity to change the "edit" tab on non-existent pages into a "create" tab for a more intuitive interface)

~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)

Simetrical

14 Mar

8:24 a.m.

On Thu, Mar 13, 2008 at 10:13 PM, DanTMan dan_the_man@telus.net wrote:

...

Hmmm... though, redlinks actually use editredlink rather than edit.

No, it uses &action=edit&redlink=1. I don't see any conflict.

...

Ok then, &action=create will be an overriding action. Rather than being like a simple alias of edit which is meant for creation, it'll use an edit form, but it'll never throw an error if the suggested title already exists when you go to the page

Why not? Of course it shouldn't be an error that prevents you from continuing, but there should be a clear notice that an article by that name already exists and you may want to edit it instead.

...

(Actually if no-one minds, it would be a beautiful opportunity to change the "edit" tab on non-existent pages into a "create" tab for a more intuitive interface)

That's an easy change, someone could do that right now. In fact I've committed it in r31973.

DanTMan

8:05 p.m.

Heh... I thought he said something about it being editredlink, not &action=edit&redlink=1. Ok, that's fine... It'll just end up as &action=create&redlink=1 instead, and the title will be a suggestion, not a hard requirement.

Hmmm... actually, perhaps we should add a bit more restriction to &redlink=1. It's quite possible for someone to click a redlink after a page has already been created. And when someone clicks a redlink to create a page, they are expecting to create that page, not another. So perhaps when coming from a redlink we should add in an extra message if the page already exists saying something like "A page with this title has recently been created. It's likely that you want to edit that page rather than create a new one. Do you wish to switch to editing that page instead of creating a new one under a different title?" And give them a link to switch to that page's edit mode instead.

Never said that the user would never be notified: "and we'll probably give the user notification that a title they are using is already in use... Probably on each submit to the form (ie: Preview, etc...) and through some AJAX for those who have it"; But when a user goes to a &action=create page, we should probably assume they are expecting to create something new rather than expecting to edit something that already exists.
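One way to read the behaviour proposed in this message, as a Python sketch. Every name here is hypothetical, the notice wording is abbreviated from the message above, and the set of existing keys stands in for a database check that would be repeated on each form submit.

```python
def to_key(text):
    # Placeholder normalization: spaces to underscores, first letter uppercased.
    key = text.strip().replace(" ", "_")
    return key[:1].upper() + key[1:]

def open_create_form(suggested_title, existing_keys, from_redlink=False):
    """Open the create form; an existing title yields a notice, never an error."""
    taken = to_key(suggested_title) in existing_keys
    notices = []
    if taken and from_redlink:
        # The page appeared after the redlink was rendered.
        notices.append("A page with this title has recently been created. "
                       "You may want to edit that page instead.")
    elif taken:
        notices.append("This title is already in use.")
    # The form always opens; saving over an existing page is what gets refused.
    return {"title_input": suggested_title, "notices": notices, "can_save": not taken}
```

The form opens in both cases. Only saving is blocked, and the redlink case gets the stronger message because the user's expectation (that the page is missing) is known to be out of date.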

~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)

Platonides

15 Mar

5:55 p.m.

DanTMan wrote:

...

It's quite possible for someone to click a redlink after a page has already been created. And when someone clicks a redlink to create a page, they are expecting to create that page, not another. So perhaps when coming from a redlink we should add in an extra message if the page already exists saying something like "A page with this title has recently been created. It's likely that you want to edit that page rather than create a new one. Do you wish to switch to editing that page instead of creating a new one under a different title?" And give them a link to switch to that page's edit mode instead.

No, if it has been created, they want to view what has been created in order to decide whether it's fine as it is or whether they want to edit it further.

Simetrical

8 p.m.

On Fri, Mar 14, 2008 at 9:05 PM, DanTMan dan_the_man@telus.net wrote:

...

It's quite possible for someone to click a redlink after a page has already been created. And when someone clicks a redlink to create a page, they are expecting to create that page, not another. So perhaps when coming from a redlink we should add in an extra message if the page already exists saying something like "A page with this title has recently been created. It's likely that you want to edit that page rather than create a new one. Do you wish to switch to editing that page instead of creating a new one under a different title?" And give them a link to switch to that page's edit mode instead.

Someone clicking a red link wants to either

1) read the page (not knowing it doesn't exist), or

2) create the page (thinking it doesn't exist).

In case 1 the best behavior would be to redirect the user to the page as though it were a blue link. In case 2 the best behavior is unclear, since the user's expectations are wrong, and so actually if we do any special-casing, it may as well be a 301 redirect to the existing page as though it were a blue link. But I wouldn't waste too many brain cells on this, since it's a fairly uncommon scenario.
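The behavior described above can be sketched roughly as follows. This is only an illustration in Python, not MediaWiki code: the function name `handle_redlink_click`, the `/index.php` script path, and the example titles are all assumptions made for the sketch. Only the `action=edit&redlink=1` parameters come from this thread.

```python
from urllib.parse import urlencode


def handle_redlink_click(title, existing_titles, script_path="/index.php"):
    """Return (HTTP status, URL) for a request that arrived via a red link.

    Hypothetical helper: `existing_titles` stands in for a real page lookup.
    """
    if title in existing_titles:
        # The page was created after the link was rendered, so treat the
        # click as if the link had been blue and redirect to the page view.
        return 301, f"{script_path}?{urlencode({'title': title})}"
    # The page really is missing: show the normal redlink edit form.
    query = urlencode({"title": title, "action": "edit", "redlink": "1"})
    return 200, f"{script_path}?{query}"


existing = {"Existing_Page"}
print(handle_redlink_click("Existing_Page", existing))
# (301, '/index.php?title=Existing_Page')
print(handle_redlink_click("Missing_Page", existing))
# (200, '/index.php?title=Missing_Page&action=edit&redlink=1')
```

Both of the cases listed above end up in the first branch, which is why no separate handling per case appears in the sketch.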

subscribe＠divog.com.ru

1 Mar

4:42 a.m.

...

Explicitly, yeah, but any associative array using title strings as keys will automatically be case-sensitive, just because array lookups (and string comparisons generally) are case-sensitive. I have no idea how many of those there are scattered about.

Are there many such things? The only one I found was the LinkCache class. Parser, Linker, and Title use only LinkCache methods when it comes to Good|BadLinks. Maybe there are no other cases where title strings are used as keys of an associative array?
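To make the point about case-sensitive array keys concrete, here is a minimal sketch in Python rather than MediaWiki's PHP. The class `CaseInsensitiveLinkCache`, its method names, and the page id 42 are hypothetical and do not reflect the real LinkCache API. It only shows the general idea of normalizing the key in one place, so that writes and reads always agree.

```python
class CaseInsensitiveLinkCache:
    """Hypothetical stand-in for a link cache; not MediaWiki's LinkCache."""

    def __init__(self):
        # Maps a case-folded title to (canonical title, page id).
        self._good_links = {}

    @staticmethod
    def _key(title):
        # Normalize once, here, so every write and every read use the same key.
        return title.casefold()

    def add_good_link(self, page_id, title):
        self._good_links[self._key(title)] = (title, page_id)

    def get_good_link(self, title):
        """Return (canonical title, page id), or None if nothing is cached."""
        return self._good_links.get(self._key(title))


# A plain dictionary keyed by the title string is case-sensitive.
plain = {"Frank Dreben": 42}  # 42 is a made-up page id
print("frank dreben" in plain)  # False

cache = CaseInsensitiveLinkCache()
cache.add_good_link(42, "Frank Dreben")
print(cache.get_good_link("frank dreben"))  # ('Frank Dreben', 42)
```

Storing the canonical title next to the id lets the caller link to the page under its real name. Any other array keyed by raw title strings would need the same normalization, which is the concern raised in the quoted message.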

subscribe＠divog.com.ru

23 Jun

3:28 p.m.

Hi! Could anybody tell me about the progress on case-insensitive links? Can I currently make red (broken) links blue when the target article exists and only the case of the link and the article title differs? Also, could those links point to viewing the article rather than to action=edit?

Thanks.

DanTMan

5:41 p.m.

That would be my titlerewrite branch's purpose.

Unfortunately, because I screwed up the original branch (I can't even remember how I did it originally, and it was largely broken in that respect) and svnmerge is not working for me atm (svn+ssh only works through TortoiseSVN for me, and I don't have a bash-compatible version of my ssh key), the titlerewrite branch is currently merely a flat branch of trunk. Also, I've finally pushed myself into actually focusing on a single project for once in my life: my ElectronicMe Profile/Portfolio site app. So the titlerewrite branch will stay largely broken until I get back to it. The old code (which I still have) is live on http://titlerewrite.dev.wiki-tools.com/ but as you can see from there, the code is largely broken and needs much more work. The primary starting point was the page_title_ui field, and the schema change from usernames as text to usernames as keys.

~Daniel Friesen(Dantman) of: -The Nadir-Point Group (http://nadir-point.com) --Its Wiki-Tools subgroup (http://wiki-tools.com) --Games-G.P.S. (http://ggps.org) -And Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)

subscribe@divog.com.ru wrote:

...

Hi! Could anybody tell me about the progress on case-insensitive links? Can I currently make red (broken) links blue when the target article exists and only the case of the link and the article title differs? Also, could those links point to viewing the article rather than to action=edit?

Thanks.


wikitech-l@lists.wikimedia.org

36 comments

participants (7)

DanTMan
David Gerard
Platonides
Simetrical
subscribe＠divog.com.ru
T.W.A.Maaswinkel＠rn.rabobank.nl
Thomas Bleher