I'm starting a new intranet-style wiki, and I'd like to make the wiki links fully case-insensitive.
I've read various discussions about the pros and cons of case-sensitivity. What I'm not entirely clear about is what would need to be changed (at the technical level) to achieve it. From what I've read, it may be that the chief technical problem is dealing with existing pages whose title differs only in case. Since I'm starting from scratch, that wouldn't be a concern for me.
Is there some well-known bit of code that needs to be modified to achieve case-insensitive links for a new wiki? Or is this something that needs to be figured out yet?
(Ryan Rempel rgrempel@gmail.com): I'm starting a new intranet-style wiki, and I'd like to make the wiki links fully case-insensitive.
You should be able to get away with only changing a few things in Title.php, to make the DB form, display form, and URL form all case folded. It should still be OK to put both cases in the link itself, as long as Title.php folds it before lookup and when creating links.
How this reacts with changes that may have been made since I had my hands in the code, I make no promises...
Lee Daniel Crocker wrote:
(Ryan Rempel rgrempel@gmail.com): I'm starting a new intranet-style wiki, and I'd like to make the wiki links fully case-insensitive.
You should be able to get away with only changing a few things in Title.php, to make the DB form, display form, and URL form all case folded.
If you want to force all titles to all-lowercase that might work well enough. Search for instances of $wgCapitalLinks for the code that does the first-letter normalization to find places you'd want to change to do it this way.
This wouldn't be sufficient to get a decent user experience in general (eg for any of Wikimedia's sites); titles pulled from the database would only be available in the folded (eg lowercase) form, but most people wouldn't be too happy about article titles like "washington, d.c." and "ibm".
Making it work well would require storing the display form as well in a few places, which would require schema changes, and some additional provisions for changing the display form, etc.
Ryan Rempel wrote:
From what I've read, it may be that the chief technical problem is dealing with existing pages whose title differs only in case. Since I'm starting from scratch, that wouldn't be a concern for me.
No, that's a one-time conversion issue which would just require some brute force.
The chief technical problem is getting unique case-insensitive matching *while* preserving the given case of page titles in a sane, consistent way.
Adding a title display form field to the page table probably would get partway there, but not all; dealing sanely with pages that don't exist yet may take additional work.
-- brion vibber (brion @ pobox.com)
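The distinction drawn above (unique case-insensitive matching while preserving the given case) can be illustrated with a small sketch. This is not MediaWiki code: the `TitleRegistry` class and its methods are hypothetical, and the in-memory array merely stands in for a page table with an extra display-form field.

```php
<?php
// Hypothetical sketch, not MediaWiki code. Titles are matched by a
// case-folded key, while the case given at creation is kept for display.
final class TitleRegistry
{
    /** @var array<string, string> folded key => display form */
    private array $displayForms = [];

    private static function fold(string $title): string
    {
        return mb_strtolower($title, 'UTF-8');
    }

    /** Register a page; returns false if a title differing only in case exists. */
    public function create(string $title): bool
    {
        $key = self::fold($title);
        if (isset($this->displayForms[$key])) {
            return false;
        }
        $this->displayForms[$key] = $title;
        return true;
    }

    /** Display form for any spelling of the title, or null if no such page. */
    public function displayForm(string $title): ?string
    {
        return $this->displayForms[self::fold($title)] ?? null;
    }
}

$registry = new TitleRegistry();
$registry->create('Washington, D.C.');
$registry->create('IBM');
echo $registry->displayForm('ibm'), "\n";              // IBM
echo $registry->displayForm('WASHINGTON, d.c.'), "\n"; // Washington, D.C.
```

Note that `displayForm()` returns null for a page that does not exist yet, so a caller has only the case used in the link text to fall back on. That is one way to see why nonexistent pages need extra thought.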
On Sat, 26 Mar 2005 23:25:08 -0800, Brion Vibber brion@pobox.com wrote:
Lee Daniel Crocker wrote:
(Ryan Rempel rgrempel@gmail.com): I'm starting a new intranet-style wiki, and I'd like to make the wiki links fully case-insensitive.
You should be able to get away with only changing a few things in Title.php, to make the DB form, display form, and URL form all case folded.
If you want to force all titles to all-lowercase that might work well enough. Search for instances of $wgCapitalLinks for the code that does the first-letter normalization to find places you'd want to change to do it this way.
This wouldn't be sufficient to get a decent user experience in general (eg for any of Wikimedia's sites); titles pulled from the database would only be available in the folded (eg lowercase) form, but most people wouldn't be too happy about article titles like "washington, d.c." and "ibm".
Making it work well would require storing the display form as well in a few places, which would require schema changes, and some additional provisions for changing the display form, etc.
Ah, I wouldn't want to force all the titles to lowercase -- I guess I meant case-insensitive and case-preserving (or something like that). I'll have to think through exactly what I want and take a look at the code -- the implications will probably be clearer to me once I have more experience.
In LocalSettings.php, add the line:
$wgCapitalLinks = false;
On Sat, 26 Mar 2005 22:00:18 -0600, Ryan Rempel rgrempel@gmail.com wrote:
I'm starting a new intranet-style wiki, and I'd like to make the wiki links fully case-insensitive.
[...]
_______________________________________________
MediaWiki-l mailing list
MediaWiki-l@Wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/mediawiki-l
Jamie Bliss wrote:
In LocalSettings.php, add the line:
$wgCapitalLinks = false;
That makes the first letter of page titles case-*sensitive*. That's the opposite of case-insensitive. :)
-- brion vibber (brion @ pobox.com)
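To make the point about $wgCapitalLinks concrete, here is a rough sketch of first-letter normalization. The function `normalizeFirstLetter` is a hypothetical helper written for illustration; it is not the code in Title.php, and it assumes titles are UTF-8 strings.

```php
<?php
// Hypothetical helper illustrating first-letter normalization only;
// it is not the code in Title.php.
function normalizeFirstLetter(string $title, bool $capitalLinks): string
{
    if (!$capitalLinks || $title === '') {
        return $title;
    }
    $first = mb_substr($title, 0, 1, 'UTF-8');
    $rest = mb_substr($title, 1, null, 'UTF-8');
    return mb_strtoupper($first, 'UTF-8') . $rest;
}

// Normalization on: "foo bar" and "Foo bar" name the same page.
var_dump(normalizeFirstLetter('foo bar', true) === normalizeFirstLetter('Foo bar', true));   // bool(true)
// Normalization off: the first letter becomes case-sensitive.
var_dump(normalizeFirstLetter('foo bar', false) === normalizeFirstLetter('Foo bar', false)); // bool(false)
// Either way, the rest of the title stays case-sensitive.
var_dump(normalizeFirstLetter('Foo Bar', true) === normalizeFirstLetter('Foo bar', true));   // bool(false)
```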
Oops, my bad. :embarrassed: I misread that.
On Mon, 28 Mar 2005 15:53:22 -0800, Brion Vibber brion@pobox.com wrote:
Jamie Bliss wrote:
In LocalSettings.php, add the line:
$wgCapitalLinks = false;
That makes the first letter of page titles case-*sensitive*. That's the opposite of case-insensitive. :)
-- brion vibber (brion @ pobox.com)
I've been working on this a bit, and thought I'd let you know what I've come up with. I should say at the outset that I don't really know what I'm doing and none of this is tested very much, so it's kind of experimental at this point.
As far as the database is concerned, it seems that it is possible in MySQL to make the cur_title (and related fields) case-insensitive within MySQL itself. So I did that, as a kind of experiment. The idea is to avoid (if possible) storing a second version of the cur_title field (though it may end up being required for reasons that I don't know about yet).
The next issue is that the MediaWiki PHP code uses associative arrays here and there with titles as the keys (often as caches and the like). So what I've done (for some of these arrays) is use mb_strtolower when inserting keys and checking for keys. This involves changes to LinkCache.php, Parser.php, and Title.php (so far). I've attached a .diff file as an example (remember, this is just an experiment so far).
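The key-folding idea for title-keyed arrays might look roughly like the sketch below. It is not taken from the diff: `FoldedKeyCache` and its methods are hypothetical names, not MediaWiki's LinkCache API, and the only real library call is PHP's `mb_strtolower` (from the mbstring extension).

```php
<?php
// Hypothetical stand-in for a title-keyed cache; the class and method
// names are illustrative, not those of MediaWiki's LinkCache.
final class FoldedKeyCache
{
    /** @var array<string, int> folded title => page id */
    private array $ids = [];

    public function add(string $title, int $pageId): void
    {
        // Fold on insert...
        $this->ids[mb_strtolower($title, 'UTF-8')] = $pageId;
    }

    public function get(string $title): ?int
    {
        // ...and fold the same way on lookup.
        return $this->ids[mb_strtolower($title, 'UTF-8')] ?? null;
    }
}

$cache = new FoldedKeyCache();
$cache->add('Main Page', 1);
$cache->add('École', 2);
var_dump($cache->get('main page')); // int(1)
var_dump($cache->get('ÉCOLE'));     // int(2)
var_dump($cache->get('Other'));     // NULL
```

Routing every insert and lookup through one folding function matters here: any code path that touches the array with an unfolded key would miss silently.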
I haven't done a whole lot of testing, but so far this seems (at least superficially) to do what I want. I can create new pages with whatever case for title that I want, and links work even if case doesn't match.
What will need some experimenting is to figure out what this breaks -- I'm sure it breaks something!
One interesting thing I've noticed is that the "headline" on a page is controlled by the URL you use to reach it, rather than the cur_title of the page itself. That is, the headline matches the case of the URL, rather than the case of the actual cur_title. I suppose what I could do is normalize the URL generated for the link so that it matches the case of cur_title (rather than the case in the wikitext). Either that or change the code that generates the headline to use cur_title's case, but I suspect there may be a reason why it doesn't do that.
The other thing I've thought of is that I would probably want to special-case "move page" where the new title differs from the old title only in case. In that event, it may be possible simply to update cur_title without doing much else -- depending on what side effects arise.
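Detecting that special case could be as simple as the following sketch. `isCaseOnlyMove` is a hypothetical function, not part of MediaWiki, and what the move code should then do with the result is left open.

```php
<?php
// Hypothetical check, not part of MediaWiki: true when the two titles
// differ only in letter case, so both map to the same folded key.
function isCaseOnlyMove(string $oldTitle, string $newTitle): bool
{
    return $oldTitle !== $newTitle
        && mb_strtolower($oldTitle, 'UTF-8') === mb_strtolower($newTitle, 'UTF-8');
}

var_dump(isCaseOnlyMove('Ibm', 'IBM'));   // bool(true)
var_dump(isCaseOnlyMove('IBM', 'IBM'));   // bool(false)
var_dump(isCaseOnlyMove('IBM', 'Intel')); // bool(false)
```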
Anyway, I thought that people might be interested in this -- again, it is just an experiment, and I don't really know the MediaWiki code well, so I'm sure there is lots that is wrong with this.
Attachments are stripped.
On Apr 2, 2005 5:52 PM, Ryan Rempel rgrempel@gmail.com wrote:
I've been working on this a bit, and thought I'd let you know what I've come up with.
[...]
On Apr 2, 2005 4:56 PM, Jamie Bliss astronouth7303@gmail.com wrote:
Attachments are stripped.
Ah, I hadn't thought of that. Here's a link to the diff I mentioned:
http://homepage.mac.com/ryanrempel/case.diff
I should add that it is a diff against version 1.4.0. (I haven't looked into version 1.5.)