Re: [Wikitech-l] URLs that aren't cool...

28 Jul 2009


On Tue, Jul 28, 2009 at 11:53 AM, Paul Houle <paul@ontology2.com> wrote:
> ...
> I've been looking at the id structure of dbpedia and wikipedia and
> finally found an example where case sensitivity issues really bite.

We should keep in mind that case isn't so clear-cut if you move away
from English, though -- is "groß" the same as "GROSS" and thus the
same as "gross"?  How about languages that don't even have bijections
between uppercase and lowercase if you stick to the same dialect?
(I'm pretty sure there are some; don't some languages strip diacritics
from uppercase letters?)  There's probably some Unicode standard on
normalization with respect to case, but it's not actually so simple in
an international context.
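
To make the "groß" question concrete, here is a minimal sketch in Python
(not MediaWiki code).  It relies on Python's str.casefold(), which applies
Unicode full case folding; the helper name fold_key is made up for this
example, and treating the folded form as "the same title" is an assumption,
not a recommendation from any standard.

```python
import unicodedata


def fold_key(text: str) -> str:
    """Hypothetical comparison key: Unicode full case folding, then NFC."""
    return unicodedata.normalize("NFC", text.casefold())


# Upper-casing "groß" gives "GROSS", and lower-casing that gives "gross",
# so upper/lower is not a round trip for this word.
assert "groß".upper() == "GROSS"
assert "GROSS".lower() == "gross"
assert "groß".upper().lower() != "groß"

# Full case folding maps all three spellings to a single key.
assert fold_key("groß") == fold_key("GROSS") == fold_key("gross") == "gross"
```

Default folding like this is locale-independent, so it does not cover
language-specific rules such as the Turkish dotted and dotless i.
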

That said, I think case-insensitivity would be a good thing to support
in the long run, optionally, and that it would probably be suitable
for all Wikipedias.  Or at least almost all, if there are languages
out there where case insensitivity is a real headache -- hopefully
not, since most languages don't have letter case at all.  At any rate
it would be good on enwiki.

But it would require a lot of tedious and error-prone conversion of
old code.  Everything tends to assume that a)
$title->getPrefixedText() is what should be displayed to the user, but
b) two titles are equal if and only if their
$title->getPrefixedText()s are equal.  Likewise for
$title->getPrefixedDbKey().  Those would need to be systematically and
thoroughly fixed.  We'd also have to add a field to the page table or
such to store the normalized form of the title, and fiddle with the
indexes appropriately, and update all other tables to use the
normalized form.  A lot of work.

(But at least we could get rid of the silly Text/DbKey distinction
while we're doing this.  I've heard recent MySQL versions actually
support storage of ASCII space characters in text fields!)
