Perhaps. However, looking over the regex: /^(.+?)_*:_*(.*)$/S
Checking it with my regex tool, I notice that when it encounters a
_ that is not followed by a :, it ends up backtracking over it. Not to
mention that for each character it needs to check whether it's any
character, or an underscore followed by anything, and whether that's followed by a : .
I think that using a string function to find the first : in the string
(CPUs are best at simple incrementing, so that part costs nothing), and
then trimming, would be faster than using the regex.
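To make that concrete, here's a rough sketch of the find-and-trim idea in plain PHP. The function name splitTitle is mine for illustration, not an actual MediaWiki function, and the trim characters assume only underscores and spaces surround the colon, as the _* in the original regex suggests:

```php
<?php
// Hypothetical sketch: split a title on the first ':' with plain string
// functions instead of the /^(.+?)_*:_*(.*)$/S regex.
function splitTitle( $text ) {
    $pos = strpos( $text, ':' );
    if ( $pos === false ) {
        // No separator at all: the whole thing is the title.
        return array( null, $text );
    }
    // trim() strips the underscores/spaces the regex's _* would consume.
    $prefix = trim( substr( $text, 0, $pos ), '_ ' );
    $rest   = trim( substr( $text, $pos + 1 ), '_ ' );
    return array( $prefix, $rest );
}

list( $ns, $title ) = splitTitle( 'Help:_Contents' );
```

strpos/substr walk the string exactly once with no backtracking, which is the whole point of the comparison.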
Oh, yeah, one other thing to remember: with the new format of
normalization, the splitting should NOT trim whitespace the way the
current setup does. If it did, it would strip whitespace from the title
that someone's customized normalization may actually want to keep.
So an altered version of that regex to suit would be /^(.+?):(.*)$/S,
which is definitely nowhere near as efficient as a simple find-and-split.
I'll probably use list() and explode(), actually.
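The list()/explode() version might look something like this (splitTitleRaw is a name I'm making up for the sketch). Note the limit of 2 so only the first : splits, and no trimming, per the point above:

```php
<?php
// Sketch: split on the first ':' only and deliberately keep all
// surrounding whitespace intact for the normalization layer to handle.
function splitTitleRaw( $text ) {
    if ( strpos( $text, ':' ) === false ) {
        return array( null, $text );
    }
    // Limit of 2: everything after the first ':' stays in one piece,
    // even if the title itself contains more colons.
    list( $prefix, $rest ) = explode( ':', $text, 2 );
    return array( $prefix, $rest );
}

list( $ns, $title ) = splitTitleRaw( 'User talk: Foo' );
```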
On another note, I noticed something about the normalization. While : is
the standard separator, abstracting the normalization process like this
actually loosens the definition of what is what in a title, while
still keeping it stable. Honestly, if someone changed the methods used
to prefix things and altered the splitting sequence, they could
probably change MediaWiki to use something like :: as the separator
instead. With even more work, they could probably introduce a
special type of namespace in MediaWiki that uses a different kind
of prefix, or even restricts inclusion to only certain types of pages.
(Basically, a wiki like a card game wiki could force its package
redirects and card IDs into special namespaces dedicated to them.)
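As a purely hypothetical illustration of that abstraction: if the splitting sequence took the separator from one place instead of hard-coding :, switching a wiki to :: would be a one-line change. Neither $wgTitleSeparator nor splitOnSeparator exists in MediaWiki; they're just names for the sketch:

```php
<?php
// Hypothetical: the separator as a single point of configuration.
$wgTitleSeparator = '::';

// Generic split that works for a separator of any length, not just ':'.
function splitOnSeparator( $text, $sep ) {
    $pos = strpos( $text, $sep );
    if ( $pos === false ) {
        return array( null, $text );
    }
    return array(
        substr( $text, 0, $pos ),
        substr( $text, $pos + strlen( $sep ) )
    );
}

list( $ns, $title ) = splitOnSeparator( 'Card::Blue-Eyes', $wgTitleSeparator );
```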
Actually, in light of that, I might add another hook or two, or clean up
some of the title functions to properly abstract the prefixing to where
it belongs instead of scattering it all over the place.
~Daniel Friesen(Dantman) of:
-The Gaiapedia (http://gaia.wikia.com)
-Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
-and Wiki-Tools.com (http://wiki-tools.com)
Platonides wrote:
DanTMan wrote:
So to cut down on that, I'm going to try using normal string
functions to pull out the prefixes and trim them off. A strpos, substr,
and trim set together is much quicker than a full blown regex pattern match.
Not always. Remember that the PHP code surrounding those functions is
interpreted, while the regex call is run as compiled code.
I think some of the sysadmins remarked that the use of regex *improved*
the performance.
I'm not saying you shouldn't change it to traditional means, just that
the timing should be checked to be sure it's not slower.
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l