Hello,
I would like to change the separator character on my MediaWiki installation from the current underscore format to dot format.
I have read a previous exchange in which it was noted that the underscore is hard-coded into MediaWiki in many places, but I feel it would be worth the effort.
What are the places/files/directories where I should start looking? Thanks!
You'd have to look through most of the source code for MediaWiki, methinks. Big project. Why, if I may ask?
On May 13, 2006, at 1:09 PM, Seun Osewa wrote:
Hello,
I would like to change the separator character on my MediaWiki installation from the current underscore format to dot format.
I have read a previous exchange in which it was noted that the underscore is hard-coded into MediaWiki in many places, but I feel it would be worth the effort.
What are the places/files/directories where I should start looking? Thanks!
-- Seun Osewa http://www.nairaland.com _______________________________________________ MediaWiki-l mailing list MediaWiki-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/mediawiki-l
Well,
I'm launching a new site that I'll be developing for years or more, and I feel that if this project that'll take maybe a few hours in a weekend will affect my search engine rankings by even just 5%, it's worth getting it over with right now.
If the community will help me, then instead of just changing the underscores to dots I might even be able to create a patch which makes it possible for people to use any character as the separator character.
How many places do you think I'll need to modify? 5? 10? 20? 50?
Seun. http://www.nairaland.com/
On 5/13/06, Elliott F. Cable ecable@avxw.com wrote:
You'd have to look through most of the source code for MediaWiki, methinks. Big project. Why, if I may ask?
... How will this affect your search engine rankings? And you'd have to modify them ALL, every single one - if you missed one, it would break major things, would it not? The whole linking scheme, really.
I'd say 100+. I could be wrong, but in my experience hacking MediaWiki, everything is so tightly interwoven that a tiny thing you think would only appear once actually appears in 10 different places. Making this a variable wouldn't be a bad idea in some ways, but in others it wouldn't work - what about interwiki linking? Templates? Using dots in titles that are NOT spaces?
But again, how would this help with google?
On May 13, 2006, at 1:19 PM, Seun Osewa wrote:
Well,
I'm launching a new site that I'll be developing for years or more, and I feel that if this project that'll take maybe a few hours in a weekend will affect my search engine rankings by even just 5%, it's worth getting it over with right now.
If the community will help me, then instead of just changing the underscores to dots I might even be able to create a patch which makes it possible for people to use any character as the separator character.
How many places do you think I'll need to modify? 5? 10? 20? 50?
Seun. http://www.nairaland.com/
100? Ouch!
Google recognizes keywords in URLs that are separated by dots or hyphens, but words separated by underscores are viewed as one unit.
A Google rep even recommended hyphens over underscores: http://www.webmasterworld.com/forum3/23564.htm http://www.webmasterworld.com/forum3/4572.htm
The difference is really small, and if a project gets a lot of link love (like wikipedia) it probably wouldn't matter, I guess.
On 5/13/06, Elliott F. Cable ecable@avxw.com wrote:
... How will this affect your search engine rankings? And you'd have to modify them ALL, every single one - if you missed one, it would break major things, would it not? The whole linking scheme, really.
I'd say 100+. I could be wrong, but in my experience hacking MediaWiki, everything is so tightly interwoven that a tiny thing you think would only appear once actually appears in 10 different places. Making this a variable wouldn't be a bad idea in some ways, but in others it wouldn't work - what about interwiki linking? Templates? Using dots in titles that are NOT spaces?
But again, how would this help with google?
On 5/13/06, Seun Osewa seun.osewa@gmail.com wrote:
100? Ouch!
Google recognizes keywords in URLs that are separated by dots or hyphens, but words separated by underscores are viewed as one unit.
A Google rep even recommended hyphens over underscores: http://www.webmasterworld.com/forum3/23564.htm http://www.webmasterworld.com/forum3/4572.htm
The difference is really small, and if a project gets a lot of link love (like wikipedia) it probably wouldn't matter, I guess.
Two comments.
1) Both of those threads are quite old, and Google's algorithms don't stand still.
2) As one of those threads pointed out, we are talking about separators in the URLs here. MediaWiki turns spaces into underscores in the article URLs, but the page content will have the words separated by spaces, and presumably the page content counts more than the URL.
If you're still tempted to change "_" to "." in MediaWiki, think about this first:
* It would be virtually impossible to find the occurrences, because you would need to search for "_", which is used about a zillion times per file.
* If you did ever succeed in making the change, you would have to redo the work from scratch every time MediaWiki changed (and you couldn't guarantee there would be no new places to look each time), or you would be stuck forever on the release version you changed.
Don't think it's a good idea.
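To give a feel for the search problem described above, here is a rough sketch that compares the raw number of underscores in a piece of PHP source with the number of string literals that contain one. It uses PHP's built-in tokenizer; the function name and the sample line are made up for illustration, and a literal containing "_" is only a candidate to inspect, not proof that it is the title separator.

```php
<?php
// Count quoted string literals that contain an underscore.
// Hypothetical helper, not part of MediaWiki.
function countUnderscoreLiterals(string $source): int {
    $count = 0;
    foreach (token_get_all($source) as $token) {
        // Multi-character tokens are arrays of [id, text, line].
        if (is_array($token)
            && $token[0] === T_CONSTANT_ENCAPSED_STRING
            && strpos($token[1], '_') !== false) {
            $count++;
        }
    }
    return $count;
}

// Made-up sample line: underscores show up in identifiers far more
// often than in the one literal that actually matters.
$sample = '<?php $db_key = str_replace(" ", "_", $text); $wg_name = "x";';

echo substr_count($sample, '_'), "\n";        // raw underscores: 4
echo countUnderscoreLiterals($sample), "\n";  // literals with "_": 1
```

Even with this kind of filtering, every hit still has to be read by hand, and the work has to be repeated after each upgrade.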
I have to say it'd be cool if you'd integrate this, but I'd suggest dashes (-) instead of periods (.) - it'd make more sense to the end user and be more user-friendly. Perhaps just make it an option as you suggested: a configuration variable. Check with Brion about getting into the SVN team to commit your changes.
There's a very easy system that's used in the Simple Machines engine:
1) Use PHP output buffering to store the HTML output of the software and then run a regex on it to convert all underscores in internal urls to dots (or another separator character)
2) Write a short function in index.php to convert the request url dots back to underscores.
I haven't checked the code to see if output buffering is already being used. What do you think?
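The two steps above can be sketched roughly as follows. This is only an illustration of the idea, not code from Simple Machines or MediaWiki: the "/wiki/" prefix, the function names and the sample HTML are all assumptions, and the functions are kept pure so they can be tried without a wiki.

```php
<?php
// Assumption: internal page links look like href="/wiki/Some_Title".
const WIKI_PREFIX = '/wiki/';

// Step 1 (output side): rewrite underscores to dots, but only inside
// hrefs that point at internal pages.
function dotsInInternalLinks(string $html): string {
    return preg_replace_callback(
        '#href="(' . preg_quote(WIKI_PREFIX, '#') . '[^"]*)"#',
        function (array $m): string {
            return 'href="' . str_replace('_', '.', $m[1]) . '"';
        },
        $html
    );
}

// Step 2 (request side): turn a dotted request path back into the
// underscore form the rest of the software expects.
function requestPathToTitle(string $path): string {
    if (strncmp($path, WIKI_PREFIX, strlen(WIKI_PREFIX)) === 0) {
        $path = substr($path, strlen(WIKI_PREFIX));
    }
    return str_replace('.', '_', $path);
}

// On a live page, step 1 would be installed as an output-buffer
// callback, e.g. ob_start('dotsInInternalLinks').
$html = '<a href="/wiki/Main_Page">Main_Page</a> '
      . '<a href="http://example.org/a_b">ext</a>';
echo dotsInInternalLinks($html), "\n";
echo requestPathToTitle('/wiki/Main.Page'), "\n"; // Main_Page
```

Note that the request-side conversion is lossy: a title that really contains a dot (say "Version_1.0") would come back as "Version_1_0", which is the "dots in titles that are NOT spaces" concern raised earlier in the thread.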
On 5/14/06, Elliott F. Cable ecable@avxw.com wrote:
I have to say it'd be cool if you'd integrate this, but I'd suggest dashes (-) instead of periods (.) - it'd make more sense to the end user and be more user-friendly. Perhaps just make it an option as you suggested: a configuration variable. Check with Brion about getting into the SVN team to commit your changes.
Answer to Seun Osewa,
Use the GetLocalURL hook to change _ to - or whatever you want:
'GetLocalURL': modify local URLs as output into page links
$title: Title object of page
$url: string value as output (out parameter, can modify)
$query: query options passed to Title::getLocalURL()
For example, in an extension:
<?php
# Change URL: replace _ with -
$wgHooks['GetLocalURL'][] = 'underscoreToHyphen';

# MediaWiki invokes the hook as:
# wfRunHooks( 'GetLocalURL', array( &$this, &$url, $query ) );
function underscoreToHyphen($title, &$url, $query) {
    # $url is an out parameter, so it is taken by reference
    $url = str_replace("_", "-", $url);
    #wfDebug('extension/underscoreToHyphen : $title,$url='.$url.',$query='.$query."\n");
    return true; # let any other GetLocalURL handlers run
}
?>
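The hook description calls $url an "out parameter". As a small stand-alone illustration of what that means in plain PHP, the sketch below calls two hypothetical handlers directly, one taking $url by value and one by reference. The handler names and sample URL are made up, nothing here touches MediaWiki, and how a by-value handler behaves inside a particular MediaWiki and PHP version depends on how the hook runner passes its arguments, which this sketch does not model.

```php
<?php
// By value: the function works on a copy, the caller's string is untouched.
function rewriteByValue($title, $url, $query) {
    $url = str_replace('_', '-', $url);
    return true;
}

// By reference: the change is visible to the caller.
function rewriteByReference($title, &$url, $query) {
    $url = str_replace('_', '-', $url);
    return true;
}

$a = '/wiki/Main_Page';
rewriteByValue(null, $a, '');

$b = '/wiki/Main_Page';
rewriteByReference(null, $b, '');

echo $a, "\n"; // /wiki/Main_Page
echo $b, "\n"; // /wiki/Main-Page
```

Declaring the parameter as &$url makes the intent explicit and does not rely on any special behaviour of the caller.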
2006/5/14, Seun Osewa seun.osewa@gmail.com:
There's a very easy system that's used in the Simple Machines engine:
- Use PHP output buffering to store the HTML output of the software and then run a regex on it to convert all underscores in internal urls to dots (or another separator character)
- Write a short function in index.php to convert the request url dots back to underscores.
I haven't checked code to see if output buffering is already being used. What do you think?
SICK! Dude, thanks - didn't think of that.
On May 14, 2006, at 10:26 AM, iubito wrote:
<?php
# Change URL: replace _ with -
$wgHooks['GetLocalURL'][] = 'underscoreToHyphen';

# MediaWiki invokes the hook as:
# wfRunHooks( 'GetLocalURL', array( &$this, &$url, $query ) );
function underscoreToHyphen($title, &$url, $query) {
    # $url is an out parameter, so it is taken by reference
    $url = str_replace("_", "-", $url);
    #wfDebug('extension/underscoreToHyphen : $title,$url='.$url.',$query='.$query."\n");
    return true; # let any other GetLocalURL handlers run
}
?>
Wow. Thanks! It seems to be working perfectly with just this one change.
I'll keep testing. I'm very new to this code. Thanks again, iubito!
On 5/14/06, Seun Osewa seun.osewa@gmail.com wrote:
Wow. Thanks! It seems to be working perfectly with just this one change.
I'll keep testing. I'm very new to this code. Thanks again, iubito!
Perhaps after some testing you could wrap it into an extension and make a reference to it on meta.
Ok, I've noticed a problem. After making that change:
- Pages linking to new pages created with the new separator character always show the link in bright red, wrongly indicating that the pages have not been created.
- For all new pages created with the new separator character, whatlinkshere doesn't work.
I'm guessing there's another hard-coded underscore that has to do with reference counting that I'll need to change. Help!
On 5/15/06, Sy Ali sy1234@gmail.com wrote:
Perhaps after some testing you could wrap it into an extension and make a reference to it on meta.
On 15/05/06, Seun Osewa seun.osewa@gmail.com wrote:
Ok, I've noticed a problem. After making that change:
- Pages linking to new pages created with the new separator character always show the link in bright red, wrongly indicating that the pages have not been created.
- For all new pages created with the new separator character, whatlinkshere doesn't work.
I'm guessing there's another hard-coded underscore that has to do with reference counting that I'll need to change. Help!
Most of the code assumes underscores and spaces are interchangeable.
Rob Church
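One plausible reading of the red-link symptom, given that spaces and underscores are treated as the same thing but hyphens are not, can be shown with a deliberately simplified model. None of the functions below are MediaWiki's own; the names, the "/wiki/" prefix and the title are assumptions made for the sketch.

```php
<?php
// Simplified model: a title's storage key treats spaces and
// underscores as interchangeable (hypothetical, not MediaWiki code).
function titleKey(string $text): string {
    return str_replace(' ', '_', trim($text));
}

// Outgoing links are rewritten the way the hook does it: _ becomes -.
function outgoingUrl(string $key): string {
    return '/wiki/' . str_replace('_', '-', $key);
}

// Incoming requests are NOT mapped back, so the path is taken as typed.
function incomingKey(string $path): string {
    return titleKey(substr($path, strlen('/wiki/')));
}

$linked  = titleKey('New Page');   // key that a [[New Page]] link looks up
$url     = outgoingUrl($linked);   // URL the reader clicks
$created = incomingKey($url);      // key the new page is saved under

echo $linked, "\n";   // New_Page
echo $url, "\n";      // /wiki/New-Page
echo $created, "\n";  // New-Page
var_dump($linked === $created); // bool(false)
```

In this model the page is created under a key with a hyphen while links keep looking for the key with an underscore, so the link stays red and nothing appears to link to the new page. Mapping hyphens back on the way in would close the loop, but only by making genuine hyphens in titles indistinguishable from separators.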
Moin,
On Monday 15 May 2006 12:18, Rob Church wrote:
Most of the code assumes underscores and spaces are interchangeable.
And don't forget extensions as well as wiki2xml, which also makes this assumption and will thus break with your change.
So, the doctor says: Don't change the separator character 'cuz it hurts.
Best wishes,
Tels
mediawiki-l@lists.wikimedia.org