External links in pages are not being handled properly. The external hostname in the URL is removed when the page is rendered into HTML, so an erroneous URL is substituted in. Has anyone had this problem before and, if so, how was it fixed?
That seems very strange - can you send us a snippet from a page (the wikitext) that's causing the problem?
-- Jim R. Wilson (jimbojw)
On 3/29/07, Chuck Harding charding@llnl.gov wrote:
External links in pages are not being handled properly. The external hostname in the url is being removed when the page is rendered into html, thus causing an erroneous URL to be substituted in. Has anyone had this problem before and if so, how was it fixed?
-- Charles D. (Chuck) Harding charding@llnl.gov Voice: 925-423-8879 Senior Computer Associate LC Operations Fax: 925-423-6961 B453 R2253/2131 Pager: 00500 Ops Room: 925-422-3743 Ops: 925-422-0484 Lawrence Livermore National Laboratory Computation Directorate Livermore, CA USA http://www.llnl.gov GPG Public Key ID: B9EB6601 -------------------- http://tinyurl.com/5w5ey ------------------------- -- Profanity is the one language all programmers know best. --
MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/mediawiki-l
On Thu, 29 Mar 2007, Jim Wilson wrote:
That seems very strange - can you send us a snippet from a page (the wikitext) that's causing the problem?
-- Jim R. Wilson (jimbojw)
=== How to edit a Wiki page ===
: A useful link on How to edit a Wiki page can be found [http://en.wikipedia.org/wiki/Wikipedia:How_to_edit_a_page here]
gets rendered as:
<a name="How_to_edit_a_Wiki_page"></a><h3> <span class="mw-headline"> How to edit a Wiki page </span></h3>
<dl><dd> A useful link on How to edit a Wiki page can be found <a href="http:/wiki/Wikipedia:How_to_edit_a_page" class="external text" title="http:/wiki/Wikipedia:How_to_edit_a_page" rel="nofollow">here</a> </dd></dl>
Notice that en.wikipedia.org is left out of the URL. The weird thing is that when I click on the link on the rendered page, it actually does take me to a Wikipedia article, but not the one on how to edit a page; it takes me to the article on Wiki. Just to make sure it wasn't anything weird about the URLs, here are some other messed-up links:
==== Favorite links ====
:* [http://www.harding-family.org My family web pages]
:* [http://k6ckt.home.comcast.net My Ham Radio web page]
:* [http://nimrodvideo.home.comcast.net My Video/Multimedia Production Company web page]
:* [http://kofc4588.home.comcast.net Knights of Columbus Fr. Patric Power Council #4588 web site]
:* [http://www.arrl.org American Radio Relay League]
:* [http://www.hello-radio.org Hello Ham Radio]
:* [http://www.emergency-radio.org/ Emergancy Radio]
:* [http://www.livermoreark.org '''L'''ivermore '''A'''mateur '''R'''adio '''K'''lub]
and the resulting HTML:
<a name="Favorite_links"></a><h4><span class="editsection">[<a href="/wiki/index.php?title=User:Charding&action=edit&section=1" title="Edit section: Favorite links">edit</a>]</span> <span class="mw-headline"> Favorite links </span></h4>
<dl><dd><ul><li> <a href="http:" class="external text" title="http:" rel="nofollow">My family web pages</a>
</li><li> <a href="http:" class="external text" title="http:" rel="nofollow">My Ham Radio web page</a>
</li><li> <a href="http:" class="external text" title="http:" rel="nofollow">My Video/Multimedia Production Company web page</a>
</li><li> <a href="http:" class="external text" title="http:" rel="nofollow">Knights of Columbus Fr. Patric Power Council #4588 web site</a>
</li><li> <a href="http:" class="external text" title="http:" rel="nofollow">American Radio Relay League</a>
</li><li> <a href="http:" class="external text" title="http:" rel="nofollow">Hello Ham Radio</a>
</li><li> <a href="http:/" class="external text" title="http:/" rel="nofollow">Emergancy Radio</a>
</li><li> <a href="http:" class="external text" title="http:" rel="nofollow"><b>L</b>ivermore <b>A</b>mateur <b>R</b>adio <b>K</b>lub</a>
</li></ul>
</dd></dl>
Even though I upgraded MediaWiki from 1.6.7 to 1.9.3, the rest of the supporting packages were not touched - Apache, PHP, and MySQL are all the same.
* MediaWiki: 1.9.3
* PHP: 5.1.2 (apache2handler)
* MySQL: 5.0.18
# httpd -v
Server version: Apache/2.2.2
Server built: Jun 29 2006 12:13:15
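For anyone who wants to check how widespread the damage is, the symptom in the HTML above (anchors with class "external" whose href has no hostname) can be detected mechanically. The following is only a rough sketch in Python using the standard library; the class name `ExternalLinkChecker`, the function `find_broken_external_links`, and the shortened `sample` string are made up for illustration and are not part of MediaWiki.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ExternalLinkChecker(HTMLParser):
    """Collect hrefs of class="external ..." anchors that have no hostname."""

    def __init__(self):
        super().__init__()
        self.broken = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        href = attr.get("href") or ""
        # The rendered pages in this thread mark external links with the
        # "external" class; a healthy one has a non-empty network location.
        if "external" in classes and not urlsplit(href).netloc:
            self.broken.append(href)


def find_broken_external_links(html):
    """Return the hrefs of external links that lost their hostname."""
    checker = ExternalLinkChecker()
    checker.feed(html)
    return checker.broken


# Shortened, hypothetical sample modelled on the rendered HTML in this thread.
sample = (
    '<a href="http:/wiki/Wikipedia:How_to_edit_a_page" class="external text">here</a>'
    '<a href="http:" class="external text">My family web pages</a>'
    '<a href="http://www.arrl.org" class="external text">American Radio Relay League</a>'
)

print(find_broken_external_links(sample))
```

Feeding it the saved HTML of a rendered page should list every external link that came out without a hostname, while intact links such as the last one in the sample are ignored.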
An additional data point: one of our users discovered that doubling the '/'s in the URL makes the link come out correctly, e.g. http:////www.harding-family.org. It looks ugly, but it works. I hope this provides more information so that this can be tracked down. I'd hate to have to edit every page that has an external link on it to change http(s)://whatever to http(s):////whatever....
After further debugging I have discovered that commenting out line 1251 in Sanitizer.php, in the function cleanURL, makes the problem go away. That line is a call to preg_replace that attempts to strip invalid UTF-8 characters from a URL. For some reason the version of PCRE that I compiled with UTF-8 support enabled is not being used, even though I recompiled PHP to use it. So this is really a PHP problem and not a MediaWiki problem. So off to the PHP support forums....
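To illustrate why a failing replace call would produce exactly the URLs seen earlier in the thread, here is a small model in Python. It is a sketch under assumptions, not MediaWiki's code: it assumes the URL is split into a protocol, an optional //host part, and the rest, and that only the host part is passed through the filter. The names `URL_PARTS`, `clean_url`, `working_filter`, and `failing_filter` are hypothetical, and returning `None` stands in for a replace call that errors out.

```python
import re

# Assumed split of a URL: protocol, optional //host, and everything after it.
URL_PARTS = re.compile(r"^([^:]+:)(//[^/]+)?(.*)$", re.IGNORECASE)


def clean_url(url, host_filter):
    """Rebuild url after passing only its //host part through host_filter."""
    match = URL_PARTS.match(url)
    if not match:
        return url
    protocol = match.group(1)
    host = match.group(2) or ""
    rest = match.group(3)
    # A failed filter is modelled as None and treated as an empty string,
    # so the host silently disappears from the rebuilt URL.
    host = host_filter(host) or ""
    return protocol + host + rest


def working_filter(host):
    """Stand-in for a replace call that succeeds and strips nothing."""
    return host


def failing_filter(host):
    """Stand-in for a replace call that fails and returns nothing."""
    return None


url = "http://en.wikipedia.org/wiki/Wikipedia:How_to_edit_a_page"
print(clean_url(url, working_filter))  # unchanged
print(clean_url(url, failing_filter))  # http:/wiki/Wikipedia:How_to_edit_a_page
print(clean_url("http://www.harding-family.org", failing_filter))    # http:
print(clean_url("http://www.emergency-radio.org/", failing_filter))  # http:/
# With doubled slashes the optional //host group matches nothing, so the
# whole remainder lands in "rest" and survives even when the filter fails.
print(clean_url("http:////www.harding-family.org", failing_filter))
```

Under these assumptions the model reproduces both the broken hrefs and the doubled-slash workaround reported earlier, which is consistent with the host-cleaning step being the one that fails.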