Re: [Wikitech-l] How to decode URL to article title?

26 Apr 2007

On 4/25/07, Neil Harris <usenet(a)tonal.clara.co.uk> wrote:
...
 To reverse the process, first percent-decode the URL as needed, then
 decode the resulting UTF-8 byte string into Unicode.

 For example,

 Fabry-P%C3%A9rot_interferometer

 decodes to

 Fabry-Pérot interferometer

 ...since %C3%A9 decodes to the two bytes 0xC3 0xA9, which is the UTF-8
 encoding of Unicode code point U+00E9, which encodes the character "é". 
That second step is unnecessary if you're using a language like
PHP 1-5 that's encoding-agnostic.  The URL will decode to bytes that
can be output directly to a UTF-8-encoded page or stream, where they'll
display correctly.  The conversion step is only potentially useful in a
language that distinguishes between Unicode and binary strings, and
even there it isn't strictly necessary.  The one thing to be sure of is
that whatever you pass the result to, or process it with, will
interpret it as UTF-8, if that distinction is relevant (which it
probably is if the display name is what's desired).
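
As a rough illustration of that distinction, here is what the two steps
look like in a language that does separate byte strings from Unicode
strings.  Python 3 is used purely as an example; it is not something
the thread itself mentions.

```python
from urllib.parse import unquote_to_bytes

# Step 1: percent-decode only. The result is a byte string, so
# %C3%A9 has become the two raw bytes 0xC3 0xA9.
raw = unquote_to_bytes("Fabry-P%C3%A9rot_interferometer")

# Step 2: interpret those bytes as UTF-8 to get a Unicode string.
# This is the step an encoding-agnostic language never needs to do.
text = raw.decode("utf-8")
```

If the bytes are only going to be written straight to a UTF-8 stream,
step 1 alone is enough; step 2 matters when the program itself needs to
treat the title as text.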

Basically, yes, it's standard urldecode() followed by replacement of
underscores with spaces.
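
For anyone not working in PHP, the same recipe can be sketched in
Python 3.  The function name url_to_title is made up for this example,
and the input is assumed to be just the title part of the URL, not the
whole address.

```python
from urllib.parse import unquote


def url_to_title(encoded_title: str) -> str:
    """Turn a percent-encoded title fragment into a display title."""
    # unquote() percent-decodes and interprets the bytes as UTF-8 by
    # default. Unlike PHP's urldecode(), it leaves '+' untouched.
    decoded = unquote(encoded_title)
    # Underscores in the URL stand for spaces in the title.
    return decoded.replace("_", " ")


title = url_to_title("Fabry-P%C3%A9rot_interferometer")
```

With the example from the quoted message, title comes out as
"Fabry-Pérot interferometer".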
