Re: [Wikitech-l] How to decode URL to article title?

27 Apr 2007

Reid Priedhorsky wrote:
...

 OK, to summarize what this thread seems to be saying, the procedure is:

 1. Split the title from the rest of the URL.

 2. Percent-decode the title, yielding a UTF-8 byte string.

 3. Convert the byte string into a Unicode string.

 4. Replace underscores with spaces.

 Step 4 yields the article title, which is what appears in the XML dumps.

Hi folks,

Thanks again for your help. Here's the Python code I came up with, in 
the hopes that it might be useful for others.

   import urllib
   # title: the percent-encoded title part of the URL (a Python 2 byte string)
   urllib.unquote(title).decode('UTF-8').replace('_', ' ')
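
For anyone reading this on Python 3: urllib.unquote moved to urllib.parse.unquote, which decodes the percent-escapes as UTF-8 and returns a Unicode string directly, so steps 2 and 3 collapse into one call. Below is a rough sketch of all four steps. It assumes article URLs have the form https://<host>/wiki/<Title>; the function name url_to_title and the WIKI_PREFIX constant are made up for illustration, and wikis configured with a different article path would need a different prefix.

```python
from urllib.parse import unquote, urlsplit

# Assumption: articles live under /wiki/ (hypothetical constant, adjust per wiki).
WIKI_PREFIX = "/wiki/"


def url_to_title(url):
    """Turn an article URL into the title form used in the XML dumps."""
    # Step 1: split the title from the rest of the URL.
    # urlsplit() separates out the query string and fragment without decoding.
    path = urlsplit(url).path
    if not path.startswith(WIKI_PREFIX):
        raise ValueError("not an article URL: %r" % url)
    raw_title = path[len(WIKI_PREFIX):]

    # Steps 2 and 3: percent-decode and interpret the bytes as UTF-8.
    # errors="strict" raises UnicodeDecodeError on invalid UTF-8 instead of
    # silently substituting replacement characters.
    decoded = unquote(raw_title, encoding="utf-8", errors="strict")

    # Step 4: replace underscores with spaces.
    return decoded.replace("_", " ")
```

Calling url_to_title("https://en.wikipedia.org/wiki/Caf%C3%A9_au_lait") should give the title "Café au lait". Note that this only undoes the URL encoding; it does not attempt any other title normalization.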

Reid