Hi all,

We have just deployed a new URL format for MediaViewer [0]. I am submitting it here for comments, and for the benefit of people who have to do something similar in other contexts.

MediaViewer stores the name of the image in the hash part of the URL so one can share links to a page with a specific image open in the lightbox. (We considered using the History API [1] to change the path or the query part, but that degrades poorly.) I have looked at three options:
  1. Just put the file name as-is (with spaces replaced by underscores) in the URL fragment part.
    Pro: readable file names in URLs, easy to generate.
    Con: technically not a valid URI. [2] (It would be a valid IRI, probably, but browser support for that is not so great, so non-ASCII bytes might get encoded in unexpected ways.) Creates nasty usability and security issues (injection vulnerabilities, RTL characters, characters which break autolinking). Would make it very hard to introduce more complex URL formats later, as file names can contain pretty much any character.
  2. Use percent encoding (with underscores for spaces).
    Pro: this is the standard way of encoding fragments. [2][3] Always results in a valid URI. Readable file names in Firefox. Easy to generate on-wiki (e.g. with {{urlencode}}).
    Con: non-Latin file names look horrible in any browser that's not Firefox.
  3. Use MediaWiki anchor encoding (like percent encoding, but use a dot instead of a percent sign).
    Pro: links can be generated in wikitext very conveniently, using the [[#...]] syntax.
    Con: the way MediaWiki does anchor encoding is intrinsically broken. The dot itself does not get encoded, but it does get decoded when followed by suitable characters, so file names cannot be round-tripped safely. That rules this option out.
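
To make the round-trip problem in option 3 concrete, here is a minimal sketch in Python. It is a simplified model of dot-based encoding written for illustration, not MediaWiki's actual anchor encoding code; the helper names dot_encode and dot_decode are made up for this example.

```python
import re
from urllib.parse import quote, unquote


def dot_encode(name: str) -> str:
    """Simplified model: percent-encode, then use '.' in place of '%'.

    A literal '.' in the name is left untouched, which is the root
    of the ambiguity.
    """
    return quote(name.replace(" ", "_"), safe="").replace("%", ".")


def dot_decode(encoded: str) -> str:
    """Simplified model: treat every '.XX' (two hex digits) as an escape."""
    as_percent = re.sub(r"\.([0-9A-Fa-f]{2})", r"%\1", encoded)
    return unquote(as_percent)


name = "Foo.41.jpg"

# Dot encoding: '.41' is indistinguishable from an encoded 'A'.
print(dot_decode(dot_encode(name)))

# Percent encoding: a literal '%' is itself escaped, so nothing is ambiguous.
print(unquote(quote(name, safe="")))
```

In this model the first call turns "Foo.41.jpg" into "FooA.jpg", while the percent-encoded version comes back unchanged.
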
We went with option 2, so URLs look like this:

https://www.mediawiki.org/wiki/Lightbox_demo#mediaviewer/File:Swallow_flying_drinking.jpg

https://www.mediawiki.org/wiki/Lightbox_demo#mediaviewer/File:%E0%AE%85%E0%AE%A3%E0%AE%BF%E0%AE%B2%E0%AF%8D-3-%E0%AE%A4%E0%AF%86%E0%AE%A9%E0%AF%8D%E0%AE%A9%E0%AF%88%E0%AE%AF%E0%AE%BF%E0%AE%A9%E0%AF%8D_%E0%AE%B5%E0%AE%B3%E0%AE%B0%E0%AF%8D%E0%AE%A8%E0%AE%BF%E0%AE%B2%E0%AF%88.jpg
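
For anyone generating such links outside the browser, a rough sketch of the encoding step in Python follows. The "mediaviewer/File:" prefix is taken from the example URLs above; the function name is hypothetical, and this is not MediaViewer's own code, so the exact set of characters it leaves unescaped may differ.

```python
from urllib.parse import quote

# Prefix as seen in the example URLs above.
PREFIX = "mediaviewer/File:"


def mediaviewer_fragment(file_name: str) -> str:
    """Build the URL fragment for a file name given without the 'File:' prefix.

    Spaces become underscores; everything except unreserved characters
    (letters, digits, '_', '.', '-', '~') is percent-encoded as UTF-8.
    """
    return PREFIX + quote(file_name.replace(" ", "_"), safe="")


base = "https://www.mediawiki.org/wiki/Lightbox_demo"
print(base + "#" + mediaviewer_fragment("Swallow flying drinking.jpg"))
```

For the first example this reproduces the fragment shown above, since the name contains only ASCII letters, spaces and a dot.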

One issue that we ran into is that window.location.hash is unreliable for percent-encoded hashes: Firefox returns the fragment already decoded [4]. That is easy to avoid once you know about it, for example by taking the fragment from window.location.href instead. Other than that, it seems to work reliably.
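
The general rule when reading the link back is to start from the raw, still-encoded fragment and decode it exactly once. Here is a minimal sketch of that idea in Python (kept in Python purely for illustration; in the browser the analogous step would be to take everything after the first "#" in window.location.href). The function name is hypothetical and the prefix is assumed from the example URLs above.

```python
from typing import Optional
from urllib.parse import unquote, urlsplit

# Prefix as seen in the example URLs above.
PREFIX = "mediaviewer/File:"


def file_name_from_url(url: str) -> Optional[str]:
    """Return the file name stored in the fragment, or None if absent."""
    # urlsplit() leaves the fragment percent-encoded.
    fragment = urlsplit(url).fragment
    if not fragment.startswith(PREFIX):
        return None
    # Decode exactly once: a second pass would corrupt names that
    # contain a literal '%' followed by two hex digits.
    return unquote(fragment[len(PREFIX):])


url = "https://www.mediawiki.org/wiki/Lightbox_demo#mediaviewer/File:100%2541.jpg"
print(file_name_from_url(url))
```

Here the file name comes back as "100%41.jpg"; decoding a second time would wrongly yield "100A.jpg".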


[0] https://www.mediawiki.org/wiki/Multimedia/Media_Viewer
[1] http://diveintohtml5.info/history.html
[2] http://tools.ietf.org/html/rfc3986#section-3.5
[3] https://tools.ietf.org/html/rfc3987#section-3.1
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=483304