Per request in the meeting, thought I'd stick this on the public list for reference. :)
As I recall there should be three possible URL formats for images embedded in <img> tags in wiki pages or returned as thumbnails via the API:

http(s)?://upload.wikimedia.org/(project)/(subdomain)/(hash1)/(hash2)/(base-filename)
^ original-size images

http(s)?://upload.wikimedia.org/(project)/(subdomain)/thumb/(hash1)/(hash2)/(base-filename)/(size)px(possible-other-options)-(base-filename)(.render-extension)?
^ thumbnails

http(s)?://upload.wikimedia.org/(project)/(subdomain)/thumb/(hash1)/(hash2)/(base-filename)/(size)px(possible-other-options)-thumbnail.(render-extension)
^ this last is used in cases where the filename is very very long and we can't actually prepend all the options to the filename (happens mostly in South Asian languages, where UTF-8 takes 3 bytes per letter)

* project: 'wikipedia' in all cases we need to handle; local files on Wiktionary etc will have it separate but we don't use these.
* subdomain: language 'en' etc for Wikipedias, subproject for special-case wikis like Commons/'commons'
* hash1: first digit of md5 hash of the filename (you don't need to use this here, consider it opaque)
* hash2: first 2 digits of md5 hash of the filename
* base-filename: the base filename -- you want this! This is the raw filename for files served at original size; thumbnails will use it as a directory component.
* render-extension: files other than PNG, GIF, and JPEG are rendered to one of those, usually PNG. So you'll see things like ".svg.png" at times -- but never ".png.png". These only appear on thumbnails.
* size: thumbnails are always given with the pixel size.
* possible-other-options: Note that other options may include a page number for PDF, DjVu, or TIFF files, or a time position for video thumbnails. To avoid parsing that stuff out, consider using the subdirectory base name on thumbnails if possible.
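
For what it's worth, here is a rough Python sketch of pulling the base filename out of these URLs by reading the directory component rather than parsing the thumbnail name. The function name parse_upload_url, the regex group names, and the returned dict keys are all made up for illustration. The pattern accepts the thumb/ component on either side of the hash directories, since I'm not certain every URL in the wild uses the same placement, and the size lookup is a best-effort search for "(digits)px-" in the thumbnail name.

```python
import re
from urllib.parse import unquote, urlsplit

# Hypothetical pattern for the three URL shapes described above.
# "thumb" is tolerated either before or after the hash directories.
_PATH_RE = re.compile(
    r"^/(?P<project>[^/]+)/(?P<subdomain>[^/]+)"
    r"(?P<thumb_before>/thumb)?"
    r"/(?P<hash1>[0-9a-f])/(?P<hash2>[0-9a-f]{2})"
    r"(?P<thumb_after>/thumb)?"
    r"/(?P<base>[^/]+)"
    r"(?:/(?P<thumb_name>[^/]+))?$"
)


def parse_upload_url(url):
    """Split an image URL into components, or return None if it doesn't fit."""
    m = _PATH_RE.match(urlsplit(url).path)
    if m is None:
        return None

    is_thumb = bool(m.group("thumb_before") or m.group("thumb_after"))
    thumb_name = m.group("thumb_name")
    # A thumbnail must have a trailing name component; an original must not.
    if is_thumb != (thumb_name is not None):
        return None

    size = None
    if thumb_name is not None:
        # Best effort: take the first "<digits>px-" in the thumbnail name
        # and ignore any page/time options around it.
        size_match = re.search(r"(\d+)px-", unquote(thumb_name))
        size = int(size_match.group(1)) if size_match else None

    return {
        "project": m.group("project"),
        "subdomain": m.group("subdomain"),
        # The directory component is the base filename for thumbnails,
        # and the last path component for original-size images.
        "base_filename": unquote(m.group("base")),
        "is_thumbnail": is_thumb,
        "size": size,
    }


if __name__ == "__main__":
    info = parse_upload_url(
        "https://upload.wikimedia.org/wikipedia/commons/thumb/a/a9/"
        "Example.jpg/120px-Example.jpg"
    )
    print(info["base_filename"], info["size"])
```

Since the base filename always comes from the directory component on thumbnails, the long-filename "thumbnail.(render-extension)" form and any extra options need no special handling.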
-- brion