Max showed me how to get the file page URL from the API. So if all we have is the image name, we can get the file page URL automagically. I attached a sample query to: https://trello.com/c/cXEMxGb3/8-5-retrieve-file-metadata-from-commonsmetadat...
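For anyone without access to the card, here is a minimal sketch of what such a lookup could look like. It is not the query attached on Trello. It assumes the standard MediaWiki `action=query` / `prop=imageinfo` / `iiprop=url` request, which returns a `descriptionurl` field for the file page. The function names and the default Commons endpoint are illustrative choices, not something from this thread.

```python
from urllib.parse import urlencode

# Assumption: Commons endpoint; a local wiki's api.php could be passed instead.
DEFAULT_API = "https://commons.wikimedia.org/w/api.php"


def file_page_query_url(filename, api=DEFAULT_API):
    """Build an imageinfo query URL for a bare image name (no 'File:' prefix)."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "iiprop": "url",
        "titles": "File:" + filename,
    }
    return api + "?" + urlencode(params)


def file_page_url_from_response(data):
    """Pull the first descriptionurl out of a decoded JSON response, or None."""
    pages = data.get("query", {}).get("pages", {})
    for page in pages.values():
        for info in page.get("imageinfo", []):
            if "descriptionurl" in info:
                return info["descriptionurl"]
    return None
```

Fetch the built URL with whatever HTTP client the app already uses, decode the JSON, and hand the result to `file_page_url_from_response`.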
On Fri, Dec 5, 2014 at 3:52 PM, Brion Vibber bvibber@wikimedia.org wrote:
Per request in the meeting, thought I'd stick it on the public list for reference. :)
As I recall there should be three possible URL formats for images embedded in <img> tags in wiki pages or returned as thumbnails via the API:
http(s)?://upload.wikimedia.org/(project)/(subdomain)/(hash1)/(hash2)/(base-filename)
^ original-size images
http(s)?://upload.wikimedia.org/(project)/(subdomain)/thumb/(hash1)/(hash2)/(base-filename)/(size)px(possible-other-options)-(base-filename)(.render-extension)?
^ thumbnails (note that on the live servers the "thumb" component comes before the hash directories)
http(s)?://upload.wikimedia.org/(project)/(subdomain)/thumb/(hash1)/(hash2)/(base-filename)/(size)px(possible-other-options)-thumbnail.(render-extension)
^ this last is used in cases where the filename is very very long and we can't actually prepend all the options to the filename (happens mostly in South Asian languages where UTF-8 is 3 bytes per letter)
- project: 'wikipedia' in all cases we need to handle; local files on Wiktionary etc. use their own project name, but we don't use these.
- subdomain: language ('en' etc.) for Wikipedias, subproject for special-case wikis like Commons ('commons')
- hash1: first hex digit of the MD5 hash of the filename (you don't need to use this here, consider it opaque)
- hash2: first 2 hex digits of the MD5 hash of the filename
- base-filename: the base filename -- you want this! This is the raw filename for files served at original size; thumbnails will use it as a directory component.
- render-extension: files other than PNG, GIF, and JPEG are rendered to one of those, usually PNG. So you'll see things like ".svg.png" at times -- but never ".png.png". These only appear on thumbnails.
- size: thumbnails are always given with the pixel size.
- possible-other-options: Note that other options may include a page number for PDF, DjVu, or TIFF files, or a time position for video thumbnails. To avoid parsing that stuff out, consider using the subdirectory base name on thumbnails if possible.
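As a rough illustration of the advice above, the following Python sketch pulls the base filename out of the directory component rather than from the rendered thumbnail name. It is an assumption-laden sketch, not production code: `parse_upload_url` and its return keys are made-up names, the pattern accepts the `thumb` component either before or after the hash directories instead of assuming one order, and the size extraction is a best-effort guess that takes the first `<digits>px` run in the thumbnail name.

```python
import re
from urllib.parse import unquote

# Hypothetical pattern; "thumb" is tolerated on either side of the hash dirs.
UPLOAD_URL = re.compile(
    r"^(?:https?:)?//upload\.wikimedia\.org"
    r"/(?P<project>[^/]+)/(?P<subdomain>[^/]+)"
    r"(?P<thumb_before>/thumb)?"
    r"/(?P<hash1>[0-9a-f])/(?P<hash2>[0-9a-f]{2})"
    r"(?P<thumb_after>/thumb)?"
    r"/(?P<base>[^/?#]+)"
    r"(?:/(?P<rendered>[^/?#]+))?"
    r"(?:[?#].*)?$"
)


def parse_upload_url(url):
    """Return a dict describing an upload.wikimedia.org image URL, or None."""
    match = UPLOAD_URL.match(url)
    if match is None:
        return None
    rendered = match.group("rendered")
    size = None
    if rendered is not None:
        # Best effort only: other options (page number, time offset) are ignored.
        size_match = re.search(r"(\d+)px", rendered)
        if size_match:
            size = int(size_match.group(1))
    return {
        "project": match.group("project"),
        "subdomain": match.group("subdomain"),
        # The directory component is the filename, even for "-thumbnail" names.
        "base_filename": unquote(match.group("base")),
        "is_thumbnail": bool(match.group("thumb_before") or match.group("thumb_after")),
        "size": size,
    }
```

Because the filename comes from the directory component, the very-long-filename case (`...px-thumbnail.png`) needs no special handling.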
-- brion
Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l