Hi,
I've been working on the Wikipedia data overlay on our site, multimap.com. One of the things we'd really love to do is add thumbnail images to complement the summary text for an article that we display in our infoboxes. However, we're finding it difficult to programmatically generate the URLs to specific images from the wiki tag.
For example, how do I get from:
[[Image:Asperger kl2.jpg|thumb|[[Hans Asperger]] to http://upload.wikimedia.org/wikipedia/en/5/57/Asperger_kl2.jpg
My understanding of the URL structure is that thumbnails can be requested from either:
http://upload.wikimedia.org/wikipedia/commons/thumb/X/YY/IMAGENAME/SIZEpx-IM... or http://upload.wikimedia.org/wikipedia/en/thumb/X/YY/IMAGENAME/SIZEpx-IMAGENA...
where X is the first character of the md5 hash of the IMAGENAME, YY are the first two characters of the same hash, IMAGENAME is the image name and SIZE is the size you want in pixels.
The issue is with the /en/ or /commons/ part, since it seems it can be either, and I can't see a way to know which without actually requesting the image to see if it exists.
Is there a single thumbnail URL I can use to get an image no matter if it exists in commons or not? I noticed there's a thumb.php available too but again it doesn't seem to be a single URL endpoint for all images.
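A minimal Python sketch of the URL scheme described above, not a definitive implementation. It assumes the hash is taken over the image name with spaces replaced by underscores and the first letter capitalised (as in the Asperger_kl2.jpg URL), that the truncated last path component is SIZEpx- followed by the same name, and that the caller already knows the repository (en or commons). The name thumb_url is made up for illustration.

```python
import hashlib
from urllib.parse import quote

UPLOAD_BASE = "http://upload.wikimedia.org/wikipedia"


def thumb_url(image_name: str, size: int, repo: str = "commons") -> str:
    """Build a thumbnail URL of the form .../thumb/X/YY/IMAGENAME/SIZEpx-IMAGENAME."""
    # Assumption: stored names use underscores and an upper-case first letter.
    name = image_name.strip().replace(" ", "_")
    name = name[:1].upper() + name[1:]
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    x, yy = digest[0], digest[:2]  # first character, first two characters
    encoded = quote(name)  # percent-encode non-ASCII and reserved characters
    return f"{UPLOAD_BASE}/{repo}/thumb/{x}/{yy}/{encoded}/{size}px-{encoded}"


print(thumb_url("Asperger kl2.jpg", 120, repo="en"))
```

The repo argument is exactly the part that cannot be derived from the name alone, which is the problem raised in this message.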
Many many thanks for any help.
Regards, Colm
Colm McMullan, Software Engineer, Multimap
Colm McMullan wrote:
I've been working on the Wikipedia data overlay on our site, multimap.com. One of the things we'd really love to do is add thumbnail images to complement the summary text for an article that we display in our infoboxes. However, we're finding it difficult to programmatically generate the URLs to specific images from the wiki tag.
For example, how do I get from:
[[Image:Asperger kl2.jpg|thumb|[[Hans Asperger]] to http://upload.wikimedia.org/wikipedia/en/5/57/Asperger_kl2.jpg
Special:Filepath/Asperger kl2.jpg is what you're looking for; it returns an HTTP redirect with a Location: header. Apply the regexp preg_match("/Location: (.*)\r\n/i",$html,$return); to the HTTP response headers, and $return[1] will contain the URI of the image. I would also be careful with the thumbnails: only those which are actually used on wiki(m|p)edia are present, AFAIR.
Marco
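For readers not working in PHP, the header extraction suggested above could look roughly like this in Python. This is only a sketch: the function name extract_location and the sample response are made up for illustration, and the raw header block is assumed to be available as a string.

```python
import re
from typing import Optional

# Case-insensitive match for a "Location:" header line in a raw header block.
LOCATION_RE = re.compile(r"^Location:[ \t]*(.*?)\r?$", re.IGNORECASE | re.MULTILINE)


def extract_location(raw_headers: str) -> Optional[str]:
    """Return the value of the Location header, or None if there is none."""
    match = LOCATION_RE.search(raw_headers)
    return match.group(1).strip() if match else None


# Hypothetical redirect response, for illustration only.
sample = (
    "HTTP/1.1 302 Found\r\n"
    "Location: http://upload.wikimedia.org/wikipedia/en/5/57/Asperger_kl2.jpg\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "\r\n"
)
print(extract_location(sample))
```

Returning None when no Location header is present lets the caller tell a missing image (for example a 404 response) apart from a successful redirect.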
Marco Schuster <marco@...> writes:
Colm McMullan wrote:
For example, how do i get from:
[[Image:Asperger kl2.jpg|thumb|[[Hans Asperger]] to http://upload.wikimedia.org/wikipedia/en/5/57/Asperger_kl2.jpg
Special:Filepath/Asperger kl2.jpg is what you're looking for; it returns an HTTP redirect with a Location: header. Apply the regexp preg_match("/Location: (.*)\r\n/i",$html,$return); to the HTTP response headers, and $return[1] will contain the URI of the image.
Marco
Marco,
Thanks for getting back to me so quickly, but that doesn't seem to be working for me. Am I using it incorrectly?
http://commons.wikimedia.org/w/index.php?title=Special%3AFilepath&file=Asperger+kl2.jpg

GET /w/index.php?title=Special%3AFilepath&file=Asperger+kl2.jpg HTTP/1.1
Host: commons.wikimedia.org
....
Accept-Language: en,en-us;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://commons.wikimedia.org/w/index.php?title=Special%3AFilepath&file=Asperger+kl2.jpg

HTTP/1.x 404 Not Found
Date: Thu, 07 Feb 2008 23:59:37 GMT
Server: Apache
X-Powered-By: PHP/5.1.4
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
Content-Language: en
Vary: Accept-Encoding,Cookie
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Encoding: gzip
Content-Length: 3433
Content-Type: text/html; charset=utf-8
X-Cache: MISS from sq25.wikimedia.org, MISS from knsq2.knams.wikimedia.org, MISS from knsq4.knams.wikimedia.org
X-Cache-Lookup: MISS from sq25.wikimedia.org:3128, MISS from knsq2.knams.wikimedia.org:3128, MISS from knsq4.knams.wikimedia.org:80
Via: 1.0 sq25.wikimedia.org:3128 (squid/2.6.STABLE18), 1.0 knsq2.knams.wikimedia.org:3128 (squid/2.6.STABLE18), 1.0 knsq4.knams.wikimedia.org:80 (squid/2.6.STABLE18)
Connection: close
Colm McMullan wrote:
Thanks for getting back to me so quickly, but that doesn't seem to be working for me. Am I using it incorrectly?
http://commons.wikimedia.org/w/index.php?title=Special%3AFilepath&file=Asperger+kl2.jpg
HTTP/1.x 404 Not Found
That's because it's not on Commons. Try http://en.wikipedia.org/w/index.php?title=Special%3AFilepath&file=Asperg... Asking for images at en: will return results for any image that can be shown on en:, i.e. images from both enwiki and Commons.
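As a rough illustration of the suggestion above, the Special:Filepath query URL can be built against en.wikipedia.org and the redirect target read without following it. The function names filepath_url and resolve_image_url are made up for this sketch, and resolve_image_url performs a real network request, so it is defined here but not called.

```python
import http.client
from typing import Optional
from urllib.parse import urlencode


def filepath_url(image_name: str, host: str = "en.wikipedia.org") -> str:
    """Build the index.php?title=Special:Filepath&file=... URL for an image."""
    query = urlencode({"title": "Special:Filepath", "file": image_name})
    return f"http://{host}/w/index.php?{query}"


def resolve_image_url(image_name: str, host: str = "en.wikipedia.org") -> Optional[str]:
    """Issue a HEAD request and return the Location header, if any (network call)."""
    query = urlencode({"title": "Special:Filepath", "file": image_name})
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", f"/w/index.php?{query}")
        response = conn.getresponse()
        # http.client does not follow redirects, so the header is visible here.
        return response.getheader("Location")
    finally:
        conn.close()


print(filepath_url("Asperger kl2.jpg"))
```

Asking the en: host rather than Commons is the point of the reply: one endpoint covers both local and Commons images, at the cost of one request per image.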
On 08/02/2008, Colm McMullan colm@multimap.com wrote:
Hi,
I've been working on the Wikipedia data overlay on our site, multimap.com. One of the things we'd really love to do is add thumbnail images to complement the summary text for an article that we display in our infoboxes. However, we're finding it difficult to programmatically generate the URLs to specific images from the wiki tag.
I just asked this question on mediawiki-l. :) Better solution: use the API.
See Brion's post http://lists.wikimedia.org/pipermail/mediawiki-l/2008-February/025961.html.
cheers, Brianna
Brianna Laugher <brianna.laugher@...> writes:
I just asked this question on mediawiki-l. :) Better solution: use the API.
See Brion's post http://lists.wikimedia.org/pipermail/mediawiki-l/2008-February/025961.html.
cheers, Brianna
Hi Brianna, thanks for replying. However, the solution isn't really what I'm looking for. I'm parsing through the whole Wikipedia dump and I don't want to have to make an HTTP request for each image to find out where it actually is...
So I want to go directly from: Asperger kl2.jpg to http://upload.wikimedia.org/wikipedia/en/thumb/5/57/Asperger_kl2.jpg
or whatever URL that provides a thumbnail for that image without any intermediate lookup or step...
Colm McMullan wrote:
Hi Brianna, thanks for replying. However, the solution isn't really what I'm looking for. I'm parsing through the whole Wikipedia dump and I don't want to have to make an HTTP request for each image to find out where it actually is...
Hmm... if you're working from a dump, couldn't you just check the image table dump to see if the image is listed there? If it's not listed, it's either deleted or from Commons (I think).
On Feb 8, 2008 12:07 PM, Colm McMullan colm@multimap.com wrote:
Hi Brianna, thanks for replying. However, the solution isn't really what I'm looking for. I'm parsing through the whole Wikipedia dump and I don't want to have to make an HTTP request for each image to find out where it actually is...
If you're working from an enwiki dump, then you could check for the existence of a page in the Image: namespace with the same name as the image name you've extracted from the article text. If there is one, the file is on enwiki; otherwise it's on Commons. This doesn't guarantee anything, because images do get moved to Commons or deleted from time to time, but at least the check is a database query and not an HTTP request.
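The dump-based check described above might be sketched as follows. This assumes the names of locally hosted images have already been loaded from the dump into a set; the loading step is not shown, the helper names are made up, and the tiny set below is only a stand-in for illustration.

```python
from typing import Iterable, Set


def normalize(title: str) -> str:
    """Normalise an image name: underscores for spaces, upper-case first letter."""
    name = title.strip().replace(" ", "_")
    return name[:1].upper() + name[1:]


def load_local_images(titles: Iterable[str]) -> Set[str]:
    """Build a lookup set from image names found in the enwiki dump."""
    return {normalize(t) for t in titles}


def choose_repo(image_name: str, local_images: Set[str]) -> str:
    """Return 'en' if the image has a local page, otherwise assume 'commons'."""
    # Caveat from the thread: a missing entry may also mean the image was deleted.
    return "en" if normalize(image_name) in local_images else "commons"


# Stand-in for data read from the dump (hypothetical contents).
local_images = load_local_images(["Asperger kl2.jpg"])

print(choose_repo("Asperger kl2.jpg", local_images))
print(choose_repo("Albert Einstein Head.jpg", local_images))
```

A set lookup keeps the check cheap enough to run once per image reference while parsing the whole dump.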
Colm McMullan wrote:
Hi Brianna, thanks for replying. However, the solution isn't really what I'm looking for. I'm parsing through the whole Wikipedia dump and I don't want to have to make an HTTP request for each image to find out where it actually is...
You can also request this information for multiple images at once:
http://en.wikipedia.org/w/api.php?action=query&titles=Image:Foo.jpg%7CIm...
http://en.wikipedia.org/w/api.php?action=query&titles=Image:Albert%20Einstein%20Head.jpg&prop=imageinfo&iiprop=url&iiurlwidth=200
Roan Kattouw (Catrope)
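A hedged sketch of how the batched API request above could be assembled and its result read in Python. The format=json parameter is an addition not shown in the URLs above, the response is assumed to be keyed by page ID under query.pages with a thumburl field in each imageinfo entry, and the sample response uses placeholder values rather than real API output.

```python
import json
from typing import Dict, Iterable, Optional
from urllib.parse import urlencode

API = "http://en.wikipedia.org/w/api.php"


def imageinfo_url(names: Iterable[str], width: int = 200) -> str:
    """Build one imageinfo query covering several images (titles joined by '|')."""
    params = {
        "action": "query",
        "titles": "|".join(f"Image:{n}" for n in names),
        "prop": "imageinfo",
        "iiprop": "url",
        "iiurlwidth": width,
        "format": "json",  # assumption: JSON output requested for easy parsing
    }
    return f"{API}?{urlencode(params)}"


def thumb_urls(response_text: str) -> Dict[str, Optional[str]]:
    """Map each returned page title to its thumbnail URL, where one is given."""
    pages = json.loads(response_text).get("query", {}).get("pages", {})
    result = {}
    for page in pages.values():
        info = page.get("imageinfo")
        if info:  # pages without image information are skipped
            result[page["title"]] = info[0].get("thumburl")
    return result


# Placeholder response for illustration; not real API output.
sample = json.dumps({"query": {"pages": {"-1": {
    "title": "Image:Foo.jpg",
    "imageinfo": [{"url": "http://example.org/Foo.jpg",
                   "thumburl": "http://example.org/200px-Foo.jpg"}],
}}}})

print(imageinfo_url(["Foo.jpg", "Bar.jpg"]))
print(thumb_urls(sample))
```

Batching this way still needs HTTP requests, but far fewer than one per image.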
wikitech-l@lists.wikimedia.org