On Fri, Sep 10, 2010 at 2:44 PM, Roan Kattouw roan.kattouw@gmail.com wrote:
Both approaches seem complicated, so maybe a different dump would be helpful:
Page id --> List of [ Image id | real url | type (original | dim_xy | thumb) | license ]
http://commons.wikimedia.org/w/api.php?action=query&prop=imageinfo&i...
Returns image URL, width, height and thumbnail URL for a 200px thumbnail.
Thanks, this may be useful. Say I want to get all images for the Ant page; the steps would be:
1. Parse the Ant page wikitext and get all Image: links
2. For every image link, get its Commons page id. (Can I issue the above query using titles instead of numeric ids? If not, I would use the Commons repository to map each image title to its numeric id.)
3. Issue a query like the one you detail above (but the results don't show license info!).
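
To make the steps above concrete, here is a rough Python sketch of steps 1 and 3. It is only an illustration under some assumptions: the URL quoted above is truncated, so the parameters used here (`titles`, `iiprop=url|size`, `iiurlwidth`) are my guess at a query that returns the URL, size and a 200px thumbnail, and the helper names `image_titles` and `imageinfo_url` are made up for this example. The regex only catches plain `[[Image:...]]` / `[[File:...]]` links, not images pulled in through templates or galleries.

```python
import re
from urllib.parse import urlencode

# Endpoint taken from the URL quoted in this thread.
API = "http://commons.wikimedia.org/w/api.php"

# Matches the target of [[Image:Foo.jpg|...]] and [[File:Foo.jpg|...]] links.
IMAGE_LINK = re.compile(r"\[\[\s*(?:Image|File)\s*:\s*([^|\]]+)", re.IGNORECASE)


def image_titles(wikitext):
    """Step 1: collect image titles from wikitext, in order, without duplicates.

    Titles are normalized to the "File:" prefix (an assumption made here so
    that Image: and File: links to the same file are treated as one).
    """
    titles = []
    for match in IMAGE_LINK.finditer(wikitext):
        title = "File:" + match.group(1).strip()
        if title not in titles:
            titles.append(title)
    return titles


def imageinfo_url(titles, thumb_width=200):
    """Step 3: build one imageinfo query for several titles at once.

    Passing titles= (instead of numeric page ids) is assumed to make the
    title-to-id lookup in step 2 unnecessary.
    """
    params = {
        "action": "query",
        "prop": "imageinfo",
        "titles": "|".join(titles),
        "iiprop": "url|size",
        "iiurlwidth": thumb_width,
        "format": "json",
    }
    return API + "?" + urlencode(params)
```

For example, `imageinfo_url(image_titles(ant_wikitext))` would give a single URL covering every image linked from the page text, which at least cuts the number of requests down to one per batch of titles. It still does not address the missing license info.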
Still, I think having a small dump with metadata is better than sending a lot of API queries.
Thanks