> The first three we can get from pretty much either API, or extract directly from
> a dump file. The latter is eluding us though, for two reasons. One is that a
> file, like 30C3_Commons_Machinery_2.jpg, is actually in the /b/ba/ directory -
> but where this /b/ba/ comes from (a hash?) is unclear to us now, and it's not
> something we find in the dumps - though we can get it from one of the APIs.

Yes, /b/ba is derived from the MD5 hash of the title: the first directory is
the first hex digit of the hash, the second is the first two hex digits:

md5( "30C3_Commons_Machinery_2.jpg" ) -> ba253c78d894a80788940a3ca765debb
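
For illustration only, here is a minimal Python sketch of that derivation. The
function name hashed_dir is made up for this example, and it assumes the title
is already in its stored form (underscores instead of spaces, first letter
capitalized); it is not taken from MediaWiki itself.

```python
import hashlib


def hashed_dir(filename: str) -> str:
    """Return the two-level hash directory ("x/xy") for a file title.

    Assumption: the title is hashed as UTF-8 with spaces replaced by
    underscores. Other normalization steps are not handled here.
    """
    name = filename.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # First directory: first hex digit; second directory: first two hex digits.
    return f"{digest[0]}/{digest[:2]}"


if __name__ == "__main__":
    print(hashed_dir("30C3_Commons_Machinery_2.jpg"))
```

With the hash quoted above, this should give b/ba for the example file.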

But this is "arcane knowledge" which nobody should really rely on. The canonical
way would be to use
https://commons.wikimedia.org/wiki/Special:Redirect/file/30C3_Commons_Machinery_2.jpg

This generates a redirect to:
https://upload.wikimedia.org/wikipedia/commons/b/ba/30C3_Commons_Machinery_2.jpg

To get a thumbnail, you can directly manipulate that URL, by inserting "thumb/"
and the desired size in the correct location (maybe Special:Redirect can do that
for you, but I do not know how):

https://upload.wikimedia.org/wikipedia/commons/thumb/b/ba/30C3_Commons_Machinery_2.jpg/640px-30C3_Commons_Machinery_2.jpg
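
As a rough sketch, that URL manipulation could look like this in Python. The
helper name thumb_url is hypothetical, and the code assumes the original URL
contains the "/wikipedia/commons/" path segment as in the example above; it
only rewrites the string and does not check that the thumbnail exists.

```python
def thumb_url(original_url: str, width: int) -> str:
    """Build a thumbnail URL from an upload.wikimedia.org original URL.

    Assumption: the URL has the form
    https://upload.wikimedia.org/wikipedia/commons/<x>/<xy>/<filename>
    """
    marker = "/wikipedia/commons/"
    head, sep, tail = original_url.partition(marker)
    if not sep:
        raise ValueError("not a Commons upload URL: " + original_url)
    # tail is "<x>/<xy>/<filename>"; the file name is its last component.
    filename = tail.rsplit("/", 1)[-1]
    # Insert "thumb/" after the marker and append "<width>px-<filename>".
    return f"{head}{marker}thumb/{tail}/{width}px-{filename}"


if __name__ == "__main__":
    print(thumb_url(
        "https://upload.wikimedia.org/wikipedia/commons/b/ba/"
        "30C3_Commons_Machinery_2.jpg",
        640,
    ))
```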

If I am not mistaken, you can use thumb.php to get the needed thumbnail:
<https://commons.wikimedia.org/w/thumb.php?f=Example.jpg&width=100>
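
A small sketch of building such a thumb.php URL in Python, using only the two
parameters shown above (f and width). The function name thumb_php_url is made
up for this example, and no request is actually sent.

```python
from urllib.parse import urlencode

# Base URL as it appears in the example above.
THUMB_PHP = "https://commons.wikimedia.org/w/thumb.php"


def thumb_php_url(filename: str, width: int) -> str:
    """Return a thumb.php URL for the given file name and width in pixels."""
    # urlencode takes care of escaping characters that are not URL-safe.
    query = urlencode({"f": filename, "width": width})
    return f"{THUMB_PHP}?{query}"


if __name__ == "__main__":
    print(thumb_php_url("Example.jpg", 100))
```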

(That’s what I used in my CommonsDownloader [1])

[1] <https://github.com/Commonists/CommonsDownloader/blob/master/commonsdownloader/thumbnaildownload.py>

Hope that helps,
--
Jean-Frédéric