Idle thought...
It used to be that, when an image was clicked, it would generate a pageview of a local File: page (for Commons images, usually a "dummy" local page). On some projects, it would instead go to the corresponding Commons file page.
However, with MediaViewer, a significant fraction of clicks on images no longer lead to a new pageview; the user is satisfied with the lightbox, closes it in place, and stays on the page.
In the eswiki case, this seems to be a very convincing explanation for the Commons drop. eswiki has not allowed local uploads for a long time, so I believe its image links went straight to Commons rather than to a local File page, as they do on enwiki.
Which leads to an obvious question - on other projects, such as enwiki or frwiki, have we accounted for a drop in *file page* views? What do overall pageview numbers look like using, say, just mainspace/ns0?
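A rough sketch of the comparison I have in mind, assuming we can get pageview counts as (period, namespace id, views) rows. The row layout, the function names, and the sample numbers are all made up for illustration; only the namespace ids (0 for mainspace, 6 for File) are the standard MediaWiki ones.

```python
from collections import defaultdict

MAIN_NS = 0  # article namespace (ns0)
FILE_NS = 6  # File: namespace

def views_by_namespace(rows):
    """Sum views per (period, namespace) from (period, ns, views) rows."""
    totals = defaultdict(int)
    for period, ns, views in rows:
        totals[(period, ns)] += views
    return dict(totals)

def relative_change(totals, ns, before, after):
    """Fractional change in views for one namespace; None if no baseline."""
    old = totals.get((before, ns), 0)
    new = totals.get((after, ns), 0)
    if old == 0:
        return None
    return (new - old) / old

# Invented sample data, NOT real measurements.
rows = [
    ("before", MAIN_NS, 1000),
    ("before", FILE_NS, 200),
    ("after", MAIN_NS, 990),
    ("after", FILE_NS, 80),
]

totals = views_by_namespace(rows)
for ns in (MAIN_NS, FILE_NS):
    print(ns, relative_change(totals, ns, "before", "after"))
```

If the lightbox explanation holds, the File namespace should show a much larger relative drop than ns0 over the same two periods.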
I like the explanation. More work would probably have to be done to prove it, but it sounds good. And it illustrates the point that fewer pageviews is not always a bad thing!