I've written a simple MediaWiki extension that uses an instance of the W3C Validator service (via the Services_W3C_HTMLValidator PEAR package, http://pear.php.net/package/Services_W3C_HTMLValidator) to validate SVG images hosted on a wiki. It is meant to replace the current system on Commons, which relies on individual contributors adding templates (e.g. InvalidSVG, https://commons.wikimedia.org/wiki/Template:InvalidSVG) by hand to file description pages. It exposes a simple API (and a Scribunto module as well) for getting the validation status of existing SVG files, can emit warnings when someone tries to upload an invalid one, and is well integrated with MediaWiki's native ObjectCache mechanism. I'm in the process of publishing the code, but I have some questions that I think the community could help me answer.
* Given that the W3C Validator can also parse HTML files, would it be useful to validate wiki pages as well? Although some validation errors appear to be caused by MediaWiki itself, others can stem from malformed templates.
* Does storing the validation status of old revisions of images (and/or articles) make sense?
* Do you think the extension should use the extmetadata property of ApiQueryImageInfo instead of its own module?
* Is it advisable to store validation data permanently in the database?