On 06/02/2008, Brion Vibber <brion@wikimedia.org> wrote:
While reviewing some other code, I went in and started ripping up some of the file type & validity checks in MediaWiki's upload system, as they've been driving me nuts for some time. One quick subproject was tossing in an XML well-formedness check for SVG files. For the curious, here's a report on the invalid files I encountered while testing this with files from Commons: http://meta.wikimedia.org/wiki/SVG_validity_checks
This is worth noting on mediawiki.org, really.
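For anyone who wants to try the same sort of test on their own files, here is a rough sketch of an XML well-formedness check for SVG. It is Python using only the standard library, not the PHP that MediaWiki actually runs, and the function name check_svg_well_formed is made up for illustration. It checks well-formedness and the root element only; it does not validate against the SVG DTD, so a file can pass here and still upset the W3C validator.

```python
import xml.etree.ElementTree as ET

# Namespace that a conforming SVG root element is expected to use.
SVG_NS = "http://www.w3.org/2000/svg"


def check_svg_well_formed(data: bytes):
    """Return (ok, message) for a candidate SVG file.

    Hypothetical helper: checks only that the bytes parse as
    well-formed XML and that the root element is <svg>.
    Untrusted uploads would need a hardened parser, since the
    standard-library parser does not guard against entity
    expansion attacks.
    """
    try:
        root = ET.fromstring(data)
    except ET.ParseError as err:
        # Covers mismatched tags, unbound prefixes, bad entities, etc.
        return False, f"not well-formed XML: {err}"

    # Accept both a namespaced root and a bare <svg> root.
    if root.tag not in (f"{{{SVG_NS}}}svg", "svg"):
        return False, f"root element is {root.tag!r}, not <svg>"

    return True, "well-formed"
```

Running it over a directory of downloaded files and collecting the failure messages would give a crude version of the report linked above.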
Of particular interest are invalid SVGs created by editing tools. I have a Bastard SVG From Hell, created by OmniGraffle, that I like to throw at things (I hope to have a copy I can release soon ;-) ). The W3C validator hates it. Inkscape, rsvg, Safari, WebKit, Opera, Firefox and Minefield all misrender it to a greater or lesser degree. (I've yet to throw it at Batik.) But it's an SVG created by an editing program in current use ...
I was surprised to see a bad SVG from Inkscape - does opening and saving it in the current stable Inkscape sanitise it?
How sanitisable are the bad SVGs you found? How automatable would a sanitisation process be, e.g. from a command-line invocation of Inkscape?
- d.