On 3/6/07, David Gerard dgerard@gmail.com wrote:
In a related question, how free is PDF? The format may not be altered and still called "PDF", but everyone including the FSF can live with that. There's plenty of non-Adobe PDF creators and readers. What there is a lack of is editors, but it's essentially a write-once format in any case.
PDF is a mystery meat format. You never know exactly what's going to be inside, so you can't speak coherently about PDF as a whole.
There is a subset of PDF which is free by almost any reasonable standard (as you note, most people don't regard the name control issue to be material).
Subset PDF is effectively equal to gzipped PostScript. It's been around forever, and shouldn't have any serious problems. With subset PDF I think the biggest freedom-related risk is non-free authoring tools smuggling in non-free content without the author's knowledge. (We see this with SVG too: Illustrator loves to embed non-free, fully hinted TTF fonts in the SVGs... At least they are easy to remove, since our rasterizer ignores them anyway.)
However, PDF can be a *very* non-free format, and I would be surprised if some of the PDFs we host were not pretty much maximally non-free.
First there is the patents issue. Modern commercial PDF plugins contain JBIG2 and JPEG 2000 codecs. Some of the open-source PDF readers can read these, but not all. JBIG2 clearly requires a patent license; JPEG 2000 is just an ambiguous, ugly patent mess. There may be other ways that non-subset PDF is a patent minefield; this is just the first that comes to mind.
Non-subset PDF also includes encryption and digital restrictions management. The free software PDF tools include support for the basic (old) PDF encryption and can easily be modified to ignore the restrictions. However, using such modified tools in the US is illegal under the DMCA, as is distributing them.
There are also a TON of features in non-subset PDF that free tools don't support (Acrobat forms, for example), or which present other accessibility and security-related concerns.
Like I pointed out with Flash, to really talk about these things we need to know the application.
For imaged documents, especially high-resolution bitonal (black/white) documents, DjVu is a compelling alternative which is gaining adoption even outside the world of free software. We have great support for DjVu in MediaWiki, including on-the-fly transcoding, which should dramatically reduce concerns related to client compatibility.
http://commons.wikimedia.org/w/index.php?title=Image%3AMozart_Sonate_%28manu...
Play with the page controls on the right side. :)
For vector illustrations, SVG is a much better choice.
But PDF, as a compatible version of gzipped PS, is still a reasonable tool. We need a better ability to check files for evilness. Ideally we could have a client-side upload tool that knew how to transcode things into acceptable formats (using codecs built into your OS) and that could test for problems like the ones I discussed for PDF.
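
To make the "check files for evilness" idea a bit more concrete, here is a very rough sketch in Python of a first-pass check for the PDF problems mentioned above (JBIG2 and JPEG 2000 streams, encryption, Acrobat forms). It just looks for the corresponding PDF name tokens in the raw bytes. The function name and the table of markers are made up for illustration and are not part of any existing upload tool. It is only a heuristic: names hidden inside compressed object streams, or written with #xx escapes, would slip past it, so a real checker would need an actual PDF parser.

```python
import re

# PDF name tokens for the features discussed above. This table and the
# function below are illustrative assumptions, not an existing tool.
SUSPECT_NAMES = {
    b"JBIG2Decode": "JBIG2-compressed image data",
    b"JPXDecode": "JPEG 2000 image data",
    b"Encrypt": "encryption / usage restrictions",
    b"AcroForm": "interactive (Acrobat) form",
}


def scan_pdf_bytes(data):
    """Return descriptions of suspect features whose names appear in data.

    This is a plain byte scan, not a PDF parser, so it can miss names
    stored in compressed object streams or written with #xx escapes.
    """
    findings = []
    for name, description in SUSPECT_NAMES.items():
        # Match "/Name" only when it is not followed by another letter or
        # digit, so "/EncryptMetadata" is not reported as "/Encrypt".
        pattern = re.compile(rb"/" + re.escape(name) + rb"(?![A-Za-z0-9])")
        if pattern.search(data):
            findings.append(description)
    return findings
```

For a file on disk you would read it in binary mode and pass the bytes in, e.g. scan_pdf_bytes(open(path, "rb").read()), and flag the upload for a closer look if the returned list is not empty.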