I have identified an organization which is willing to spend up to
about EUR 10,000 on adding support for exporting MediaWiki pages as
PDF files, and improving document management for documents consisting
of multiple pages.
My current thinking is that the functionality implemented, as a
minimum, would be as follows:
a) Using an extension, integrate a "PDF link" on every wiki page,
which would call an external library like HTMLDOC to render that
single page
b) Support filters on the rendered HTML (replacing image thumbnails
with high-resolution images, filtering content by regular expression,
etc.), and revision filters (export the last revision edited by a
user on whitelist Y, or the revision closest to currentdate-Z)
c) Create a "PDF basket" UI which makes it possible to compile a PDF
from multiple pages easily (and rearrange the pages in a hierarchy).
The resulting structures could potentially also be stored as wikitext,
using a new <structure> extension tag, so that they can be used both
by individuals compiling PDFs for personal use, and by groups
collaborating on complex documents.
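To make (a) and (b) a bit more concrete, here is a rough Python sketch
of the two filters and the HTMLDOC call. It is an illustration only,
not a proposed implementation: the Revision record, the function
names, and the thumbnail URL layout (.../thumb/<h>/<hh>/<name>/<width>px-<name>)
are my assumptions, and the HTMLDOC flags should be checked against
its manual.

```python
import re
from datetime import datetime
from typing import List, NamedTuple, Optional, Sequence, Set


class Revision(NamedTuple):
    """Hypothetical revision record; not MediaWiki's actual schema."""
    rev_id: int
    user: str
    timestamp: datetime


def select_revision(revisions: Sequence[Revision],
                    whitelist: Set[str],
                    not_after: Optional[datetime] = None) -> Optional[Revision]:
    """Return the newest revision by a whitelisted user.

    If not_after is given, revisions newer than that cutoff are ignored.
    Returns None when no revision qualifies.
    """
    candidates = [
        r for r in revisions
        if r.user in whitelist
        and (not_after is None or r.timestamp <= not_after)
    ]
    return max(candidates, key=lambda r: r.timestamp, default=None)


# Assumed thumbnail path: /thumb/<h>/<hh>/<name>/<width>px-<name>
THUMB_RE = re.compile(r'/thumb/([0-9a-f]/[0-9a-f]{2}/[^/"]+)/\d+px-[^/"]+')


def use_full_resolution(html: str) -> str:
    """Rewrite thumbnail URLs so they point at the original image."""
    return THUMB_RE.sub(r'/\1', html)


def htmldoc_command(html_path: str, pdf_path: str) -> List[str]:
    """Build (but do not run) an HTMLDOC command line for one page."""
    return ["htmldoc", "--webpage", "-f", pdf_path, html_path]
```

The command list could be handed to subprocess.run() once the filtered
HTML has been written to a temporary file. A real extension would
presumably do the equivalent in PHP inside MediaWiki.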
Possibly some budget could also be allocated for improving the
external PDF library used, especially if we can allocate additional
funds for this project.
I'd like to request comments on this approach, specifically:
- Besides HTMLDOC, do you know of a good (X)HTML-to-PDF library that
could be used for this purpose?
- Within this budget, do you believe an alternative approach which
utilizes an intermediate format is viable (e.g.
wiki-to-Docbook-to-PDF), given the complexity of the MediaWiki syntax,
its various extensions, and the need to keep up with parser changes?
- If you are a developer, would you be interested in working on this
project, and available to do so? (If so, please contact me privately.)
Any other comments would also be appreciated.
--
Peace & Love,
Erik
DISCLAIMER: This message does not represent an official position of
the Wikimedia Foundation or its Board of Trustees.