On 11/13/2013 08:10 AM, Tyler Romeo wrote:
> On Wed, Nov 13, 2013 at 12:45 AM, Erik Moeller <erik@wikimedia.org> wrote:
>> Most likely, we'll end up using Parsoid's HTML5 output, transform it to add required bits like licensing info and prettify it, and then render it to PDF via phantomjs, but we're still looking at various rendering options.
> I don't have anything against this, but what's the reasoning? You now have to parse the wikitext into HTML5 and then parse the HTML5 into PDF.
We are already parsing all edited pages to HTML5 and will also start storing (rather than just caching) this HTML very soon, so there will not be any extra parsing involved in the longer term. Getting the HTML will basically be a request for a static HTML page.
Gabriel
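
For illustration only, here is a rough sketch of what the "transform it to add required bits like licensing info" step could look like once the HTML is available as a static page. It is not the actual pipeline: the function name `add_license_footer`, the `footer` markup, and the license string are all hypothetical, and the sample HTML merely stands in for stored Parsoid output. The PDF rendering step (phantomjs or another renderer) is not shown.

```python
import re
from html import escape

# Matches a closing body tag, case-insensitively.
_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def add_license_footer(page_html: str, license_text: str) -> str:
    """Insert a licensing footer just before the last </body> tag.

    Hypothetical helper: the footer markup is an assumption, not
    what the real transform emits.
    """
    footer = '<footer class="license">' + escape(license_text) + "</footer>"
    matches = list(_BODY_CLOSE.finditer(page_html))
    if not matches:
        # No closing body tag found: append the footer at the end.
        return page_html + footer
    idx = matches[-1].start()
    return page_html[:idx] + footer + page_html[idx:]


if __name__ == "__main__":
    # Stand-in for a stored, already-parsed HTML page.
    stored_html = "<html><body><p>Article text</p></body></html>"
    # Placeholder license string, chosen only for the example.
    print(add_license_footer(stored_html, "Text available under CC BY-SA 3.0"))
```

Since the input is already HTML, a step like this involves no wikitext parsing at all; it only post-processes the stored page before it is handed to the renderer.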