Hi Tisza.
Thanks a lot for your answer. A Chromium-based solution is certainly one of the best you can get: it's cheap in computational resources, and updates should be available for a long time. Sorry for creating unnecessary work for you. I had gathered from the following link that the new renderer was based on mwlib and ReportLab. But that page dates back to April 2018 and was last updated in August 2018, so this information is obviously outdated now.
https://www.mediawiki.org/wiki/Reading/Web/PDF_Functionality
These two pages also seem to contain the same outdated information:
https://www.reportlab.com/opensource/
https://www.reportlab.com/casestudies/wikipedia/
Yours Dirk
On 3/17/19 8:14 PM, Tisza Gergő wrote:
There are two different PDF renderer tools: the single page PDF renderer ("Download as PDF" link in the sidebar, via the ElectronPdfService extension [1]) and the article collection renderer ("Create a book" link, via the Collection extension [2]).
The single page renderer is currently served by a tool called Electron [3]; it is in the process of being replaced by a new tool called Proton [4]. Both are Node.js services that manage headless Chromium instances, which means the actual rendering engine will stay the same, so no user-facing changes are expected. The switch is for operational reasons: Electron crashes periodically, and it was written before the Chromium project provided an official library for remote-controlling headless browsers, so it does not take advantage of that. Proton is currently getting mirrored traffic (i.e. it is deployed in production for testing purposes, and both it and Electron render the PDF files requested by users, but only the one from Electron is returned).
The collection renderer used to be served by a tool called OCG [5], which was decommissioned about a year ago. The Collection extension also functions as a frontend to PediaPress [6], who create print-on-demand books of Wikipedia content. They use mwlib internally (and are its main developers). I believe they plan to provide PDF download functionality eventually.
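To illustrate the mirrored-traffic setup described above for Proton, here is a minimal Python sketch. It is not how the WMF services are actually wired up; the function render_with_mirroring and the stand-in renderers fake_electron and fake_proton are hypothetical names made up for this example. For simplicity the sketch waits for both renderers before answering, whereas a real deployment would more likely mirror requests without delaying the response.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

# A renderer takes a page title or URL and returns PDF bytes (hypothetical interface).
Renderer = Callable[[str], bytes]


def render_with_mirroring(
    page: str,
    primary: Renderer,
    shadow: Renderer,
    log: List[Tuple[str, str, str]],
) -> bytes:
    """Send the request to both renderers, but return only the primary's output."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        primary_future = pool.submit(primary, page)
        shadow_future = pool.submit(shadow, page)

        # Errors from the primary renderer propagate to the caller as usual.
        result = primary_future.result()

        # The shadow renderer's outcome is only recorded, never returned.
        try:
            shadow_output = shadow_future.result()
            log.append(("shadow ok", page, str(len(shadow_output))))
        except Exception as exc:
            log.append(("shadow failed", page, repr(exc)))

    return result


def fake_electron(page: str) -> bytes:
    # Stand-in for the renderer currently serving users.
    return b"%PDF-electron " + page.encode()


def fake_proton(page: str) -> bytes:
    # Stand-in for the renderer under test; it fails here to show that
    # a shadow failure does not affect what the user receives.
    raise RuntimeError("simulated crash")


if __name__ == "__main__":
    events: List[Tuple[str, str, str]] = []
    pdf = render_with_mirroring("Example", fake_electron, fake_proton, events)
    print(pdf)
    print(events)
```

The point of the pattern is that the new service sees realistic production load and its failures can be studied in the log, while users only ever receive the output of the established service.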
So in short, the WMF is not involved with mwlib development; you should probably contact PediaPress (see [7]) if you have questions about that. The PDF renderer project at the WMF is not related to mwlib and is not affected by the Python 2 life cycle.
[1] https://www.mediawiki.org/wiki/Extension:ElectronPdfService
[2] https://www.mediawiki.org/wiki/Extension:Collection
[3] https://www.mediawiki.org/wiki/Electron
[4] https://www.mediawiki.org/wiki/Proton
[5] https://www.mediawiki.org/wiki/Offline_content_generator
[6] https://meta.wikimedia.org/wiki/Book_tool/Help/Books/Frequently_Asked_Questi...
[7] https://pediapress.com/code/