On Feb 8, 2020, at 1:07 AM, Ayaskant Swain ayaskant.swain@gmail.com wrote:
Thanks, Arlo, for replying.
Can you please give me a reference link to the native parser of MediaWiki that you suggested? A native parser would be the easiest way to meet our need. We want to convert the pages of our MediaWiki (1.17.5) to either PDF or HTML. All attachments (images) and comments should also be included in the output file.
When you visit https://<host>/wiki/TestPage, MediaWiki has already parsed the content to HTML for you.
I was suggesting you scrape those pages using wget, Scrapy, HTTrack, or some other tool.
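For example, a mirror of the already-rendered pages could look something like this (a minimal sketch using wget; `<host>` is a placeholder for your wiki's hostname, and the `/wiki/` path may differ on your installation):

```shell
# Mirror the rendered wiki pages, including the files needed to view them offline.
# --recursive        follow links within the wiki
# --page-requisites  also fetch images, CSS, and JS each page needs
# --convert-links    rewrite links so the local copy browses offline
# --no-parent        stay under the /wiki/ path
# --wait=1           pause between requests to be polite to the server
wget --recursive --page-requisites --convert-links --no-parent --wait=1 \
     "https://<host>/wiki/"
```

The mirrored HTML (with its images) can then be converted to PDF with a separate tool such as wkhtmltopdf, if PDF output is required.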
It's also possible that this extension will work for you: https://www.mediawiki.org/wiki/Extension:DumpHTML