Also, for the curious, the request for dedicated HTML dumps is tracked in this Phabricator task: https://phabricator.wikimedia.org/T182351
On Thu, 3 May 2018 at 15:19, Bartosz Dziewoński <matma.rex@gmail.com> wrote:
On 2018-05-03 20:54, Aidan Hogan wrote:
I am wondering what the fastest/best way is to get a local dump of English Wikipedia in HTML. We are looking only for the current versions of articles (no edit history), for the purposes of a research project.
The Kiwix project provides HTML dumps of Wikipedia for offline reading: http://www.kiwix.org/downloads/
Their downloads use the ZIM file format, and it looks like there are libraries available for reading it in many programming languages: http://www.openzim.org/wiki/Readers
-- Bartosz Dziewoński
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l