Hi
We have published a new version of MWoffliner: the MediaWiki scraper.
As always, version 1.8.0 is available here:
https://www.npmjs.com/package/mwoffliner
This new release brings big improvements in terms of performance.
MWoffliner 1.8 no longer requires the zimwriterfs binary and can write
offline ZIM files directly, on the fly. This means far less mass storage
usage and roughly one fifth as many I/O accesses.
Here is the detailed changelog:
1.8.0:
* UPDATE: Write ZIM files directly (Using Libzim) #184
* UPDATE: Removed 'tmp' files and directory #448 #575
* UPDATE: Removed --deflateTmpHTML and --tmpDirectory arguments #575 #576
* UPDATE: Implemented better request backoff #496
* UPDATE: Changed file names/paths #278
* UPDATE: Removed --writeHtmlRedirects argument #506
* UPDATE: Removed --localMCS option (now detected automatically) #490
* UPDATE: Updated documentation #423
* FIX: Other stability, logging and error handling fixes
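For readers unfamiliar with the term, the "request backoff" item above refers to waiting longer between retries of a failed request. The following is only a minimal sketch of a capped exponential backoff in TypeScript. It is not MWoffliner's actual implementation: the names `backoffDelayMs` and `withBackoff` and all default values are hypothetical and chosen for illustration.

```typescript
// Hypothetical helper, NOT taken from the MWoffliner code base.
// Returns the wait time before retrying after a failed attempt (0-based):
// the delay doubles with each attempt and is capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Hypothetical wrapper: runs `task`, retrying on failure with the delays
// computed above, and rethrows the last error once maxAttempts is reached.
async function withBackoff<T>(
  task: () => Promise<T>,
  maxAttempts = 5,
  baseMs = 500,
  maxMs = 30000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // Only sleep if another attempt will follow.
      if (attempt < maxAttempts - 1) {
        const delay = backoffDelayMs(attempt, baseMs, maxMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

A caller would wrap any function returning a promise, for example `withBackoff(() => downloadPage(url))`, where `downloadPage` is again a placeholder name. Capping the delay keeps a long scrape from stalling indefinitely on a single failing request.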
All of this has been made possible by a new piece of software in the
openZIM portfolio: node-libzim. node-libzim is a JavaScript/Node.js
binding of our ZIM format reference library, libzim. It lets you quickly
read and write ZIM files directly in JavaScript. The code is available in
our code forge at
https://github.com/openzim/node-libzim
and, of course, on npmjs.org:
https://www.npmjs.com/package/@openzim/libzim
This is the third of several milestones we have planned with the support
of the WMF. The next one on the list is 1.9, planned for the end of April.
With 1.9 we want to implement full support for MediaWiki categories.
As always, PRs and bug reports are welcome at:
https://github.com/openzim/mwoffliner
Regards
Emmanuel
--
Kiwix - Wikipedia Offline & more
* Web:
http://www.kiwix.org
* Twitter:
https://twitter.com/KiwixOffline
* more:
http://www.kiwix.org/wiki/Communication