Hello,
I was working as a teacher, and my friend and I wanted to give offline Wikipedia content to children using tablets. We stumbled upon the OpenZIM format.
After some work over the past two weeks, the tool is finally ready and live at http://www.srik.me/zimbalaka. You can read about it here: http://www.arunmozhi.in/blog/zimbalaka-an-openzim-creator/
The source code is at https://github.com/tecoholic/Zimbalaka.
Kindly take a look. I would be happy if people could host it in multiple places and use it, since it is currently hosted on a friend's server ;)
Regards, Arunmozhi
Hi Arunmozhi,
that's interesting!
There's a JavaScript tool called mwoffliner (https://sourceforge.net/p/kiwix/other/ci/master/tree/mwoffliner/) that can take sets of Wikipedia/MediaWiki pages offline. It's used for producing the current MediaWiki dumps, and may have some advantages, e.g. that entries are linked. So it might be useful for your project.
I also wonder whether something like this could be done within the OCG toolchain, i.e. where a number of articles are selected with the collection/book interface and then exported as ZIM from within MediaWiki?
Thanks for sharing! Bjoern
Offline-l mailing list Offline-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/offline-l
Hi,
This tool looks interesting. It would be very useful if it could create a ZIM from a category, using a depth limit within the category tree.
Hi,
---- On Sat, 16 May 2015 01:29:44 +0530 Wilfredor wilfredor@gmail.com wrote ----
Hi,
This tool looks interesting. It would be very useful if it could create a ZIM from a category, using a depth limit within the category tree.
It is perfectly doable. I held back from doing it because it is pretty resource intensive and we don't have the resources to run it, as one never knows what a user might request. If someone could host it on a better server, the functionality could be added in a day.
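To make the category-depth idea concrete, here is a minimal sketch of a depth-limited walk over a category tree. It is not Zimbalaka's code: the function name `collect_pages`, the `fetch_members` callback, and the `max_pages` cap are all assumptions made for illustration. In a real tool the callback would probably wrap the MediaWiki API's `list=categorymembers` query; here it reads from a small in-memory tree so the example runs without network access. The page cap is one possible way to bound the resource cost mentioned above.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical callback type: given a category title, return
# (page titles in that category, subcategory titles of that category).
FetchMembers = Callable[[str], Tuple[Iterable[str], Iterable[str]]]


def collect_pages(root: str, max_depth: int,
                  fetch_members: FetchMembers,
                  max_pages: int = 500) -> List[str]:
    """Breadth-first walk of a category tree, bounded by depth and page count."""
    pages: List[str] = []
    seen_pages: Set[str] = set()
    seen_cats: Set[str] = {root}   # guards against cycles between categories
    queue = deque([(root, 0)])

    while queue:
        category, depth = queue.popleft()
        members, subcats = fetch_members(category)

        for title in members:
            if title in seen_pages:
                continue
            if len(pages) >= max_pages:
                return pages       # stop early: the request is getting too large
            seen_pages.add(title)
            pages.append(title)

        # Only descend while we are above the requested depth.
        if depth < max_depth:
            for sub in subcats:
                if sub not in seen_cats:
                    seen_cats.add(sub)
                    queue.append((sub, depth + 1))

    return pages


# Made-up category tree standing in for a live wiki (assumption for the demo).
FAKE_TREE: Dict[str, Tuple[List[str], List[str]]] = {
    "Category:Physics": (["Force", "Energy"], ["Category:Optics"]),
    "Category:Optics": (["Lens", "Energy"], ["Category:Lasers"]),
    "Category:Lasers": (["Laser"], ["Category:Physics"]),  # cycle back to root
}


def fake_fetch(category: str) -> Tuple[List[str], List[str]]:
    return FAKE_TREE.get(category, ([], []))


if __name__ == "__main__":
    for depth in range(3):
        print(depth, collect_pages("Category:Physics", depth, fake_fetch))
```

The resulting list of titles could then be handed to whatever step downloads the articles and builds the ZIM file.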
Hi,
It would be nice if the WMF supported this kind of project. Maybe installing it on a WMF Labs server would work better.
Hi,
---- On Fri, 15 May 2015 23:14:45 +0530 Bjoern Hassler bjohas+mw@gmail.com wrote ----
Hi Arunmozhi,
that's interesting!
There's a JavaScript tool called mwoffliner (https://sourceforge.net/p/kiwix/other/ci/master/tree/mwoffliner/) that can take sets of Wikipedia/MediaWiki pages offline. It's used for producing the current MediaWiki dumps, and may have some advantages, e.g. that entries are linked. So it might be useful for your project.
The thing is, as teachers, we weren't really looking to get the complete Wikipedia content, and we cannot afford such a large file size either. This was built for situations where we need a subset of pages on a single topic, which I think applies to a lot of use cases.
Thank you for pointing out the tool. I am sure it will be useful sometime.
Regards, Arunmozhi
Dear Arunmozhi
On 15.05.2015 15:27, Arun mozhi wrote:
I was working as a teacher, and my friend and I wanted to give offline Wikipedia content to children using tablets. We stumbled upon the OpenZIM format.
After some work over the past two weeks, the tool is finally ready and live at http://www.srik.me/zimbalaka. You can read about it here: http://www.arunmozhi.in/blog/zimbalaka-an-openzim-creator/
The source code is at https://github.com/tecoholic/Zimbalaka.
Kindly take a look. I would be happy if people could host it in multiple places and use it, since it is currently hosted on a friend's server ;)
That's a really nice project:
* There is a real need for it in the community
* It's open source
* It uses a good format
Thank you very much for sharing your work here.
We always wanted to offer such a tool but never invested the necessary resources, so it's really nice to see a first version released by you. We of course volunteer to host it, help improve it, etc.
The only point which is a little bit "problematic" is the technology used to retrieve and manipulate the HTML from MediaWiki. We have been working for almost two years on mwoffliner, a solution to do exactly that (it can perfectly deal with a list of articles), and AFAIK it is the most advanced solution for retrieving MediaWiki content. What do you think about re-using it within Zimbalaka?
Kind regards Emmanuel
Dear Emmanuel,
Thank you so much for writing to me about hosting. I am really happy to hear this.
---- On Sat, 16 May 2015 23:30:17 +0530 Emmanuel Engelhart wrote ----
Dear Arunmozhi
The only point which is a little bit "problematic" is the technology used to retrieve and manipulate the HTML from MediaWiki. We have been working for almost two years on mwoffliner, a solution to do exactly that (it can perfectly deal with a list of articles), and AFAIK it is the most advanced solution for retrieving MediaWiki content. What do you think about re-using it within Zimbalaka?
I am presently learning Node.js, after a few hiccups I had putting together the various parts of this project. I would be more than happy to integrate mwoffliner and rewrite Zimbalaka. It will take me a few weeks to get the hang of Node, and then I can contribute.
Regards, Arunmozhi
Very good work, Arun-ji! Indeed, using mwoffliner internally would be best.
It is certainly possible to host this on Wikimedia servers. Emmanuel, would you lead the process to get it hosted?
And thanks also to the two veteran Tamil Wikipedians who helped with this, Bala and Srikanth. :)
A.
On 22.05.2015 03:00, Asaf Bartov wrote:
Very good work, Arun-ji! Indeed, using mwoffliner internally would be best.
It is certainly possible to host this on Wikimedia servers. Emmanuel, would you lead the process to get it hosted?
Yes, I will. It should be online in the next few days.
Emmanuel
On 17.05.2015 07:39, Arun mozhi wrote:
Dear Emmanuel,
Thank you so much for writing to me about hosting. I am really happy to hear this.
It's now online: http://zimbalaka.openzim.org/
We really hope to soon see a patch that allows using mwoffliner, to benefit from better HTML/ZIM quality. But again, congrats on this good starting point; this is exactly what we always wanted to implement.
Emmanuel
Wow!!
---- On Wed, 03 Jun 2015 21:02:28 +0530 Emmanuel Engelhart wrote ----
It's now online: http://zimbalaka.openzim.org/
We really hope to soon see a patch that allows using mwoffliner, to benefit from better HTML/ZIM quality. But again, congrats on this good starting point; this is exactly what we always wanted to implement.
Fantastic. Thank you so much. This feels so good :) Of course, using mwoffliner is in the pipeline.
P.S. Did you run into any problems with the setup documentation? Does it need any updates?
Regards, Arunmozhi.
On 03.06.2015 17:44, Arun mozhi wrote:
Fantastic. Thank you so much. This feels so good :) Of course, using mwoffliner is in the pipeline.
Nice
P.S. Did you run into any problems with the setup documentation? Does it need any updates?
It's pretty straightforward. I had some difficulties, but I guess mostly because I'm not familiar with the technologies involved (Python, supervisord, ...). So, IMO there is nothing to fix regarding the setup; the documentation is good.
Emmanuel
Hi,
---- On Wed, 03 Jun 2015 21:21:23 +0530 Emmanuel Engelhart kelson@kiwix.org wrote ----
P.S. Did you run into any problems with the setup documentation? Does it need any updates?
It's pretty straightforward. I had some difficulties, but I guess mostly because I'm not familiar with the technologies involved (Python, supervisord, ...). So, IMO there is nothing to fix regarding the setup; the documentation is good.
Thank you :)
Interesting tool. Will give it a try.
Best,
- Enock twitter: @Enock4seth enockseth.github.io | [[User:Enock4seth]]