Hi,
We have more and more ZIM files available for download and we need a way
to sort and differentiate them.
One concrete problem we face, for example, is that at Kiwix we provide a
full offline version "with pictures" and one "without pictures".
We have adopted a filename scheme that distinguishes them, but
this does not appear transparently in the content managers. As a
consequence, people are puzzled to see the same content twice (with
different file sizes, of course). Adding a "nopic" tag would allow
readers to react to it adequately.
That's why I'm currently considering the addition of a "Tags" metadata
entry in the ZIM format specification. To be more specific, here:
http://www.openzim.org/wiki/Metadata
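To make the idea concrete, here is a minimal sketch in Python of how a
content manager could use such an entry. It assumes the value is a
semicolon-separated list and that the metadata is available as a plain
dictionary with a "Tags" key; neither is part of the specification, and
parse_tags, has_pictures and the sample catalogue are hypothetical names
used for illustration only.

```python
def parse_tags(value: str) -> set:
    """Split a (hypothetical) ';'-separated Tags value into a set of tags."""
    return {tag.strip().lower() for tag in value.split(";") if tag.strip()}


def has_pictures(metadata: dict) -> bool:
    """Return False when the assumed 'nopic' tag is present."""
    return "nopic" not in parse_tags(metadata.get("Tags", ""))


# Two entries for the same content, as a content manager might list them.
# Titles and tag values are made up for this example.
catalogue = [
    {"Title": "Wikipedia (English)", "Tags": "wikipedia"},
    {"Title": "Wikipedia (English)", "Tags": "wikipedia; nopic"},
]

for entry in catalogue:
    label = "with pictures" if has_pictures(entry) else "without pictures"
    print(f"{entry['Title']} ({label})")
```

With something like this, the two otherwise identical entries could be
labelled differently instead of appearing as duplicates.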
Do you think that is the right approach?
Regards
Emmanuel
--
Kiwix - Wikipedia Offline & more
* Web: http://www.kiwix.org
* Twitter: https://twitter.com/KiwixOffline
* more: http://www.kiwix.org/wiki/Communication
Warm greetings,
After more than four years, the new version of the selection of articles
for Venezuela,
<https://es.wikipedia.org/wiki/Wikipedia:Wikipedia_en_CD/Selecci%C3%B3n_de_a…>
has been completed and is available for testing. The selection process
used Wikitrust <http://wikitrust.soe.ucsc.edu/>, with the help of
Professor Luca de Alfaro, together with manual validation. I received
great personal support from Emmanuel Engelhart, who took care of
producing the compressed file. I would like to thank Laura Fiorucci,
Fhaidel, William, and others for the help they gave me at the time with
the manual validation process.
For those who do not know the project, here is a brief explanation. It is
a selection of Wikipedia articles available for Kiwix
<http://www.kiwix.org/wiki/Main_Page>, the offline Wikipedia reader
officially supported by the WMF. The selection is made with the aim of
bringing Wikipedia content and articles related to Venezuela to places
where the internet connection is insufficient or nonexistent. This
selection contains about 9,200 articles (it fits comfortably on a CD) and
is roughly the size of 30 volumes of the Encyclopædia Britannica.
To install, please download:
Kiwix <http://www.kiwix.org/wiki/Main_Page>
Download link for the selection
<http://www.mirrorservice.org/sites/download.kiwix.org/zim/wikipedia/wikiped…>
I would be grateful if someone could pass this on to Wikimedia
Venezuela so that they can update the link here
<http://wikimedia.org.ve/wiki/Wikipedia_offline_%28Selecci%C3%B3n_de_Art%C3%…>,
and so that it is also available to anyone who wishes to use it in their
education programmes or any related initiative. I cannot do this myself
because I am based outside the country; however, I am available from
here for any help I can give in the process.
Thank you very much
Hello,
I was working as a teacher, and my friend and I wanted to give offline Wikipedia content to children using tablets. We stumbled upon the openZIM format.
After some work over the past two weeks, the tool is finally ready and live at http://www.srik.me/zimbalaka
You can read about it here http://www.arunmozhi.in/blog/zimbalaka-an-openzim-creator/
The source code is at https://github.com/tecoholic/Zimbalaka.
Kindly take a look. I would be happy if people could host it in multiple places and use it, since it is currently hosted on a friend's server ;)
Regards,
Arunmozhi
Might be of interest for this ML!
-------- Forwarded Message --------
Subject: Building our ZIM farm @wmflabs
Date: Thu, 05 Mar 2015 12:34:35 +0100
From: Emmanuel Engelhart <kelson(a)kiwix.org>
To: labs-l(a)lists.wikimedia.org
Hi
Following Yuvi's and Andrew's invitation, I am writing this email to
explain what I want to do with wmflabs and to share my first experiences.
== Context ==
Most people still don't have the free or cheap broadband access needed
to fully enjoy reading Wikimedia websites. With Kiwix and openZIM, a
WikimediaCH program, we have been working for almost ten years on
solutions to bring Wikimedia content "offline".
We have built a multi-platform reader (www.kiwix.org) and have created
ZIM, a file format to store web site snapshots (www.openzim.org). As a
result, Kiwix is currently the most successful solution to access
Wikipedia offline.
== Problem ==
However, one of the weak points of the project is that we still don't
manage to generate fresh snapshots (ZIM files) often enough.
Periodically generating ZIM snapshots of 800+ projects (we want to
provide a fresh version each month) requires a lot of hardware resources.
This might look like a detail but it's not. The lack of up-to-date
snapshots holds back many efforts within our movement to advertise our
offer more broadly. As a consequence, too few people are aware of it, as
reported in the last Wikimedia readership update. Another side effect is
that every few months, volunteer developers get the idea to build a new
offline reader based on the XML dumps (the only up-to-date snapshots we
provide for now), which is close to a dead-end approach.
== Goal ==
Our goal with wmflabs is to have a sustainable and efficient solution to
build, once a month, new ZIM files for all our projects (for each
project, one with thumbnails and one without). This is both a
requirement for and a part of a broader initiative whose purpose is to
increase awareness of our "offline offer". Other tasks are, for example,
storing all the ZIM files on Wikimedia servers (we currently only store
part of them on download.wikimedia.org) and improving their
accessibility by making them more visible (WPAR has, for example,
customised its sidebar to provide direct access).
== Needs ==
Building a ZIM file from a MediaWiki is done using a tool called
mwoffliner, which is a scraper based on both the Parsoid and MediaWiki
APIs. After scraping and rewriting the content, mwoffliner stores it in
a directory. At the end, the content is self-sufficient (without online
dependencies) and can be packed in one step into a ZIM file (using a
tool called zimwriterfs).
To run this software you need:
* A little bandwidth
* Low network latency (lots of HTTP requests)
* Fast storage
* A lot of storage (~100 GB per million articles)
* Many cores for compression (ZIM, ZIP and picture optimisation)
* Time (~400,000 articles can be dumped per day on one machine)
My guess is that we need a total of around a dozen VMs and 1.5 TB of
storage.
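As a rough illustration of how the figures above translate into a
per-project estimate, here is a small Python sketch. It assumes that
storage and scraping time scale linearly with the number of articles and
that work splits evenly across machines; both are simplifications, and
estimate is a hypothetical helper, not part of mwoffliner.

```python
import math

# Figures quoted in the list above.
GB_PER_MILLION_ARTICLES = 100   # ~100 GB per million articles
ARTICLES_PER_DAY = 400_000      # ~400,000 articles per day on one machine


def estimate(articles: int, machines: int = 1) -> tuple:
    """Return (storage in GB, scraping days) for a project of this size.

    Assumes linear scaling and an even split of work across machines.
    """
    storage_gb = articles / 1_000_000 * GB_PER_MILLION_ARTICLES
    days = math.ceil(articles / (ARTICLES_PER_DAY * machines))
    return storage_gb, days


# Example: a project of 1.5 million articles on a single machine.
print(estimate(1_500_000))  # (150.0, 4)
```

Under these assumptions, a project of 1.5 million articles would need
about 150 GB and four days on a single machine.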
== Current achievements ==
We currently have 3 x-large VMs in our "MWoffliner" project:
https://wikitech.wikimedia.org/wiki/Nova_Resource:Mwoffliner
With them we are able to provide, once a month, ZIM files for all
instances of Wikivoyage, Wikinews, Wikiquote, Wikiversity, Wikibooks,
Wikispecies, Wikisource, Wiktionary and a few minor Wikipedias.
Here is some feedback on our first months with wmflabs:
* WMFlabs is a great tool, it's fully in the Wikimedia spirit and it works.
* Support on IRC is efficient and friendly
* We faced a bit of instability in December but instances seem to be
stable now
* The documentation on the wikitech wiki seems to be pretty complete, but
the overall presentation is in my opinion too chaotic, and getting
started might be easier with a more user-friendly presentation.
* Semantic MediaWiki & OpenStackManager sync/cache/cookie problems are a
little bit annoying
* Overall VM performance looks good, although it suffers from sporadic
instabilities (bandwidth not available, all the processes stuck in
"kernel time", slow storage).
In general, wmflabs does the job; we are satisfied and think it is a
suitable solution for our project.
== Next steps ==
We want to complete our effort and mirror the biggest Wikipedia
projects. Unfortunately, we have reached the limits of a traditional
usage of wmflabs. We need more quota and to experiment with the NFS
storage because an x-large instance is not able to mirror more than 1.5
million articles at a time. How might that be made possible?
Thank you for your help.
Regards
Emmanuel
FYI
-------- Forwarded Message --------
Subject: [Xmldatadumps-l] Your comments needed (long term dumps rewrite?)
Date: Thu, 19 Feb 2015 12:30:01 +0200
From: Ariel Glenn WMF <ariel(a)wikimedia.org>
To: Xmldatadumps-l(a)lists.wikimedia.org
The MediaWiki Core team has opened a discussion about getting more
involved in and maybe redoing the dumps infrastructure. A good starting
point is to understand how folks use the dumps already or want to use
them but can't, and some questions about that are listed here:
https://www.mediawiki.org/wiki/Wikimedia_MediaWiki_Core_Team/Backlog/Improv…
I've added some notes but please go weigh in. Don't be shy about what
you do/what you need, this is the time to get it all on the table.
Ariel
Hi,
We have just released Kiwix for Android v1.93. This version mainly fixes
an ugly bug that left articles and images partly broken on Android 5.X
"Lollipop".
We are really happy to have finally fixed that bug considering the
rising number of devices running this version of Android: ~10%.
This new version is of course available on the Google Play store, but the
APK can also be downloaded directly via HTTP:
http://download.kiwix.org/bin/kiwix.apk
Regards
Emmanuel
Hi,
A few people are currently working hard to release a first version of
Kiwix for iOS. Most Kiwix users don't have this kind of device, which
is aimed more at wealthy people, but it is important to support iOS in
order to have a "complete" portfolio.
Unfortunately, we currently suffer from a lack of Apple devices:
* devices (iPhone or iPad) with iOS7 or greater
* OSX computers (for development purposes) not more than 3 years old.
If you want to make a hardware donation or have second-hand devices
you might sell for a cheap price, please let us know - this would
really help us a lot!
Thank you in advance for your help.
Kind regards
Emmanuel