http://bugs.openzim.org/show_bug.cgi?id=11
Summary: Dynamic mime-types
Product: openZIM
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P3
Component: zimlib
AssignedTo: tommi(a)tntnet.org
ReportedBy: emmanuel(a)engelhart.org
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
Currently, the ZIM format can only support a limited and predefined set of
document mime-types.
You can get the list of supported mime-types here:
http://www.openzim.org/ZIM_File_Format#Mime_types
Mime-types are represented by a number in a ZIM file, and the mapping is done
statically by the zimlib.
This means that you cannot store documents with a custom mime-type.
This is a problem for me because I have people who use Kiwix and deal with
other mime-types, for example: archives or binaries.
I think making this mime-type table dynamic is a necessary improvement.
I see two solutions:
* The ZIM file creator specifies the table manually during the creation process.
* The table is built automatically (also during the creation process).
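To make the second option concrete, here is a minimal sketch of a table that is filled automatically while the file is created. It is only an illustration: the class name MimeTypeTable, its methods, and the zero-terminated serialization are assumptions made for this example, not the zimlib API or the actual ZIM file layout.

```python
class MimeTypeTable:
    """Hypothetical per-file table mapping mime-type strings to numbers."""

    def __init__(self, predefined=()):
        self._types = []   # number -> mime-type string
        self._index = {}   # mime-type string -> number
        for mime in predefined:
            self.index_of(mime)

    def index_of(self, mime):
        """Return the number for a mime-type, registering it on first use."""
        mime = mime.strip().lower()
        if mime not in self._index:
            self._index[mime] = len(self._types)
            self._types.append(mime)
        return self._index[mime]

    def mime_of(self, index):
        """Return the mime-type string stored under a number."""
        return self._types[index]

    def serialize(self):
        # Assumed layout: one zero-terminated string per entry,
        # with an empty string marking the end of the list.
        body = b"".join(m.encode("utf-8") + b"\0" for m in self._types)
        return body + b"\0"

    @classmethod
    def deserialize(cls, data):
        table = cls()
        for raw in data.split(b"\0"):
            if not raw:
                break  # the empty string ends the list
            table.index_of(raw.decode("utf-8"))
        return table
```

With such a table, the writer would call index_of() for every document it adds and store the serialized table in the file, so a reader would no longer need a static mapping. The first option (manual specification) corresponds to passing the wanted list as predefined.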
--
Configure bugmail: http://bugs.openzim.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
http://bugs.openzim.org/show_bug.cgi?id=9
Summary: optimize cluster size (small devices limitations)
Product: openZIM
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P5
Component: zimwriter
AssignedTo: tommi(a)tntnet.org
ReportedBy: manuel.schneider(a)wikimedia.ch
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
Small devices don't have much memory. ZIM optimizes the compression ratio of
the article texts through built-in article clustering.
Currently the cluster size can be set at ZIM build time by the ZIM creator.
We must do some research to find the maximum cluster size at which the reader
still works on a small device, and communicate the requirements, such as
"8 MB at least". This cluster size should then be the default in zimwriter.
If we do not, a ZIM creator may choose a cluster size which does not work on
small devices, so these files cannot be read there.
http://bugs.openzim.org/show_bug.cgi?id=10
Summary: caching on low memory devices
Product: openZIM
Version: unspecified
Platform: PC
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P5
Component: zimreader
AssignedTo: tommi(a)tntnet.org
ReportedBy: mirko(a)qi-hardware.com
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
We have a problem with memory usage on the Ben NanoNote. It comes with 32MB
SDRAM. The ZimReader uses more than that. We get an "out of memory" kernel
message.
Our guess is that it is connected to the caching that is being done.
It would be great if there was an option to enable a "low memory" version so
that the ZimReader runs smoothly on devices such as the Ben.
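As an illustration of what a "low memory" option could look like, the sketch below bounds a cache by total bytes rather than by entry count and evicts the least recently used entries first. It is not the ZimReader's actual caching code: the name ClusterCache, the load callback, and the byte budget are all assumptions made for this example.

```python
from collections import OrderedDict


class ClusterCache:
    """Hypothetical LRU cache limited by the total size of cached data."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # key -> bytes, oldest entry first

    def get(self, key, load):
        """Return the data for key, calling load(key) on a cache miss."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        data = load(key)
        if len(data) > self.max_bytes:
            return data  # larger than the whole budget: do not cache it
        self._items[key] = data
        self.used_bytes += len(data)
        # Evict least recently used entries until we fit the budget again.
        while self.used_bytes > self.max_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)
        return data
```

On a 32MB device the budget would have to be a small fraction of RAM. The right value depends on the cluster size used when the file was built, which is why this is tied to the cluster size question as well.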
Hey,
I ported OpenZIM and its dependencies to OpenWrt (not committed yet) -
mainly, to get it running on the embedded device "Ben NanoNote" of
qi-hardware.
Unfortunately OpenZIM seems to consume an amount of memory the Ben isn't
able to serve (it has 32MB of RAM).
Even with a really minimalistic system (lightweight kernelconfig +
busybox-based rootfs, having just a shell) and accessing the zimreader
via a webbrowser running on another machine, OpenZIM is getting killed
by the kernel's out-of-memory killer after a few clicks.
Is there a way of configuring / adjusting the zimreader to reduce memory
consumption, so that it runs smoothly and stably on devices such as the
Ben NanoNote?
Thanks a lot in advance - great work!
Have a nice weekend,
mirko
--
This email address is used for mailinglist purposes only.
Non-mailinglist emails will be dropped automatically.
If you want to get in contact with me personally, please mail to:
mirko.vogt <at> nanl <dot> de
Hey,
is there a reliable way of getting a signal from the ZimReader, or rather
its webserver, that it started successfully?
The thing is, for now I start the ZimReader, wait a few seconds and then
start my webbrowser, hoping the ZimReader came up in time.
My idea would be something like an option to start the ZimReader in
daemon mode, where it forks (and therefore "backgrounds") once it has
come up successfully.
That way, e.g. this would be possible:
< ZimReader /foo/bar.zim && webbrowser http://127.0.0.1:8080 >
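Until such a daemon mode exists, one workaround is to poll the TCP port instead of sleeping for a fixed time. The sketch below is only that, a workaround under assumptions: wait_for_port is a hypothetical helper, not part of the ZimReader, and it only tells you that something accepts connections on the port, not that the ZIM file was loaded correctly.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.2):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the server is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not up yet: wait briefly and retry
    return False
```

A launcher script would start the ZimReader in the background, call wait_for_port("127.0.0.1", 8080), and start the webbrowser only if it returns True.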
Thanks in advance,
mirko
Hi Asaf,
On Mon 28/09/09 14:52, Asaf Bartov asaf.bartov(a)gmail.com wrote:
> In trying to retrieve the images for the Hebrew Wikipedia ZIM I'm
> making, I tried running Emmanuel's script mirrorMediawikiPages.pl. My
> command line was this:
>
> ./mirrorMediawikiPages.pl --sourceHost=he.wikipedia.org
> --destinationHost=localhost --useIncompletePagesAsInput --sourcePath=w
Bizarre... I do not have this issue on my side. Please check out a recent version from svn to be sure.
In theory this should also not happen, because at this point you have, in the worst case, only two copies of the article list in memory.
> After working for more than 20 hours, and still in the stage of
> populating the @pages with incomplete pages, it aborted with "out of
> memory". The machine has 4GB physical memory, and the last time I
> checked -- several hours before it aborted -- the script was consuming
> 3.6GB.
>
> Is there a way to do this in several large chunks, without specifying
> each individual page? How do you do it?
I do the same as you and have never had this issue.
The script's memory usage should never increase.
On my machine, with 8GB of memory, it uses almost 5%.
But if you only want to get the pictures, there is an alternative way:
* use ./listAllPages to get the list of all pages
* use ./listDependences.pl to get the missing images (you may also need to use grep to filter out a few templates)
* finally, run ./mirrorMediawikiPages.pl with your list of images on STDIN
Regards
Emmanuel
Hello again,
In trying to retrieve the images for the Hebrew Wikipedia ZIM I'm making, I
tried running Emmanuel's script *mirrorMediawikiPages.pl*. My command line
was this:
*./mirrorMediawikiPages.pl
--sourceHost=he.wikipedia.org --destinationHost=localhost
--useIncompletePagesAsInput --sourcePath=w
*
After working for more than 20 hours, and still in the stage of populating
the @pages with incomplete pages, it aborted with "out of memory". The
machine has 4GB physical memory, and the last time I checked -- several
hours before it aborted -- the script was consuming 3.6GB.
Is there a way to do this in several large chunks, without specifying each
individual page? How do you do it?
Thanks in advance,
Asaf Bartov
Wikimedia Israel
--
Asaf Bartov <asaf(a)forum2.org>
Hi, everyone.
Can someone explain what procedure you use to add (some) images to the dump
before packaging a ZIM file?
I am preparing a fresh Hebrew Wikipedia ZIM file, and would like to test the
integration of images as well as recent improvements to Kiwix.
So far, I found Wikix <http://meta.wikimedia.org/wiki/Wikix>, and no
specific instructions on including images in Emmanuel's ZIM-building script,
so I'm guessing that if images are downloaded and integrated into the local
Wikipedia server it's enough?
My questions are:
1. How do you know which images are referenced by the local Wikipedia? I
see Wikix extracts this information into a bunch of shell script files, but
maybe there's another/better way? What do you use?
2. Given a list of images, what is the best way to retrieve them without
pounding the Wikimedia servers? Is there an accepted way? Should I
coordinate it with anyone? The shell scripts generated by Wikix don't seem
to make any provision for delays or anything, and I'm afraid running them
would get me banned. Again, what do you use?
3. What if we want only the thumbnail/low-res version incorporated in the
articles themselves, and not the full resolution version from commons etc.?
4. Once you have a local tree of the image files (in directories 0, 1,
2,..., f), what else do you need to do to get Emmanuel's buildZim....pl
script to include them in the ZIM file?
Many thanks in advance,
Asaf Bartov
Wikimedia Israel
--
Asaf Bartov <asaf(a)forum2.org>
http://bugs.openzim.org/show_bug.cgi?id=4
Manuel Schneider <manuel.schneider(a)wikimedia.ch> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dev-l(a)openzim.org
AssignedTo|manuel.schneider@wikimedia. |tommi(a)tntnet.org
|ch |
Hi
I filed a new small bug report today:
http://bugs.openzim.org/show_bug.cgi?id=8
I think we do not pay enough attention to our bugzilla because we are not notified when new bugs are opened.
I propose that:
* every new entry (bug or feature request) generates a notification email to this ML,
* Tommi is the default owner for every ticket.
Regards
Emmanuel