http://bugs.openzim.org/show_bug.cgi?id=14
Summary: zim-check
Product: openZIM
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P5
Component: zimwriter
AssignedTo: tommi(a)tntnet.org
ReportedBy: emmanuel(a)engelhart.org
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
We need a way to check the quality of a ZIM file.
At least the following should be checked:
* (WARNING) has a welcome page
* (ERROR) broken local HTML links
* (WARNING) redundant content
* ...
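
The three checks listed above could be sketched roughly as follows. This is only an illustration: it assumes the articles are available as a plain mapping from article URL to HTML text and that links are already normalized to article URLs. The names `check_articles` and `welcome_url` are hypothetical and are not part of zimlib or zimwriter.

```python
import hashlib
import re
from urllib.parse import urldefrag, urlparse

# Hypothetical in-memory model: article URL -> HTML text. This is NOT the
# zimlib API; it only illustrates the three checks named in the report.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def check_articles(articles, welcome_url=None):
    """Return a list of (severity, message) tuples."""
    findings = []

    # (WARNING) has a welcome page
    if welcome_url is None or welcome_url not in articles:
        findings.append(("WARNING", "no welcome page"))

    # (ERROR) broken local HTML links
    for url, html in sorted(articles.items()):
        for href in HREF_RE.findall(html):
            target, _fragment = urldefrag(href)
            parsed = urlparse(target)
            if parsed.scheme or parsed.netloc or not target:
                continue  # external link or pure "#anchor": not checked here
            if target not in articles:
                findings.append(("ERROR", f"{url}: broken link to {target}"))

    # (WARNING) redundant content: identical bodies stored under several URLs
    seen = {}
    for url, html in sorted(articles.items()):
        digest = hashlib.sha256(html.encode("utf-8")).hexdigest()
        if digest in seen:
            findings.append(("WARNING", f"{url}: same content as {seen[digest]}"))
        else:
            seen[digest] = url
    return findings
```

A real tool would read the articles from the ZIM file itself and would need to resolve relative links; the sketch leaves both out.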
--
Configure bugmail: http://bugs.openzim.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
http://bugs.openzim.org/show_bug.cgi?id=15
Summary: zim::File getUuid() should return a hex-encoded MD5 hash
Product: openZIM
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P5
Component: zimlib
AssignedTo: tommi(a)tntnet.org
ReportedBy: emmanuel(a)engelhart.org
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
It currently returns the binary value, which is not easy to handle (standard
string manipulation functions cannot be used on it, and it cannot easily be
saved in a file).
It would be better to return the same value in hex format as a NULL-terminated
string. This would (I'm sure) save each client of zimlib from re-implementing
this conversion.
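
For illustration, the conversion each client currently has to write looks roughly like this. It is a sketch that assumes the binary value is the 16 bytes of an MD5 hash; the helper names `uuid_to_hex` and `hex_to_uuid` are made up for this example and do not exist in zimlib.

```python
import binascii


def uuid_to_hex(raw: bytes) -> str:
    """Hex-encode a 16-byte binary value as 32 lowercase hex digits."""
    if len(raw) != 16:  # assumption: the UUID is an MD5-sized value
        raise ValueError("expected 16 bytes, got %d" % len(raw))
    return binascii.hexlify(raw).decode("ascii")


def hex_to_uuid(text: str) -> bytes:
    """Inverse conversion, e.g. after reading the value back from a file."""
    return binascii.unhexlify(text)
```

The hex form contains only printable ASCII characters, so it can be stored in a text file or compared with ordinary string functions.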
http://bugs.openzim.org/show_bug.cgi?id=11
Summary: Dynamic mime-types
Product: openZIM
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P3
Component: zimlib
AssignedTo: tommi(a)tntnet.org
ReportedBy: emmanuel(a)engelhart.org
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
Currently, the ZIM format supports only a limited, predefined set of
document mime-types.
You can get the list of supported mime-types here:
http://www.openzim.org/ZIM_File_Format#Mime_types
Mime-types are represented by a number in a ZIM file, and the mapping is done
statically by the zimlib.
This means that you cannot store documents with a custom mime-type.
This is a problem for me because I have people who use Kiwix and deal with
other mime-types, for example archives or binaries.
I think making this mime-type table dynamic is a necessary improvement.
I see two solutions:
* The ZIM file creator specifies the table manually during the creation process.
* The table is built automatically (also during the creation process).
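
The second option (building the table automatically) might look like the sketch below. Everything in it is an assumption made for illustration: the class name `MimeTypeTable`, its methods, and the serialized layout (NUL-terminated strings ended by an empty string) are not taken from the current ZIM format or from zimlib.

```python
class MimeTypeTable:
    """Assigns a number to each mime-type string the first time it is seen."""

    def __init__(self, predefined=()):
        self._types = list(predefined)
        self._index = {mime: code for code, mime in enumerate(self._types)}

    def code_for(self, mime_type):
        """Writer side: return the number for a mime-type, adding it if new."""
        if mime_type not in self._index:
            self._index[mime_type] = len(self._types)
            self._types.append(mime_type)
        return self._index[mime_type]

    def mime_for(self, code):
        """Reader side: map a stored number back to its mime-type string."""
        return self._types[code]

    def serialize(self):
        """Hypothetical layout: NUL-terminated strings, then an empty string."""
        return b"".join(m.encode("utf-8") + b"\0" for m in self._types) + b"\0"

    @classmethod
    def parse(cls, data):
        types = []
        for chunk in data.split(b"\0"):
            if not chunk:  # the empty string marks the end of the table
                break
            types.append(chunk.decode("utf-8"))
        return cls(types)
```

With a table like this stored in the file, a reader no longer needs a compiled-in list, and the first option (a manually specified table) is simply the case where the creator passes the list as `predefined`.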
Hi,
I talked with the xz developer on IRC and he helped me to reduce memory usage.
The problem was that uncompressing data compressed with lzma at level 9 needs
about 65MB of RAM. The memory needed in fact depends on the compression level
used. Using level 3 reduced the memory usage to about 1.5MB, so it now works on
the NanoNote. There is an additional flag, LZMA_PRESET_EXTREME, which makes
compression slower but the compression ratio better. Using 3 + extreme results
in a file which works on the NanoNote while its size is almost identical to
that with level 9 or bzip2.
Uncompressing bzip2 chunks takes about 2 seconds while lzma needs 0.5 seconds
on the NanoNote. So we keep the size while improving speed by a factor of
4!!!
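
The effect of the preset on decoder memory can be tried out with Python's standard `lzma` module, which wraps liblzma. The sketch below is not zimlib code; the helper names `compress_chunk` and `decompress_within` are made up, and the exact memory each preset needs depends on the liblzma version in use.

```python
import lzma


def compress_chunk(data: bytes, level: int = 3, extreme: bool = True) -> bytes:
    """Compress with a numeric preset, optionally OR-ed with the extreme flag."""
    preset = (level | lzma.PRESET_EXTREME) if extreme else level
    return lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)


def decompress_within(compressed: bytes, memlimit: int):
    """Return the data, or None if decoding fails under the memory limit.

    LZMAError is also raised for corrupt input, so None only means
    "could not be decoded with this limit".
    """
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ, memlimit=memlimit)
    try:
        return decoder.decompress(compressed)
    except lzma.LZMAError:
        return None
```

Calling `decompress_within` with a limit close to the RAM of the target device gives a quick check of whether a given preset is usable there, without needing the device itself.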
Have a good Year.
Tommi
Hi
This is really good news :)
The xz lib cannot be compiled with MS cl.exe or with any other non-C99-compatible C compiler.
But an already compiled library (built with mingw) should be usable with cl.exe (please do not ask me how they manage that).
This information is in the README.
I have to test this, but I have no timeline for it yet. I hope to have time for that task during January.
Emmanuel
On Thu 31/12/09 16:32, "Manuel Schneider" manuel.schneider(a)wikimedia.ch wrote:
> What should I say?
>
> Great! Very good work done, Tommi!
>
> So does this mean we can dismiss all other compression algorithms?
>
> What about the streaming mode when uncompressing? You talked about that
> as you said we could improve memory usage on small devices.
>
> Greetings and a good year to all of us!
>
> /Manuel
Hello Lukas,
many thanks for your software http://series60-remote.sourceforge.net/ -
I installed it on my PC under Windows 7 and on my Nokia E71-1,
and everything worked right away. So you can mark this
device as "OK" on your website (see the attached screenshot).
Now to another matter:
We at openZIM would like to port our zimlib to Symbian. It
is written in C++, so we have installed the Symbian SDK in a
Windows VM on our development server.
However, I am looking for documentation, tutorials, etc., since
we have never developed for Symbian before.
Since you actively develop software for Symbian, I assume you
have the documentation and experience to give us
some pointers?
Contacts in a relevant developer community etc. would be very
valuable to us in taking the first steps in this area, which is
still completely unknown to us.
Many thanks for your help, and greetings,
Manuel Schneider
--
Regards
Manuel Schneider
Wikimedia CH - Verein zur Förderung Freien Wissens
Wikimedia CH - Association for the advancement of free knowledge
www.wikimedia.ch
Hi,
The xz (lzma) library has been ported to the NanoNote, and so has my zim
benchmark program.
Lzma seems to be more problematic than expected. Due to memory limitations
zimlib is not able to decompress any data on the NanoNote with xz. Even the
original xz command line tool fails to run. There is a memory limit parameter
in xz and I've played with it. 64M (which is far beyond the available RAM of
32M) is too little for the decompressor, and with 128M it starts but fails to
allocate memory. I sent a mail to the author asking whether there are other
tweaks to reduce memory consumption in the decompressor.
So I was not able to run any benchmarks with lzma. But I tested with bzip2.
The device needs about 2 seconds to decompress the data. With the benchmark
program it is no problem to increase the cluster cache size to 8, so that it
caches the last 8 clusters. I think a default cluster cache size of 5 would be
reasonable for that device.
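
A cluster cache of the kind described above can be sketched as a small least-recently-used cache. This is an illustration under assumptions, not the zimlib implementation: the class name `ClusterCache` and the `load_cluster` callback (cluster number in, decompressed bytes out) are hypothetical, and LRU eviction is assumed to be what "caches the last 8 clusters" means.

```python
from collections import OrderedDict


class ClusterCache:
    """Keeps the most recently used decompressed clusters in memory."""

    def __init__(self, load_cluster, max_clusters=5):
        self._load = load_cluster      # hypothetical callback: number -> bytes
        self._max = max_clusters       # 5 is the default proposed in the mail
        self._cache = OrderedDict()
        self.misses = 0                # how often a cluster had to be decompressed

    def get(self, number):
        if number in self._cache:
            self._cache.move_to_end(number)   # mark as most recently used
            return self._cache[number]
        self.misses += 1
        data = self._load(number)             # the slow step (about 2 s with bzip2)
        self._cache[number] = data
        if len(self._cache) > self._max:
            self._cache.popitem(last=False)   # evict the least recently used
        return data
```

Each cached cluster costs its uncompressed size in RAM, so on a 32M device the cache size is a trade-off between memory and repeated decompression.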
Tommi
It seems that this is actually an unsolvable problem.
So we are stuck with some drawback whichever way we go.
I think we can agree on our goal: using lzma in the future.
So we have to think of a scenario in which the drawback is less likely to become a problem.
If someone creates an alternative implementation, he will use a library anyway and will not implement the algorithm himself.
And my hope is that xz is already portable (actually, I can't think of a reason why it shouldn't be, as it is less hardware-dependent than anything else), so the transition period will be quite short and will hurt less than the compatibility break we made just now.
In the end we have to look at what we have: zimreader on Unices, Kiwix on Windows and the Ben NanoNote. If we get it running there it will be fine and usable. That way the most-used platforms are covered, and when porting to further platforms we or the adopters will have bigger problems to solve in porting zimlib than in porting xz.
So I am perfectly comfortable with that solution.
Could someone from OpenWRT, Qi Hardware and Wikimedia speak up if you see any other problems?
/Manuel
-- Original message --
Subject: Re: [openZIM dev-l] Update zim file format in trunk
From: Tommi Mäkitalo <tommi(a)tntnet.org>
Date: 26.12.2009 20:55
On Saturday, 26 December 2009 19:46:36, Manuel Schneider wrote:
> Hi,
>
> I appreciate seeing this enthusiastic discussion.
>
> My suggestion is:
> * we have lzma and bzip2 support in zimlib
> * the zimreader can use both algorithms
> * as long as we haven't approved lzma to work on all platforms and systems,
>   the zimwriter uses bzip2 by default
> * as soon as we can approve lzma we switch the zimwriter's default to lzma
> * for experiments, development, porting etc. someone who knows what he is
>   doing can use lzma anyway
> * after the approval of lzma we will wait for some time until we are sure
>   that no more bzip2-compressed files are being made, then we drop
>   bzip2 support completely
>
> This way we can make the adoption of lzma smooth. The reader will still be
> backwards compatible during the transition period.
>
>
> Have a nice Christmas,
>
> /Manuel
>
Sounds reasonable, but the disadvantage is that it will be more difficult to
create an alternative implementation, at least in the transition period. If
someone wants to create a zim implementation in C# or Java, he must implement
both decompression algorithms. Otherwise he will not be able to read all zim
files.
Tommi
_______________________________________________
dev-l mailing list
dev-l(a)openzim.org
https://intern.openzim.org/mailman/listinfo/dev-l
Hi,
I appreciate seeing this enthusiastic discussion.
My suggestion is:
* we have lzma and bzip2 support in zimlib
* the zimreader can use both algorithms
* as long as we haven't approved lzma to work on all platforms and systems, the zimwriter uses bzip2 by default
* as soon as we can approve lzma we switch the zimwriter's default to lzma
* for experiments, development, porting etc. someone who knows what he is doing can use lzma anyway
* after the approval of lzma we will wait for some time until we are sure that no more bzip2-compressed files are being made, then we drop bzip2 support completely
This way we can make the adoption of lzma smooth. The reader will still be backwards compatible during the transition period.
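
The suggestion above amounts to a reader that accepts both algorithms and a writer with a switchable default. A minimal sketch, assuming each cluster carries a tag naming its compression; the tags `COMP_BZIP2` and `COMP_LZMA` and the functions below are illustrative only and say nothing about the real on-disk values or the zimlib API.

```python
import bz2
import lzma

# Illustrative tags only; the real on-disk values are defined by the ZIM format.
COMP_BZIP2 = "bzip2"
COMP_LZMA = "lzma"

COMPRESSORS = {COMP_BZIP2: bz2.compress, COMP_LZMA: lzma.compress}
DECOMPRESSORS = {COMP_BZIP2: bz2.decompress, COMP_LZMA: lzma.decompress}

# Writer default during the transition; switch to COMP_LZMA once lzma is approved.
WRITER_DEFAULT = COMP_BZIP2


def write_cluster(data, compression=WRITER_DEFAULT):
    """Writer side: return (tag, payload); lzma can be chosen explicitly."""
    return compression, COMPRESSORS[compression](data)


def read_cluster(compression, payload):
    """Reader side: handles both algorithms during the transition period."""
    try:
        decompress = DECOMPRESSORS[compression]
    except KeyError:
        raise ValueError(f"unsupported compression: {compression!r}") from None
    return decompress(payload)
```

Dropping bzip2 support at the end of the transition then means removing one entry from each table.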
Have a nice Christmas,
/Manuel
-- Original message --
Subject: Re: [openZIM dev-l] Update zim file format in trunk
From: Tommi Mäkitalo <tommi(a)tntnet.org>
Date: 26.12.2009 19:35
On Saturday, 26 December 2009 18:49:20, Emmanuel Engelhart wrote:
> Tommi Mäkitalo wrote:
> > On Saturday, 26 December 2009 14:19:26, Emmanuel Engelhart wrote:
> >> Tommi Mäkitalo wrote:
> >
> > ...
> >
> >>> The next step is to remove support for zlib and bzip2.
> >>>
> >>> Currently I'm working on porting the new file format to the nano note.
> >>> Unfortunately I'm stuck with porting xz (the lzma-library) to openwrt.
> >>> There are some difficulties there, but I'm sure, they can be resolved.
> >>> I have already asked for help at the openwrt developer list but have
> >>> not yet received any answers.
> >>
> >> That's why I asked to be careful about removing bzip2 and/or zlib.
> >> It's essential to have a working solution for every arch/system at
> >> any moment. For example, I have not checked whether xz works well on
> >> Windows.
> >>
> >> Emmanuel
> >
> > We want to define a standard format. And if you create zim files with
> > lzma compression we need to make sure we can read the file on every
> > system we support. It makes no sense to have some zim files with bzip2
> > compression, which work on Windows, and other zim files with lzma
> > compression, which don't. We have to commit ourselves to one compression
> > method to prevent fragmentation. Or do you want to offer zim files for
> > Ubuntu and zim files for Windows and zim files for Fedora and zim files
> > for FreeBSD?
>
> I prefer having *now* a solution with 3 compression algorithms, knowing
> that one of them does not always work, to having a format which is
> totally unusable in a few use cases.
>
> So, we will try to have lzma working everywhere... but as long as this
> is not certain, it would be preferable IMO to be able to say "look, you can
> use bzip2" rather than "sorry, zimlib is not for you, it does not currently
> work for your use case".
>
> We have to choose our disadvantage:
> * Removing bzip2 and running the risk of discovering portability issues
> * Keeping the bzip2/gzip dependency a little bit longer
>
> ... in any case this is not so critical... in the worst case we will
> have to work hard to fix the portability issues... and I hope we will,
> because we want the zimlib to be portable.
>
> Emmanuel
>
How do you want to create your zim files? With lzma? With bzip2? With zlib? If
you sometimes use one and sometimes the other, your zim files will work
sometimes on one platform and sometimes on the other. Either we use lzma or we
don't. Otherwise we create lzma-compressed zim files which do not work
everywhere.
If you offer lzma-compressed zim files for download, not dropping bzip2 support
does not solve any portability issue. We just get a zimlib which does not work
with your zim files. And that's really bad.
We should work on the portability of xz. From what I have read, xz works on
Windows. It can be compiled using mingw. So this should not really be an issue.
Tommi
Hi,
Merry Christmas to you all.
I would like to move development of the new file format (zim5) from the
zim2-branch to trunk. I would tag the current trunk under
svn+ssh://openzim.org/svnroot/tags/zim4 and then merge the zim2-branch to
trunk. The trunk code won't be able to read the old zim files any more. You
will have to recreate your files.
Any objections?
The next step is to remove support for zlib and bzip2.
Currently I'm working on porting the new file format to the NanoNote.
Unfortunately I'm stuck with porting xz (the lzma library) to openwrt. There
are some difficulties, but I'm sure they can be resolved. I have already
asked for help on the openwrt developer list but have not yet received any
answers.
Tommi