http://bugs.openzim.org/show_bug.cgi?id=18
Summary: debian needs an init.d script
Product: openZIM
Version: unspecified
Platform: PC
OS/Version: Linux
Status: NEW
Severity: enhancement
Priority: P5
Component: zimreader
AssignedTo: tommi(a)tntnet.org
ReportedBy: andyr(a)wizzy.com
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
One attached.
--
Configure bugmail: http://bugs.openzim.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
http://bugs.openzim.org/show_bug.cgi?id=21
Summary: Zimlib should allow getting/unpacking only a part of a
content
Product: openZIM
Version: unspecified
Platform: PC
OS/Version: Windows
Status: NEW
Severity: enhancement
Priority: P5
Component: zimlib
AssignedTo: tommi(a)tntnet.org
ReportedBy: emmanuel(a)engelhart.org
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
The zimlib needs to fully unpack a content item before delivering it to
third-party software.
This has many disadvantages, especially if the content is big, a video for
example:
* It needs a lot of memory
* It takes time
* There is no random access (necessary to seek in an HTML5 video)
It would be a real usability improvement to have a method which delivers
only a part of any content.
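As a rough illustration of the request, the sketch below streams a zlib-compressed blob and returns only a requested byte range, stopping as soon as the range is complete. It is written in Python rather than against zimlib; `read_range` and its parameters are hypothetical names, and zlib is only a stand-in for whatever compression the ZIM content actually uses.

```python
import zlib


def read_range(compressed: bytes, offset: int, length: int, chunk_size: int = 4096) -> bytes:
    """Return `length` decompressed bytes starting at `offset`.

    Hypothetical helper, not part of zimlib. It decompresses incrementally
    and stops once the requested range is complete, so the whole content
    never has to sit in memory at once.
    """
    decomp = zlib.decompressobj()
    out = bytearray()
    produced = 0  # number of decompressed bytes seen so far
    end = offset + length
    for start in range(0, len(compressed), chunk_size):
        data = decomp.decompress(compressed[start:start + chunk_size])
        # Intersect this piece, which covers [produced, produced + len(data)),
        # with the requested range [offset, end).
        lo = max(offset - produced, 0)
        hi = min(end - produced, len(data))
        if hi > lo:
            out += data[lo:hi]
        produced += len(data)
        if produced >= end:
            break  # everything after the range is never decompressed
    return bytes(out)
```

Note that this still decompresses everything before `offset`, so it addresses the memory problem more than the time problem; cheap seeking would additionally need some kind of index or smaller independently compressed blocks.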
Hi everyone, we need your help.
We are from Python Argentina, and we are working on adapting our
cdpedia project to make a DVD together with educ.ar and Wikimedia
Foundation, holding the entire Spanish Wikipedia that will be sent
soon to Argentinian schools.
Hernán and Diego are the two interns tasked with updating the data
that cdpedia uses to make the CD (it currently uses a static HTML dump
dated June 2008), but they are encountering some problems while trying
to make an up-to-date static HTML es-wikipedia dump.
I'm ccing this list of people, because I'm sure you've faced similar
issues when making your offline wikipedias, or because maybe you know
someone who can help us.
Following is an email from Hernán describing the problems he's found.
thanks!
--
alecu - Python Argentina
2010/4/30 Hernan Olivera <lholivera(a)gmail.com>:
Hi everybody,
I've been working on making an up-to-date static HTML dump of the
Spanish Wikipedia, to use as a basis for the DVD.
I've followed the procedures detailed in the pages below, which were
used to generate the current (and out-of-date) static HTML dumps:
1) installing and setting up a MediaWiki instance
2) importing the XML from [6] with mwdumper
3) exporting the static HTML with MediaWiki's tool
The procedure finishes without throwing any errors, but the XML import
produces malformed HTML pages that have visible wikimarkup.
We really need a successful import of the Spanish XML dumps
into a MediaWiki instance so we can produce the up-to-date static HTML
dump.
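One way to quantify the problem across a large dump would be to scan the exported pages for markup that should never survive rendering. The following is only a sketch in Python: `leftover_markup` and the pattern list are hypothetical, and the patterns are a guess at what "visible wikimarkup" looks like, not a complete list.

```python
import re
from html.parser import HTMLParser

# Assumed indicators of unrendered wikitext: template braces, wikilink
# brackets and bold quotes. Adjust to whatever actually shows up.
WIKIMARKUP = re.compile(r"\{\{|\}\}|\[\[|\]\]|'''")


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script>, <style> and <pre> contents."""

    SKIP = {"script", "style", "pre"}

    def __init__(self) -> None:
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def leftover_markup(html: str) -> list:
    """Return every wikimarkup token found in the visible text of a page."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return WIKIMARKUP.findall("".join(parser.chunks))
```

Running this over each exported file and counting the pages with a non-empty result would give a rough measure of how much of the import is affected.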
Links to the info I used:
[0] http://www.mediawiki.org/wiki/Manual:Installation_guide/es
[1] http://www.mediawiki.org/wiki/Manual:Running_MediaWiki_on_Ubuntu
[2] http://en.wikipedia.org/wiki/Wikipedia_database
[3] http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps
[4] http://meta.wikimedia.org/wiki/Importing_a_Wikipedia_database_dump_into_Med…
[5] http://meta.wikimedia.org/wiki/Data_dumps
[6] http://dumps.wikimedia.org/eswiki/20100331/
[7] http://www.mediawiki.org/wiki/Alternative_parsers
(among others)
Cheers,
--
Hernan Olivera
PS: unfortunately I didn't write down every step in detail. I did a lot
more tests than what I wrote here. To make a detailed report I'd like
to go through the procedure again, writing down every option (and checking
whether I missed something). I'm finishing installing a server just for
this, because these processes take forever and they blocked other tasks
while I was running these tests.
2009/10/23 Samuel Klein <meta.sj(a)gmail.com>:
> Jimbo - thanks for the spur to clean up the existing work.
>
> All - Let's start by cleaning up the mailing lists and setting a few
> short-term goals :-) It's a good sign that we have both charity and love
> converging to make something happen.
>
> * For all-platform all-purpose wikireaders, let's use
> offline-l(a)lists.wikimedia, as we discussed a month ago in the aftermath of
> Wikimania (Erik, were you going to set this up? I think we agreed to
> deprecate wiki-offline-reader-l and replace it with offline-l.)
>
> * For wikireaders such as WikiBrowse and Infoslicer on the XO, please
> continue to use wikireader(a)lists.laptop
>
>
> I would like to see WikiBrowse become the 'sugarized' version of a reader
> that combines the best of that and the openZim work. A standalone DVD or
> USB drive that comes with its own search tools would be another version of
> the same. As far as merging codebases goes, I don't think the WikiBrowse
> developers are invested in the name.
>
> I think we have a good first cut at selecting articles, weeding out stubs,
> and including thumbnail images. Maybe someone working on openZim can
> suggest how to merge the search processes, and that file format seems
> unambiguously better.
>
> Kul - perhaps part of the work you've been helping along for standalone
> usb-key snapshots would be useful here.
>
>
> Please continue to update this page with your thoughts and progress!
> http://meta.wikimedia.org/wiki/Offline_readers
>
> SJ
>
>
> 2009/10/23 Iris Fernández <irisfernandez(a)gmail.com>
>>
>> On Fri, Oct 23, 2009 at 1:37 PM, Jimmy Wales <jwales(a)wikia-inc.com> wrote:
>> >
>> > My dream is quite simple: a DVD that can be shipped to millions of
>> > people with an all-free-software solution for reading Wikipedia in Spanish.
>> > It should have a decent search solution, doesn't have to be perfect, but it
>> > should be full-text. It should be reasonably fast, but super-perfect is not
>> > a consideration.
>> >
>>
>> Hello! I am an educator, not a programmer. I can help selecting
>> articles or developing categories related to school issues.
>
> Iris - you know the main page of WikiBrowse that you see when the reader
> first loads? You could help with a new version of that page. Madeleine
> (copied here) worked on the first one, but your thoughts on improving it
> would be welcome.
>
>
>
Hi,
I've thought about zimwriter and also talked about it at LinuxTag a little.
The zimwriter has an internal plug-in interface, which separates the source of
data from the generation of zim files. There are some implementations of the
source interface: the database source, which reads articles from a
database; the full text indexer, which gets the data by reading and indexing a
zim file; and one implementation which creates a zim file from a search result.
There is still a need for more implementations. We need at least a zim file
creator which merges two zim files, and also a creator which reads data from a
directory in the file system. The latter would make the Perl script from
Emmanuel, which writes the files to the database, obsolete.
The user has to tell the zimwriter what he wants to do. I can keep on adding
options and features to the zimwriter, but then
zimwriter will get bigger and bigger and accumulate more and more options.
My idea is to move the functionality to write a zim file to a library,
libzimwriter, and write separate programs for each source implementation. So to
write a zim file from the database, the user has to use zimwriterdb. To create a
full text index we use zimindexer. To merge zim files we have zimmerge. All of
them are quite simple programs, which just interface to libzimwriter.
As an additional benefit we reduce the dependency on tntdb. Only zimwriterdb
needs tntdb. All other tools do not.
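The proposed split could look roughly like the following. This is a Python sketch of the idea only, not the real C++ interface: `ArticleSource`, `DirectorySource`, `MergeSource` and `write_archive` are hypothetical names, and `write_archive` merely collects articles instead of producing a real zim file.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Iterator, List, Tuple

Article = Tuple[str, bytes]  # (url, content); a simplified stand-in


class ArticleSource(ABC):
    """The interface each small front-end program would implement."""

    @abstractmethod
    def articles(self) -> Iterator[Article]:
        ...


class DirectorySource(ArticleSource):
    """Reads every regular file below a directory (the file-system creator)."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def articles(self) -> Iterator[Article]:
        for path in sorted(self.root.rglob("*")):
            if path.is_file():
                yield path.relative_to(self.root).as_posix(), path.read_bytes()


class MergeSource(ArticleSource):
    """Chains several sources; a URL already seen is skipped (zimmerge stand-in)."""

    def __init__(self, sources: List[ArticleSource]) -> None:
        self.sources = sources

    def articles(self) -> Iterator[Article]:
        seen = set()
        for source in self.sources:
            for url, content in source.articles():
                if url not in seen:
                    seen.add(url)
                    yield url, content


def write_archive(source: ArticleSource) -> List[Article]:
    """Stand-in for the shared library: it only ever sees the interface.

    A real libzimwriter would serialise to the ZIM format; this sketch just
    returns the articles sorted by URL.
    """
    return sorted(source.articles())
```

Each tool then reduces to constructing one source and handing it to the library, and only the database front end would need to link against tntdb.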
The only problem I see is that I have to break the current command line
interface. I feel that the price is cheap.
Any opinions?
Tommi
On Sat 19/06/10 21:22, "Tommi Mäkitalo" tommi(a)tntnet.org wrote:
> The user has to tell the zimwriter what he wants to do. I can keep on
> adding options and features to the zimwriter.
> Zimwriter will get bigger and bigger and accumulate more and more options.
> My idea is to move the functionality to write a zim file to a library,
> libzimwriter, and write separate programs for each source implementation.
Ok
> So to write a zim file from the database, the user has to use zimwriterdb. To
> create a full text index we use zimindexer. To merge zim files we have zimmerge. All of
> them are quite simple programs, which just interface to
> libzimwriter.
Ok
> As an additional benefit we reduce the dependency on tntdb. Only
> zimwriterdb needs tntdb. All other tools do not.
>
> The only problem I see is that I have to break the current command line
> interface. I feel that the price is cheap.
Ok, please keep us informed about the details, so that I can keep my toolchain up to date.
Thank you
Emmanuel
Hi all,
Three days of LinuxTag are almost over; tomorrow is the last day.
It was very interesting: many people who already knew about openZIM
came by specifically to talk to us. That was very surprising
for me.
Tomorrow there will be a workshop by Tommi and me, 13:00 - 14:00 in room
"New York 2", so if you know people in the area, spread the word.
These days I have also tried to look for new adopters and other
interested parties, so I talked to SugarLabs (the software developers
for the XO computers / OLPC), I visited the WeTab presentation and had
several discussions about Android (but with few results).
Unfortunately Linux4Africa is not present this year, just when I had
managed to get our booth placed next to these educational projects.
SkoleLinux is still on my list.
Tweets:
http://www.twingly.com/search?q=%23openzim&sort=published&content=microblog
WeTab: http://wetab.mobi/
My Notes: http://openzim.org/Talk:LinuxTag_2010
Regards,
/Manuel
--
Regards
Manuel Schneider
Wikimedia CH - Verein zur Förderung Freien Wissens
Wikimedia CH - Association for the advancement of free knowledge
www.wikimedia.ch
On Sun 06/06/10 16:29, "Manuel Schneider" manuel.schneider(a)wikimedia.ch wrote:
> LinuxTag is at the gates; we will meet on Tuesday afternoon in Berlin to
> check-in and set up the booth.
> See you there!
Manuel and Tommi, I wish you a good time in Berlin.
I am not sure you will have many visitors who speak Chinese, but last night I finished a first ZIM file with the full Wikipedia in Chinese. You can download it here:
http://tmp.kiwix.org/zim/0.9/wikipedia_all_zh_300000+_05_2010_alpha1.zim
... or have a look at the kiwix-serve online demo here:
http://library.kiwix.org:4209
Cheers
Emmanuel
Dear all!
LinuxTag is at the gates; we will meet on Tuesday afternoon in Berlin to
check in and set up the booth.
Our booth will be #215 in hall 7.2a. There will also be a workshop on
Saturday (June 12th) in room "New York 2" from 1pm to 2pm.
The quickest way to get to the fairground or our hotel (Pension
Messe) by train is:
* leave the ICE at Berlin Spandau
* take any S-Bahn or RE in the direction of Berlin center (e.g. S3, S9 or RE
to Jüterbog)
* RE: leave at Jungfernheide, then take S42 south-bound
* S-Bahn: leave at Westkreuz, then take S41 north-bound
* leave at Messe Nord/ICC
This journey is included in DB tickets with destination code "BERLIN
+City". You don't have to buy tickets for local transportation.
openZIM will provide you with tickets for subsequent rides during the
LinuxTag week.
The rooms are reserved in your names and already paid! The booking reference
is PM3533533.
You might want to add your arrival to our travel coordination page at
https://openzim.org/LinuxTag_2010#Participants_.2F_Itineraries
I suggest that we meet at the hotel and then go to the fairground
quickly to get the setup done.
See you there!
/Manuel
On Thu 03/06/10 18:59, Tommi Mäkitalo tommi(a)tntnet.org wrote:
> Solution 1 really has the advantage that the user can download a
> single zim file and split it himself when needed. There is even a unix/linux tool to
> split files into pieces, named split (isn't it nice how intuitive unix
> really is ;-) ).
This seems to me to be the best solution to the 4GB file size limit of FAT32. But it does
not address the problem of ZIM files with optional pictures, etc. It is maybe
better to consider these as two different topics.
> It is quite easy to extend the iostream to support multiple files, so that
> it internally joins the files into one zim file. We just have to think about
> the interface, how to tell zimlib which files to join.
>
> As you suggested, a naming convention is one possible solution. We may even
> use the scheme from split. So if you split foo.zim into parts, the parts are
> named foo.zimaa, foo.zimab, foo.zimac and so on. If you tell zimlib to open file
> foo.zim and it is not found, it looks for the parts until it does not find
> any more.
I think a naming convention is not the best solution; I do not like the idea of
zimlib depending on file names. ZIM file names may change for a thousand
reasons.
What about:
* a zim::file constructor that also accepts a directory name and, if given a directory, loads all files inside and merges them (easy for the app developer, but a little bit tricky)
* a zim::file constructor that accepts a list of ZIM file chunks (less handy for the app developer, but clean)
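To make the two approaches concrete, here is a small Python sketch (zimlib itself is C++, so this only illustrates the logic). `split_parts` implements the naming-convention lookup using the default two-letter suffixes of split, and `ChunkedReader` is the "list of chunks" variant; both names are hypothetical and not part of any zimlib API.

```python
import io
import itertools
import string
from pathlib import Path
from typing import List


def split_parts(base: Path) -> List[Path]:
    """Find foo.zimaa, foo.zimab, ... next to `base`, stopping at the first gap."""
    parts = []
    for a, b in itertools.product(string.ascii_lowercase, repeat=2):
        part = base.with_name(base.name + a + b)
        if not part.is_file():
            break
        parts.append(part)
    return parts


class ChunkedReader(io.RawIOBase):
    """Read-only, seekable view over several chunk files as one stream."""

    def __init__(self, chunks: List[Path]) -> None:
        super().__init__()
        self._chunks = list(chunks)
        self._sizes = [p.stat().st_size for p in self._chunks]
        self._pos = 0

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return True

    def tell(self) -> int:
        return self._pos

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        total = sum(self._sizes)
        base = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos, io.SEEK_END: total}[whence]
        self._pos = max(0, base + offset)
        return self._pos

    def readinto(self, buffer) -> int:
        # Locate the chunk containing the current position and read from it.
        # A read never crosses a chunk boundary; callers get a short read
        # and simply ask again (io.BufferedReader does this for us).
        start = 0
        for path, size in zip(self._chunks, self._sizes):
            if self._pos < start + size:
                with path.open("rb") as fh:
                    fh.seek(self._pos - start)
                    data = fh.read(len(buffer))
                buffer[:len(data)] = data
                self._pos += len(data)
                return len(data)
            start += size
        return 0  # past the end of the last chunk
```

With this separation the reader itself never looks at file names: an application can pass any list of chunks, and the suffix lookup (or a directory listing) becomes an optional convenience on top.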
Additional question: is NTFS not widely supported?
Emmanuel
Hi,
as a member of the Linux Foundation I just received a newsletter which
pointed me at an article on Linux.com concerning the Ben NanoNote.
I was very happy to find out that, among many other things in this
detailed article, Vido was mentioned as a Wikipedia reader.
Here is the article:
http://www.linux.com/news/embedded-mobile/netbooks/296251:a-review-ben-nano…
See the paragraph "Software":
... A wide variety of user applications are under development by the
community, including some that lean towards embedded device usage (such
as the Rockbox digital music player and Vido offline-Wikipedia reader)
and some flashier options, such as Quake and Doom.
As always I have put a link to this article in our media section on our
main page.
Thanks for your work!
/Manuel