Emmanuel,
maybe you can help him with a ZIM file?
/Manuel
-------- Original Message --------
Subject: [Toolserver-l] Static dump of German Wikipedia
Date: Fri, 24 Sep 2010 01:57:55 +0200
From: Marco Schuster <marco(a)harddisk.is-a-geek.org>
Reply-To: toolserver-l(a)lists.wikimedia.org
To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>,
toolserver-l <toolserver-l(a)lists.wikimedia.org>
Hi all,
I have made a list of all the 1.9M articles in NS0 (including
redirects / short pages) using the Toolserver; now that I have the list,
I'm going to download every single one of 'em and then publish a static
dump of it. (Tonight is a trial run; I want to see how this works out.
I'd like to begin downloading the whole thing in 3 or 4 days, if no one
objects.) Data collection will happen on the Toolserver
(/mnt/user-store/dewiki-static/articles/); the request rate will be 1
article per second, and I'll copy the new files to my home PC once or
twice a day, so there should be no problem with the TS or Wikimedia
server load.
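For illustration only, a minimal sketch of the kind of fetch loop described
above, assuming a titles.txt with one NS0 title per line and the
action=render endpoint mentioned in the PS below; paths and file naming are
placeholders, not Marco's actual script:

import time
import urllib.parse
import urllib.request

OUT_DIR = "/mnt/user-store/dewiki-static/articles/"
BASE = "http://de.wikipedia.org/w/index.php?action=render&title="

with open("titles.txt", encoding="utf-8") as titles:
    for title in titles:
        title = title.strip()
        if not title:
            continue
        # Fetch the rendered article body only (no skin chrome).
        url = BASE + urllib.parse.quote(title.replace(" ", "_"))
        html = urllib.request.urlopen(url).read()
        out = OUT_DIR + urllib.parse.quote(title, safe="") + ".html"
        with open(out, "wb") as f:
            f.write(html)
        time.sleep(1)  # keep the request rate at one article per second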
When this is finished in ~21-22 days, I'm going to compress the articles
and upload them as a tgz file to my private server (well, if Wikimedia
has an archive server, that'd be better) so others can play with it.
Furthermore, though I have no idea if I'll succeed, I plan on hacking
a static Vector skin file which will load the articles using jQuery's
excellent .load() feature, so that everyone with JS can enjoy a truly
offline Wikipedia.
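A hand-wavy sketch of that idea (not the actual skin work): a script that
writes a bare-bones HTML shell whose content area is filled client-side with
jQuery's .load(); the markup is a stand-in for the real Vector skin, and it
assumes articles saved under articles/<URL-encoded title>.html as in the
sketch above:

SHELL = """<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Offline Wikipedia</title>
  <script src="jquery.min.js"></script>
  <script>
    function showArticle(title) {
      // Pull the pre-rendered article fragment into the content area.
      $("#content").load("articles/" + encodeURIComponent(title) + ".html");
    }
  </script>
</head>
<body onload="showArticle('Wikipedia')">
  <div id="content">Loading...</div>
</body>
</html>
"""

with open("index.html", "w", encoding="utf-8") as f:
    f.write(SHELL)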
Marco
PS: When trying to invoke /w/index.php?action=render with an invalid
oldid, the server returns HTTP/1.1 200 OK and an error message, but
shouldn't this be a 404 or 500?
--
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
Someone created a MediaWiki parser written in C - please see the mail below.
Greetings from Linux-Kongress in Nürnberg,
/Manuel
Sent via mobile phone.
-- Original Message --
Subject: [Wikitech-l] Parser implementation for MediaWiki syntax
From: Andreas Jonsson <andreas.jonsson(a)kreablo.se>
Date: 23.09.2010 11:28
Hi,
I have written a parser for MediaWiki syntax and have set up a test
site for it here:
http://libmwparser.kreablo.se/index.php/Libmwparsertest
and the source code is available here:
http://svn.wikimedia.org/svnroot/mediawiki/trunk/parsers/libmwparser
A preprocessor will take care of parser functions, magic words,
comment removal, and transclusion. But as it wasn't possible to
cleanly separate these functions from the existing preprocessor, some
preprocessing is disabled at the test site. It should be
straightforward to write a new preprocessor that provides only the required
functionality, however.
The parser is not feature complete, but the hard parts are solved. I
consider "the hard parts" to be:
* parsing apostrophes
* parsing html mixed with wikitext
* parsing headings and links
* parsing image links
And when I say "solved" I mean producing the same or equivalent output
as the original parser, as long as the behavior of the original parser
is well defined and produces valid html.
Here is a schematic overview of the design:
+-----------------------+
|                       |  Wikitext
|  client application   +----------------------------------------+
|                       |                                        |
+-----------------------+                                        |
            ^                                                    |
            | Event stream                                       |
+-----------+-----------+        +-------------------------+     |
|                       |        |                         |     |
|    parser context     |<------>|         Parser          |     |
|                       |        |                         |     |
+-----------------------+        +-------------------------+     |
                                              ^                  |
                                              | Token stream     |
+-----------------------+        +------------+------------+     |
|                       |        |                         |     |
|     lexer context     |<------>|          Lexer          |<----+
|                       |        |                         |
+-----------------------+        +-------------------------+
The design is described in more detail in a series of posts on the
wikitext-l mailing list. The most important "trick" is to make sure
that the lexer never produces a spurious token. An end token for a
production will not appear unless the corresponding begin token has
already been produced, and the lexer maintains a block context so that
it only produces tokens that make sense in the current block.
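As a rough illustration (not the libmwparser code), that discipline can be
pictured like this: an end token is emitted only if its begin token was
emitted earlier, and tokens are suppressed when they make no sense in the
current block context; names and structure here are invented for the sketch:

class TokenStream:
    def __init__(self):
        self.open = []            # productions whose BEGIN has been emitted
        self.block = "paragraph"  # current block context, e.g. paragraph/table
        self.tokens = []

    def begin(self, production, allowed_blocks):
        if self.block not in allowed_blocks:
            return False          # would be spurious here, so emit nothing
        self.open.append(production)
        self.tokens.append(("BEGIN", production))
        return True

    def end(self, production):
        if production not in self.open:
            return False          # no matching BEGIN was ever emitted
        self.open.remove(production)
        self.tokens.append(("END", production))
        return True

ts = TokenStream()
ts.begin("italic", allowed_blocks={"paragraph", "table"})
ts.end("bold")    # ignored: "bold" was never opened
ts.end("italic")  # fine: matches the earlier BEGIN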
I have used Antlr to generate both the parser and the lexer, as Antlr
supports semantic predicates that can be used for context-sensitive
parsing. I am also using a slightly patched version of Antlr's C
runtime environment, because the lexer needs to support speculative
execution in order to do context-sensitive lookahead.
A Swig-generated interface is used for providing the PHP API. The
parser processes the buffer of the PHP string directly, and writes its
output to an array of PHP strings. Only UTF-8 is supported at the
moment.
The performance seems to be about the same as the original parser's on
plain text. But as the amount of markup increases, the original parser
runs slower, while this new implementation maintains roughly the same
performance regardless of input.
I think that this demonstrates the feasibility of replacing the
MediaWiki parser. There is still a lot of work to do to turn it into a
full replacement, however.
Best regards,
Andreas
Good morning.
First of all I would like to introduce myself. My name is Wilfredo
Rodriguez, and I am writing from Venezuela.
I am new to openZIM; I am just starting to get into the world of
delivering multimedia content.
I have been using openZIM for some time. I think it is important to
standardize the format, as this will benefit all stakeholders.
I do not like talking about myself, but in this case it is necessary.
I have 14 years of programming experience. I specialize in Web 2.0
systems, and I have a good level of (in order of experience)
C/C++/Ruby/Java/PHP/Perl/Lisp and many others. I can also do fairly
good graphic design work.
I understand English and French very well.
I am at your service, and I hope to be of assistance in whatever you need.
--
Lcdo. Wilfredo Rafael Rodríguez Hernández
--------------------------------------------------------
msn,googletalk = wilfredor(a)gmail.com
cv = http://www.wilfredor.co.cc
blog = http://wilfredor.blogspot.com
fotos = http://picasaweb.google.com/wilfredor/
Hello, everyone.
In the last IRC meeting, I offered to take a shot at porting the LZMA2
decoder from ANSI C to Java, for the benefit of Okawix's effort to ship ZIM
support for the Android platform.
As Tommi suspected, it is indeed a little complicated.
I am, however, making progress, and am committed to finishing the job.
However, I have many constraints on my time, and I *cannot* commit to a
deadline.
Nevertheless, just in case I _am_ able to make it, what is the *latest* you
guys at Okawix can receive a working LZMA2 Decoder and still ship on time?
If I can't deliver by that deadline, I'll still finish the job; it will
just wait for a future release, and in the meantime it may come in handy
for other people who need LZMA2 in Java.
There's _nothing working_ to see yet, but if you like, you can follow the
development on GitHub:
http://github.com/abartov/LZMA2-java
I'll post to this list anyhow once I
have something working.
Cheers,
Asaf
--
Asaf Bartov <asaf.bartov(a)gmail.com>
FYI.
Asaf
---------- Forwarded message ----------
From: Asaf Bartov <asaf.bartov(a)gmail.com>
Date: 2010/9/16
Subject: Re: [Internal-l] 1M Entries of Free Knowledge Go To Africa
Thanks for the congratulations.
We will be sure to share some photos and blog posts by the students once
those become available, in the coming months.
In the meantime, some of you have asked for a linkable version of our press
release, so here's a link to a wikified version:
http://wikimedia.org.il/Press_release:Wikipedia_Goes_To_Africa
Cheers,
Asaf Bartov
Wikimedia Israel
2010/9/15 KIZU Naoko <aphaia(a)gmail.com>
> Thank you, Asaf, for letting us share the news, and congrats on
> launching the project! I'm looking forward to seeing it succeed -
> possibly you'll give us updates in Haifa next year? :)
>
> --
Asaf Bartov <asaf.bartov(a)gmail.com>
This is a report on how and where ZIM is used.
It's nothing directly related to our development, but I'd like you to
know about it and share it as a heads-up ;-)
/Manuel
-------- Original Message --------
Subject: [Devnations-l] Fwd: 1M Entries of Free Knowledge Go To Africa
Date: Wed, 15 Sep 2010 20:15:10 +0200
From: Asaf Bartov <asaf.bartov(a)gmail.com>
Reply-To: Growing Wikimedia in developing nations
<devnations-l(a)lists.wikimedia.org>
To: devnations-l(a)lists.wikimedia.org
Hello, everyone.
Below is a report I just sent about Wikimedia Israel's recent work to
spread free knowledge in Africa. I bring it to your attention with
particular emphasis on the *means*, as I consider it an excellent
example of what we had identified as an effective approach in the
Proposed Agenda document
<http://meta.wikimedia.org/w/index.php?title=File:Proposed_Agenda_for_Wikime…>,
namely: reaching out to organizations *already active* in developing
nations and *giving them Wikimedia content*.
Cheers,
Asaf Bartov
Wikimedia Israel
---------- Forwarded message ----------
From: Asaf Bartov <asaf.bartov(a)gmail.com>
Date: Wed, Sep 15, 2010 at 8:08 PM
Subject: 1M Entries of Free Knowledge Go To Africa
To: "Local Chapters, board and officers coordination (closed
subscription)" <internal-l(a)lists.wikimedia.org
<mailto:internal-l@lists.wikimedia.org>>
Hello, everyone.
Here's a rough translation of a press release Wikimedia Israel issued
yesterday, about the "Africa Center" project I mentioned in an e-mail to
this list on June 30th, 2010 (see that e-mail for more background on the
Africa Center at Ben Gurion University).
=== START OF PRESS RELEASE ===
TITLE: Wikipedia Goes To Africa -- Israeli Students to Leave for
Humanitarian Work in Africa, Equipped With Portable Static Wikipedia
SUBTITLE: Ben-Gurion University's "Africa Center", Wikimedia Israel, and
Hamakor cooperate in making free knowledge accessible in Africa
The Africa Center at BGU, headed by Dr. Tamar Golan, annually sends a
group of students on a three-month humanitarian expedition to developing
countries in Africa. This year's group is going to the Republic of
Benin and the Republic of Cameroon.
Learning about this while approaching the Africa Center for help with
developing Africa-related entries on the Hebrew Wikipedia, Wikimedia
Israel decided to equip the students with computers running free
software and containing an offline (static) version of the French
Wikipedia, so that the students can bring free knowledge to Africans
without access to the Internet.
Wikimedia Israel reached out to Hamakor, the Israeli Free and Open
Source Software NGO, and Hamakor helped obtain computer donations,
refurbished them and installed the Linux operating system on them.
Wikimedia Israel collaborated with members of Wikimedia Switzerland and
Wikimedia France to produce an up-to-date static version of the French
Wikipedia (numbering about 1 million entries, and including images),
French being a major language of reading and writing in Cameroon and Benin.
"The students also have portable installations of the offline Wikipedia,
so that they may install it on any other computers they may run across
in Africa," explained Asaf Bartov, who coordinated the project in
Wikimedia Israel, "and they have received training on using Linux and
Kiwix, the offline Wikipedia reader (free) software, so they may train
others to use the computers".
Incidentally, the Linux version installed on those computers is called
Ubuntu Linux, 'Ubuntu' being an African word (in the Zulu language)
roughly translatable as "unity of mankind" or "mutual reliance".
Supporting and promoting the distribution of free knowledge in
developing countries is one of the five major goals identified by the
Wikimedia Foundation as central to its five-year strategy plan,
developed by thousands of members of the Wikimedia Movement.
=== END OF PRESS RELEASE ===
I'd like to specifically express our gratitude to Emmanuel Engelhart,
chief developer of Kiwix, for all his help in getting the content ready
and working on a tight schedule.
We are very pleased with this project, and consider it a prime example
of seizing an opportunity for outreach that we never expected to come
our way.
As always, questions, comments, translations and relaying (of the press
release) to other lists/communities are welcome.
Asaf Bartov
Wikimedia Israel
--
Asaf Bartov <asaf.bartov(a)gmail.com>
Hi everyone, we need your help.
We are from Python Argentina, and we are working with educ.ar and the
Wikimedia Foundation on adapting our cdpedia project to make a DVD
holding the entire Spanish Wikipedia, which will soon be sent to
Argentinian schools.
Hernán and Diego are the two interns tasked with updating the data
that cdpedia uses to make the CD (it currently uses a static HTML dump
dated June 2008), but they are running into some problems while trying
to make an up-to-date static HTML es-wikipedia dump.
I'm CCing this list of people because I'm sure you've faced similar
issues when making your offline Wikipedias, or because you may know
someone who can help us.
Below is an email from Hernán describing the problems he has found.
Thanks!
--
alecu - Python Argentina
2010/4/30 Hernan Olivera <lholivera(a)gmail.com>:
Hi everybody,
I've been working on making an up-to-date static HTML dump of the
Spanish Wikipedia, to use as a basis for the DVD.
I've followed the procedure detailed in the pages below, which was used
to generate the current (and out-of-date) static HTML dumps (a rough
sketch of steps 2 and 3 follows the list):
1) installing and setting up a MediaWiki instance
2) importing the XML from [6] with mwdumper
3) exporting the static HTML with MediaWiki's static HTML export tool
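(A hypothetical sketch of how steps 2 and 3 might be driven from a script;
the dump file name follows [6], while the database name, user, paths and
DumpHTML options are placeholder assumptions:)

import subprocess

DUMP = "eswiki-20100331-pages-articles.xml.bz2"  # from [6]
WIKI = "/var/www/mediawiki"                      # local install from step 1

# Step 2: convert the XML dump to SQL with mwdumper and feed it to MySQL.
subprocess.check_call(
    "java -jar mwdumper.jar --format=sql:1.5 " + DUMP +
    " | mysql -u wikiuser -p wikidb",
    shell=True,
)

# Step 3: export static HTML with the DumpHTML extension's maintenance script.
subprocess.check_call(
    ["php", WIKI + "/extensions/DumpHTML/dumpHTML.php",
     "-d", "/srv/static-eswiki"],
)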
The procedure finishes without throwing any errors, but the XML import
produces malformed HTML pages that show visible wiki markup.
We really need a successful import of the Spanish XML dumps into a
MediaWiki instance so that we can produce the up-to-date static HTML
dump.
Links to the info I used:
[0] http://www.mediawiki.org/wiki/Manual:Installation_guide/es
[1] http://www.mediawiki.org/wiki/Manual:Running_MediaWiki_on_Ubuntu
[2] http://en.wikipedia.org/wiki/Wikipedia_database
[3] http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps
[4] http://meta.wikimedia.org/wiki/Importing_a_Wikipedia_database_dump_into_Med…
[5] http://meta.wikimedia.org/wiki/Data_dumps
[6] http://dumps.wikimedia.org/eswiki/20100331/
[7] http://www.mediawiki.org/wiki/Alternative_parsers
(among others)
Cheers,
--
Hernan Olivera
PS: Unfortunately, I didn't write down every step in detail, and I did a
lot more tests than what I wrote here. To make a detailed report I'd
like to go through the procedure again, writing down every option (and
checking whether I missed something). I'm finishing setting up a server
just for this, because these processes take forever and they blocked
other tasks while I was running these tests.
2009/10/23 Samuel Klein <meta.sj(a)gmail.com>:
> Jimbo - thanks for the spur to clean up the existing work.
>
> All - Let's start by cleaning up the mailing lists and setting a few
> short-term goals :-) It's a good sign that we have both charity and love
> converging to make something happen.
>
> * For all-platform all-purpose wikireaders, let's use
> offline-l(a)lists.wikimedia, as we discussed a month ago in the aftermath of
> Wikimania (Erik, were you going to set this up? I think we agreed to
> deprecate wiki-offline-reader-l and replace it with offline-l.)
>
> * For wikireaders such as WikiBrowse and Infoslicer on the XO, please
> continue to use wikireader(a)lists.laptop
>
>
> I would like to see WikiBrowse become the 'sugarized' version of a reader
> that combines the best of that and the openZim work. A standalone DVD or
> USB drive that comes with its own search tools would be another version of
> the same. As far as merging codebases goes, I don't think the WikiBrowse
> developers are invested in the name.
>
> I think we have a good first cut at selecting articles, weeding out stubs,
> and including thumbnail images. Maybe someone working on openZim can
> suggest how to merge the search processes, and that file format seems
> unambiguously better.
>
> Kul - perhaps part of the work you've been helping along for standalone
> usb-key snapshots would be useful here.
>
>
> Please continue to update this page with your thoughts and progress!
> http://meta.wikimedia.org/wiki/Offline_readers
>
> SJ
>
>
> 2009/10/23 Iris Fernández <irisfernandez(a)gmail.com>
>>
>> On Fri, Oct 23, 2009 at 1:37 PM, Jimmy Wales <jwales(a)wikia-inc.com> wrote:
>> >
>> > My dream is quite simple: a DVD that can be shipped to millions of
>> > people with an all-free-software solution for reading Wikipedia in Spanish.
>> > It should have a decent search solution, doesn't have to be perfect, but it
>> > should be full-text. It should be reasonably fast, but super-perfect is not
>> > a consideration.
>> >
>>
>> Hello! I am an educator, not a programmer. I can help select
>> articles or develop categories related to school topics.
>
> Iris - you know the main page of WikiBrowse that you see when the reader
> first loads? You could help with a new version of that page. Madeleine
> (copied here) worked on the first one, but your thoughts on improving it
> would be welcome.
>
>
>
Hi,
we all agreed that it was becoming a problem not to have more ZIM
files. IMO this problem should now slowly get fixed. I have recently
released new full Wikipedia ZIM files (as always, with thumbnails) in
Spanish and Portuguese:
* http://tmp.kiwix.org/zim/0.9/wikipedia_es_all_09_2010_beta1.zim
* http://tmp.kiwix.org/zim/0.9/wikipedia_pt_all_10_2010_alpha1.zim
Ten days ago I also released Kiwix 0.9 alpha6, fixing a few bugs and
introducing tab navigation. Here is the CHANGELOG:
http://kiwix.svn.sourceforge.net/viewvc/kiwix/moulinkiwix/CHANGELOG
Thanks to the latest libzim improvements, this version is also able to
deal with split ZIM files... which is "mandatory" for storing big ZIM
files on a FAT32-formatted USB key. I can now automatically create a
portable version of Kiwix for Windows from any ZIM file. Simply unpack
the ZIP file on your USB key and it works: everything necessary is there
(autorun, ZIM file, search index, installer, source code, packages, ...)
http://tmp.kiwix.org/portable/
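As a side note, a minimal sketch of the splitting itself (not libzim/Kiwix
code): FAT32 caps files at 4 GiB, so a big ZIM file has to be cut into
smaller chunks; the .zimaa/.zimab suffixes follow the usual split-ZIM naming
convention, and the chunk size is an arbitrary choice:

import string

CHUNK = 2 * 1024 ** 3  # 2 GiB, comfortably under the FAT32 4 GiB limit

def split_zim(path):
    suffixes = (a + b for a in string.ascii_lowercase
                      for b in string.ascii_lowercase)
    with open(path, "rb") as src:
        for suffix in suffixes:
            data = src.read(CHUNK)
            if not data:
                break
            # e.g. foo.zim -> foo.zimaa, foo.zimab, ...
            with open(path + suffix, "wb") as dst:
                dst.write(data)

# split_zim("wikipedia_es_all_09_2010_beta1.zim")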
Stay informed about the next steps:
http://identi.ca/group/kiwix
Emmanuel
Hello all,
We have been working on porting Okawix ( http://okawix.com ) to the Android platform. So far we have included support for Zeno files with both the gzip and bz2 compression formats.
Now we want to add ZIM file support, but we're stuck on the compression format: ZIM files are generated using the LZMA2 format, which is not easily available on the Android / Java platform. On the other hand, the openZIM website lists gzip and bz2 as possible compression formats for ZIM files. Would it be possible to generate a ZIM file with one of those compression formats? Or is the documentation on the website out of date?
We only have two weeks left to port Okawix to the Android platform, and if we can't get ZIM support working within that time, we will have to release Okawix with Zeno file support only.
Thank you for your help in developing an open-source offline reader with ZIM support for Android / Java.
Sincerely
Pascal Martin
Dear all,
Let me welcome Heiko from PediaPress to the team. We were in touch with
him during the Wikimedia Conference in April this year, when we started
to think about adding openZIM support to the Collection extension of
MediaWiki. One of the drivers was Asaf, while the Collection extension
itself is developed and maintained by PediaPress.
Currently PediaPress is looking seriously into the actual implementation
of openZIM support. To start with, there are now some questions about
how best to do it, etc.; as far as I understood, a test environment is
already available.
So if you have time, please meet Heiko tonight in our IRC channel on
Freenode, #openzim.
Thanks for your support,
/Manuel
--
Regards
Manuel Schneider
Wikimedia CH - Verein zur Förderung Freien Wissens
Wikimedia CH - Association for the advancement of free knowledge
www.wikimedia.ch