http://bugs.openzim.org/show_bug.cgi?id=18
Summary: debian needs an init.d script
Product: openZIM
Version: unspecified
Platform: PC
OS/Version: Linux
Status: NEW
Severity: enhancement
Priority: P5
Component: zimreader
AssignedTo: tommi(a)tntnet.org
ReportedBy: andyr(a)wizzy.com
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
An init.d script is attached.
Hi Ted,
Asaf has forwarded me your mail to internal; I will answer it here
because I think it belongs on this list.
> ---------- Forwarded message ----------
> From: *Ted Chien* <hsiangtai.chien(a)gmail.com
> <mailto:hsiangtai.chien@gmail.com>>
> Date: Wed, Sep 29, 2010 at 8:54 PM
> Just read through the OpenZIM wiki and wondering what I can do as an
> mobile developer....
openZIM actually provides two things:
* a standardized file format to store compressed wikis (and other web
data) -> ZIM
* an open-source implementation of ZIM, currently available in C++ from
our SVN
-> see http://openzim.org/Mission_of_openZIM
If you are a mobile developer you could be interested in developing a
ZIM reader on your favourite platform, e.g. the iPhone or Android.
-> see also http://openzim.org/Google_Sommer_of_Code_2010
Since libzim is written in C++ and gives you an interface to retrieve
contents from ZIM files easily, you just need to create a GUI that lets
a user select (open) a ZIM file, a search window and an HTML viewer
window which actually displays the selected content.
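To give you an idea, here is a minimal (untested) sketch of such a
reader core using zimlib - the exact API may differ between zimlib
versions, and the file name and article URL are just placeholders:

    #include <iostream>
    #include <zim/file.h>
    #include <zim/article.h>
    #include <zim/blob.h>

    int main()
    {
        // Open a ZIM file selected by the user (path is a placeholder).
        zim::File file("wikipedia.zim");

        // Look up an article by namespace ('A' = articles) and URL.
        zim::Article article = file.getArticle('A', "Main_Page");
        if (!article.good())
        {
            std::cerr << "article not found" << std::endl;
            return 1;
        }

        // Fetch the decompressed content; a real reader would hand
        // this to its HTML viewer instead of printing it.
        zim::Blob blob = article.getData();
        std::cout.write(blob.data(), blob.size());
        return 0;
    }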
There is a ZIM reader on Symbian, WikiOnBoard:
-> see http://github.com/cip/WikiOnBoard
It uses our zimlib and Qt for the GUI. It works very well on my Nokia E71.
We are planning to have a Developers Meeting in Haifa next year, during
the Hacking Days prior to Wikimania. It is not decided yet as we have to
assign most of our yearly budget to it, but we will take the decision at
our next Developers Meeting in October and I am confident that the team
will agree.
This would be a good opportunity to bring more technically minded people
together to actually get things done.
Another work in progress is the effort on Extension:Collection to
integrate a ZIM export there. This would make the process of creating
ZIM files much easier.
If you'd like to discuss ZIM-related technical issues I recommend doing
so on dev-l(a)openzim.org, while general offline discussions should
happen here.
Regards,
Manuel
--
Regards
Manuel Schneider
Wikimedia CH - Verein zur Förderung Freien Wissens
Wikimedia CH - Association for the advancement of free knowledge
www.wikimedia.ch
Hi,
thank you for voting on the location and date of the next openZIM dev
meeting. Considering your votes, it will take place in Strasbourg
(France) in around one month, on the 16/17 September weekend.
I now have to reserve the hotel and seminar rooms ASAP. I will be there
in person from Friday afternoon to Sunday evening. It is essential for
me to know whether you will come on Friday evening or on Saturday
morning.
Please fill in the following table by next Tuesday 4 PM (21 September)
with the necessary information:
http://www.openzim.org/Developer_Meetings/2010-2#Participants
This is important! I will use the information in this table to prepare
for your arrival.
Regards
Emmanuel
Hi,
I need Kiwix to be able to deal with split ZIM files, as discussed in
June. Tommi seems to have implemented that in the zimlib, but I cannot
get it to work.
My use case is an 11 GB file split into 3 smaller files called xaa,
xab and xac. I have moved all of them to /tmp/.
But new zim::File(zimFilePath) returns the following error:
error 2 opening file "/tmp/xaa:/tmp/xab:/tmp/xac": Aucun fichier ou
dossier de ce type (No such file or directory)
I also called zimdump -F "/tmp/xaa:/tmp/xab:/tmp/xac" and got the same
error.
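For reference, the failing call looks roughly like this (the
colon-separated path is my guess at the intended syntax - that may
well be my mistake):

    #include <zim/file.h>

    // Open the three parts as one logical 11 GB ZIM file.
    // This fails with error 2 (ENOENT) as quoted above.
    zim::File* zimFile = new zim::File("/tmp/xaa:/tmp/xab:/tmp/xac");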
So Christian or Tommi, how should I do this?
Thanks in advance
Emmanuel
http://bugs.openzim.org/show_bug.cgi?id=24
Summary: Zimdump -d DIRECTORY should be able to create the
DIRECTORY if necessary
Product: openZIM
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: trivial
Priority: P5
Component: zimlib
AssignedTo: tommi(a)tntnet.org
ReportedBy: emmanuel(a)engelhart.org
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
zimdump -D test file.zim
error writing file test/A/#ubuntu-fr-meeting
If I create the test directory it works.
IMO this should not trigger an error; zimdump should simply create the
directory if necessary, as sketched below.
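A minimal sketch of what I have in mind, assuming POSIX mkdir()
(makeDirs is a hypothetical helper, not existing zimlib code):

    #include <string>
    #include <cerrno>
    #include <sys/stat.h>
    #include <sys/types.h>

    // Create the directory and any missing parents ("test", then
    // "test/A" in the example above); directories that already
    // exist are not treated as an error.
    static bool makeDirs(const std::string& path)
    {
        std::string::size_type pos = 0;
        while (pos != std::string::npos)
        {
            pos = path.find('/', pos + 1);
            std::string prefix = path.substr(0, pos);
            if (mkdir(prefix.c_str(), 0777) != 0 && errno != EEXIST)
                return false;
        }
        return true;
    }

zimdump would just call makeDirs() on the directory part of each output
path before opening the file.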
http://bugs.openzim.org/show_bug.cgi?id=21
Summary: Zimlib should allow to get/unpack only a part of a
content
Product: openZIM
Version: unspecified
Platform: PC
OS/Version: Windows
Status: NEW
Severity: enhancement
Priority: P5
Component: zimlib
AssignedTo: tommi(a)tntnet.org
ReportedBy: emmanuel(a)engelhart.org
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
The zimlib needs to fully unpack a piece of content before delivering it
to third-party software.
This has many disadvantages, especially if the content is big, a video
for example:
* It needs a lot of memory
* It takes time
* You do not have random access (necessary to seek in an HTML5 video)
It would be a really good usability improvement to have a method which delivers
only a part of any content.
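To illustrate, such a method could look like this (getDataPart is
hypothetical, nothing like it exists in zimlib today):

    // Hypothetical addition to zim::Article: return only `size` bytes
    // of the content starting at `offset`, so that only the clusters
    // covering that range need to be decompressed.
    zim::Blob getDataPart(zim::size_type offset, zim::size_type size) const;

    // Example: answering an HTTP Range request to seek in an HTML5
    // video without unpacking the whole content first.
    zim::Blob chunk = article.getDataPart(1024 * 1024, 64 * 1024);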
http://bugs.openzim.org/show_bug.cgi?id=25
Summary: Broken post increment and decrement (patch included)
Product: openZIM
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P5
Component: zimlib
AssignedTo: tommi(a)tntnet.org
ReportedBy: guillaume.duhamel(a)gmail.com
CC: dev-l(a)openzim.org
Estimated Hours: 0.0
Created an attachment (id=8)
--> (http://bugs.openzim.org/attachment.cgi?id=8)
Patch to fix post inc / dec
The current post increment and decrement operators in fileiterator.h do
nothing: they copy *this into a local object, increment/decrement the
local copy, and return the unmodified *this.
Here's a patch that fixes the problem by incrementing/decrementing *this
and returning the local copy (which is the expected behavior).
The patch also fixes the post-decrement signature.
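For reference, the canonical C++ idiom that the patch restores looks
like this (simplified; the real iterator in fileiterator.h walks over
the articles of a ZIM file):

    class ArticleIterator
    {
        unsigned pos;
    public:
        // Pre-increment: advance *this and return the advanced iterator.
        ArticleIterator& operator++()
        {
            ++pos;
            return *this;
        }

        // Post-increment: copy *this, advance *this, return the old
        // copy. The dummy int parameter distinguishes the signature.
        ArticleIterator operator++(int)
        {
            ArticleIterator tmp(*this);
            ++(*this);
            return tmp;
        }
    };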
Well, AFAIK PediaPress, openZIM and a few others have started working on
enhancing the Extension:Collection to create ZIM files; ZIM is
essentially a special compressed HTML format.
We had a Skype conference two weeks ago, but I am not in the loop on
what has happened since then. The last status I have is that Tommi from
openZIM was going to fix the zimwriter interfaces so that the filesource
plugin can be used for this.
On 24.09.2010 04:27, Q wrote:
>> Given the fact that static dumps have been broken for *years* now,
>> static dumps are on the bottom of WMFs priority list; I thought it
>> would be the best if I just went ahead and built something that can be
>> used (and, of course, improved).
>>
>> Marco
>
> That's what I just said. Work with them to fix it, i.e. volunteer;
> i.e. you fix it.
>
--
Regards
Manuel Schneider
Wikimedia CH - Verein zur Förderung Freien Wissens
Wikimedia CH - Association for the advancement of free knowledge
www.wikimedia.ch
Emmanuel,
maybe you can help him with a ZIM file?
/Manuel
-------- Original Message --------
Subject: [Toolserver-l] Static dump of German Wikipedia
Date: Fri, 24 Sep 2010 01:57:55 +0200
From: Marco Schuster <marco(a)harddisk.is-a-geek.org>
Reply-To: toolserver-l(a)lists.wikimedia.org
To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>,
toolserver-l <toolserver-l(a)lists.wikimedia.org>
Hi all,
I have made a list of all the 1.9M articles in NS0 (including
redirects / short pages) using the Toolserver; now that I have the
list, I'm going to download every single one of 'em and then publish a
static dump of it. (After the trial period tonight, I want to see how
this works out; I'd like to begin downloading the whole thing in 3 or 4
days, if no one objects.) Data collection will be on the Toolserver
(/mnt/user-store/dewiki-static/articles/); the request rate will be 1
article per second and I'll download the new files once or twice a day
to my home PC, so there should be no problem with the TS or Wikimedia
server load.
When this is finished in ~21-22 days, I'm going to compress the
articles and upload them to my private server (well, if Wikimedia has
an archive server, that'd be better) as a tgz file so others can play
with them.
Furthermore, though I have no idea if I'll succeed, I plan on hacking
up a static Vector skin file which will load the articles using
jQuery's excellent .load() feature, so that everyone with JS can enjoy
a truly offline Wikipedia.
Marco
PS: When trying to invoke /w/index.php?action=render with an invalid
oldid, the server returns HTTP/1.1 200 OK and an error message;
shouldn't this be a 404 or 500?
--
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
Someone created a MediaWiki parser written in C - please see the mail below.
Greetings from Linux-Kongress in Nürnberg,
/Manuel
Sent via mobile phone.
-- Original message --
Subject: [Wikitech-l] Parser implementation for MediaWiki syntax
From: Andreas Jonsson <andreas.jonsson(a)kreablo.se>
Date: 23.09.2010 11:28
Hi,
I have written a parser for MediaWiki syntax and have set up a test
site for it here:
http://libmwparser.kreablo.se/index.php/Libmwparsertest
and the source code is available here:
http://svn.wikimedia.org/svnroot/mediawiki/trunk/parsers/libmwparser
A preprocessor will take care of parser functions, magic words,
comment removal, and transclusion. But as it wasn't possible to
cleanly separate these functions from the existing preprocessor, some
preprocessing is disabled at the test site. It should be
straightforward to write a new preprocessor that provides only the required
functionality, however.
The parser is not feature complete, but the hard parts are solved. I
consider "the hard parts" to be:
* parsing apostrophes
* parsing html mixed with wikitext
* parsing headings and links
* parsing image links
And when I say "solved" I mean producing the same or equivalent output
as the original parser, as long as the behavior of the original parser
is well defined and produces valid HTML.
Here is a schematic overview of the design:
+-----------------------+
|                       |  Wikitext
|  client application   +---------------------------------------+
|                       |                                       |
+-----------------------+                                       |
           ^                                                    |
           | Event stream                                       |
+----------+------------+        +-------------------------+    |
|                       |        |                         |    |
|    parser context     |<------>|         Parser          |    |
|                       |        |                         |    |
+-----------------------+        +-------------------------+    |
                                              ^                 |
                                              | Token stream    |
+-----------------------+        +------------+------------+    |
|                       |        |                         |    |
|     lexer context     |<------>|          Lexer          |<---+
|                       |        |                         |
+-----------------------+        +-------------------------+
The design is described in more detail in a series of posts on the
wikitext-l mailing list. The most important "trick" is to make sure
that the lexer never produces a spurious token. An end token for a
production will not appear unless the corresponding begin token has
already been produced, and the lexer maintains a block context to
only produce tokens that make sense in the current block.
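A toy illustration of that invariant for bold markup, written as a
hand-coded C++ lexer fragment (the real implementation is an Antlr
grammar using semantic predicates):

    #include <stack>
    #include <vector>

    enum TokenType { TEXT, BOLD_BEGIN, BOLD_END };

    // An apostrophe run closes bold only if a BOLD_BEGIN is already
    // open; otherwise it opens one. A spurious, unmatched BOLD_END
    // can therefore never reach the parser.
    void onBoldMarkup(std::stack<TokenType>& open,
                      std::vector<TokenType>& out)
    {
        if (!open.empty() && open.top() == BOLD_BEGIN)
        {
            out.push_back(BOLD_END);
            open.pop();
        }
        else
        {
            out.push_back(BOLD_BEGIN);
            open.push(BOLD_BEGIN);
        }
    }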
I have used Antlr for generating both the parser and the lexer, as
Antlr supports semantic predicates that can be used for
context-sensitive parsing. I am also using a slightly patched version
of Antlr's C runtime environment, because the lexer needs to support
speculative execution in order to do context-sensitive lookahead.
A Swig-generated interface is used to provide the PHP API. The parser
processes the buffer of the PHP string directly, and writes its output
to an array of PHP strings. Only UTF-8 is supported at the moment.
The performance seems to be about the same as the original parser's on
plain text. But with an increasing amount of markup, the original
parser runs slower, while this new implementation maintains roughly
the same performance regardless of input.
I think that this demonstrates the feasibility of replacing the
MediaWiki parser. There is still a lot of work to do in order to turn
it into a full replacement, however.
Best regards,
Andreas