Dear fellows - I would like to host offline a minimal set of wiki pages (from Wikipedias, Wikispore, Commons) and a minimal set of media (images, video, and audio) that all branch out from a single Reasonator (or Portal.toolforge.org) search... and that keep global links beyond 2 degrees (so that one can use them to continue online).
For now I am thinking of just downloading the pages and hosting them as static mobile wiki pages, using wget with manual corrections.
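To make that concrete, something like this small Python sketch is what I have in mind as a scriptable alternative to raw wget (untested; the titles, wiki, and output directory are just placeholders):

    import pathlib
    import requests

    # Untested sketch: fetch the rendered HTML of a few articles from
    # Wikimedia's public REST API and save them as static local files.
    TITLES = ["Douglas_Adams", "Towel_Day"]          # placeholder titles
    OUT = pathlib.Path("offline-site")
    OUT.mkdir(exist_ok=True)

    for title in TITLES:
        url = f"https://en.wikipedia.org/api/rest_v1/page/html/{title}"
        resp = requests.get(url, headers={"User-Agent": "offline-kit-sketch/0.1"})
        resp.raise_for_status()
        (OUT / f"{title}.html").write_text(resp.text, encoding="utf-8")

The links inside the saved HTML would still need rewriting for local browsing - that is the manual-corrections part.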
I would love to do this for monuments and tiny libraries like this one: https://w.wiki/5QWn - so maybe have a library-like single-file wiki (like the https://tiddlywiki.com system) that users could connect to and leave notices on.
Does anyone have an idea of whether and how to do this better and package it in the most elegant way? Anyone interested in collaborating?
Best Z. Blace
Sounds like a lovely idea. Mapping to TiddlyWiki would be very interesting.
What are the Kiwix tools for compilation these days? WikiBrowse (http://wiki.sugarlabs.org/go/Activities/Wikipedia/HowTo) used to do something similar.
OK, this sounds a lot like our Wikipedia-on-Demand project, i.e. enter a list of articles on one end and get the corresponding ZIM file on the other end. Release is planned for this fall, so we'll make sure to put out a call so people can test it.
Doing something that works cross-wiki, on the other hand, is another ball game entirely and not on our roadmap at the moment (but make no mistake, this would be a very cool feature).
SCM
Given a wiki and a list of article titles, we can make an offline snapshot easily. To me the question is: what kind of selections? Based on which approach? Using which data exactly?... and then probably finding a way to apply it and get the list of article titles.
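For instance, if the selection rule were simply "these seed pages plus everything they link to, N hops deep", getting the title list is a small script. A minimal, untested Python sketch against the standard MediaWiki Action API (it ignores API continuation, so it only sees the first batch of links per page):

    import requests

    API = "https://en.wikipedia.org/w/api.php"   # any MediaWiki Action API

    def linked_titles(title):
        """Main-namespace titles linked from one page (first batch only)."""
        params = {"action": "query", "prop": "links", "titles": title,
                  "plnamespace": 0, "pllimit": "max", "format": "json"}
        pages = requests.get(API, params=params).json()["query"]["pages"]
        return [l["title"] for p in pages.values() for l in p.get("links", [])]

    def selection(seeds, depth=1):
        """The seed titles plus everything reachable in `depth` link hops."""
        chosen, frontier = set(seeds), list(seeds)
        for _ in range(depth):
            frontier = [t for f in frontier for t in linked_titles(f)
                        if t not in chosen]
            chosen.update(frontier)
        return sorted(chosen)

    print(selection(["Douglas Adams"], depth=1))

Real selections will of course be subtler than "N link hops", which is exactly the open question.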
First of all, thank you all for your fast and useful inputs!
Pardon me for answering everything together in one compiled reply, with my comments inline below yours...
On Thu, Jul 7, 2022 at 9:32 AM Emmanuel Engelhart kelson@kiwix.org wrote:
Given a wiki and a list of article titles, we can make an offline snapshot easily. To me the question is: what kind of selections? Based on which approach? Using which data exactly?... and then probably finding a way to apply it and get the list of article titles.
I can easily say that for a well-covered monument it would be at least:
- Wikipedia pages on the monument, its author, its location, and its time of making (ideally in at least 2 languages, possibly in two styles, like EN & Simple)
- a Commons page as a gallery of related images, audio, video... a 3D model? (ideally in both higher and lower resolutions)
- a Wikispore page with non-encyclopedic info on the location and/or its contexts (ideally with artistic, social, and geographic focuses)
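Much of that selection could probably be derived mechanically from the monument's Wikidata item. Here is a rough, untested Python sketch of the idea; the property choices (P170 creator, P276 location) are my assumption of a typical data model, and real items will vary:

    import requests

    WDAPI = "https://www.wikidata.org/w/api.php"

    def related_items(qid):
        """The monument item plus its creator and location items (assumed model)."""
        r = requests.get(WDAPI, params={"action": "wbgetentities",
                                        "ids": qid, "format": "json"})
        claims = r.json()["entities"][qid]["claims"]
        related = {qid}
        for prop in ("P170", "P276"):                # creator, location
            for claim in claims.get(prop, []):
                snak = claim["mainsnak"]
                if snak["snaktype"] == "value":      # skip novalue/somevalue
                    related.add(snak["datavalue"]["value"]["id"])
        return related

    def article_titles(qid, wikis=("enwiki", "simplewiki")):
        """Article titles for one item on the chosen language editions."""
        r = requests.get(WDAPI, params={"action": "wbgetentities", "ids": qid,
                                        "props": "sitelinks", "format": "json"})
        links = r.json()["entities"][qid]["sitelinks"]
        return {w: links[w]["title"] for w in wikis if w in links}

The titles collected this way would then feed straight into whatever download and packaging step we settle on.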
On Wed, Jul 6, 2022 at 5:16 PM Stephane Coillet-Matillon stephane@kiwix.org wrote:
OK, this sounds a lot like our Wikipedia-on-Demand project,
What is the best way to follow it? Is there a public repository?
i.e. enter a list of articles on one end and get the corresponding ZIM file on the other end. Release is planned for this fall, so we'll make sure to put out a call so people can test it.
Great! Happy to test.
Doing something that works cross-wiki, on the other hand, is another ball game entirely and not on our roadmap at the moment (but make no mistake, this would be a very cool feature).
Nice that you think so... happy to brainstorm more if there is a good context for this.
On 6 Jul 2022, at 15:01, Samuel Klein meta.sj@gmail.com wrote:
Sounds like a lovely idea.
Thank you!
Mapping to TiddlyWiki would be very interesting.
Yes, I think we need to consider other ways of wiki-making. Maybe fed.wiki.org could also be interesting to experiment with.
What are the Kiwix tools for compilation these days? WikiBrowse (http://wiki.sugarlabs.org/go/Activities/Wikipedia/HowTo) used to do something similar.
Will look into it!
On 07.07.22 22:25, Željko Blaće wrote:
I can easily say that for a well-covered monument it would be at least:
- Wikipedia pages on the monument, its author, its location, and its time of making (ideally in at least 2 languages, possibly in two styles, like EN & Simple)
- a Commons page as a gallery of related images, audio, video... a 3D model? (ideally in both higher and lower resolutions)
- a Wikispore page with non-encyclopedic info on the location and/or its contexts (ideally with artistic, social, and geographic focuses)
We cannot, for now, create one snapshot with articles coming from multiple MediaWikis. This is an open challenge.
On Wed, Jul 6, 2022 at 5:16 PM Stephane Coillet-Matillon stephane@kiwix.org wrote:
OK, this sounds a lot like our Wikipedia-on-Demand project,
What is the best way to follow it? Is there a public repository?
Project page: https://meta.wikimedia.org/wiki/Kiwix/Wikipedia_on_demand.
Most of the work will happen (at least in a first stage) at https://github.com/openzim/wp1
Doing something that works cross-wiki, on the other hand, is another ball game entirely and not on our roadmap at the moment (but make no mistake, this would be a very cool feature).
Nice that you think so... happy to brainstorm more if there is a good context for this.
If you are ready to invest a bit of time in this, I can send you an invitation to our Slack and put you in touch with the few people involved.

Kelson
On Thursday, July 14, 2022, Emmanuel Engelhart kelson@kiwix.org wrote:
If you are ready to invest a bit of time in this, I can send you an invitation to our Slack and put you in touch with the few people involved.
OK, please do. I dislike and avoid Slack, but I can visit it on the web... I am finishing some projects soon, so I can contribute more in the autumn, but I can likely lurk and start picking things up before then.
Best Z.
Yes, it sounds like the missing link here is a tool for creating the list of resources to take offline. Z made one particular specification, but I suppose it could be made a little more general, and it could potentially even leverage some existing general-purpose page-set curation tools, such as PetScan. AFAIK, PetScan currently doesn't support the use case of "get me N levels of pages linked from this first page (or from these P first pages)", but we can imagine (and advocate for) PetScan supporting it at some point.
Then, once a PetScan query ID (which is enough to regenerate the page set) can be given as input to the Wikipedia-on-Demand tool, the problem is solved. (Well, almost: we'd still need to specify the logic for collecting related resources, i.e. none, some, or all of the images included in the pages, Wikidata items, etc.)
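For illustration, pulling the page set behind a stored PetScan query is already a one-request job. In this Python sketch the psid is a placeholder, and the path into the JSON is how I recall PetScan's output being shaped, so verify it against a live query:

    import requests

    # Placeholder psid; the result path below matches PetScan's JSON
    # output as I recall it and should be double-checked.
    resp = requests.get("https://petscan.wmflabs.org/",
                        params={"psid": 12345, "format": "json", "doit": 1})
    pages = resp.json()["*"][0]["a"]["*"]
    titles = [page["title"] for page in pages]
    print(len(titles), "pages in the set")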
A.
Asaf Bartov (he/him/his)
Senior Program Officer, Emerging Wikimedia Communities
Wikimedia Foundation https://wikimediafoundation.org/
Imagine a world in which every single human being can freely share in the sum of all knowledge. Help us make it a reality! https://donate.wikimedia.org
On 11.07.22 22:37, Asaf Bartov wrote:
At a high level, this is indeed the kind of challenge we face now. This lack of tooling was identified at Kiwix a long time ago. I believe we are now ready to move forward on it because the underlying software pieces are ready.
The overall strategy is to extend wp1.openzim.org (its API) to allow the implementation of sophisticated selection modules. What these modules will look like in the end is still very much open, and all ideas are welcome (please open tickets at https://github.com/openzim/wp1). Collaborating with, or relying on, PetScan is an idea that should be assessed.
Once a selection is done, our Zimfarm infrastructure is ready to build the snapshots (ZIM files). We will probably have to build (a) dedicated frontend(s) to bring these two tools together in a user-friendly manner.
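To give a feel for what a selection module might amount to, here is a purely hypothetical Python sketch. It is not wp1's actual API, just the shape of the contract: parameters in, article titles out, ZIM building left to the Zimfarm.

    from typing import Protocol

    class SelectionModule(Protocol):
        """Hypothetical contract, not wp1's real interface."""

        name: str

        def select(self, project: str, params: dict) -> list[str]:
            """Return article titles for `project`, e.g. 'en.wikipedia.org'.

            The resulting list would then be handed to the Zimfarm to be
            packaged as a ZIM file.
            """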
This is the goal of the Wikipedia-on-Demand project (funded by a WMCH grant), https://meta.wikimedia.org/wiki/Kiwix/Wikipedia_on_demand, which we have started to work on. We have another project in the pipeline (related to the war in Ukraine) which should extend this tool even further.
Kelson