offline wikipages within 2 degrees of separation

List overview All Threads
Download

newer

older

Fwd: Wikimedia Summit 2022 online...

Fwd: Ask questions for the 2022...

Željko Blaće

6 Jul 2022 6 Jul '22

9:50 a.m.

Dear fellows - I would like to offline host minimal set of Wiki pages (Wikipedias, Wikispore, Commons) and minimal set of media (images, video and audio) that all branch out from single Reasonator (or Portal.toolforge.org) search... ...and that keep global links beyond 2 degrees (so that one can use them to continue online).

For now I think of just downloading pages offline and hosting them as mobile wiki-to-static pages using wget with manual corrections.

I would love to do this for monuments and tiny libraries like this https://w.wiki/5QWn so maybe have library-like single-file wiki (like https://tiddlywiki.com system) so that users could connect to it and live notices.

Anyone has idea if and how to do this better and package in the most elegant way? Anyone interested in collaborating?

Best Z. Blace

Attachments:

attachment.htm (text/html — 1.2 KB)

Show replies by date

Samuel Klein

6 Jul 6 Jul

1:01 p.m.

Sounds like a lovely idea. Mapping to tiddlywiki would be very interesting.

What are the kiwix tools for compilation these days? WikiBrowse http://wiki.sugarlabs.org/go/Activities/Wikipedia/HowTo used to do something similar.

On Wed, Jul 6, 2022 at 6:09 AM Željko Blaće zblace@mi2.hr wrote:

...

Dear fellows - I would like to offline host minimal set of Wiki pages (Wikipedias, Wikispore, Commons) and minimal set of media (images, video and audio) that all branch out from single Reasonator (or Portal.toolforge.org) search... ...and that keep global links beyond 2 degrees (so that one can use them to continue online).

For now I think of just downloading pages offline and hosting them as mobile wiki-to-static pages using wget with manual corrections.

I would love to do this for monuments and tiny libraries like this https://w.wiki/5QWn so maybe have library-like single-file wiki (like https://tiddlywiki.com system) so that users could connect to it and live notices.

Anyone has idea if and how to do this better and package in the most elegant way? Anyone interested in collaborating?

Best Z. Blace

Offline-l mailing list -- offline-l@lists.wikimedia.org To unsubscribe send an email to offline-l-leave@lists.wikimedia.org

-- Samuel Klein @metasj w:user:sj +1 617 529 4266

Stephane Coillet-Matillon

3:15 p.m.

Ok this sounds a lot like our Wikipedia-on-Demand project, ie enter a list of articles on one end and get the corresponding zim file on the other end. Release is planned for this Fall, so we’ll make sure to put out a call so people can test it.

Doing something that works cross-wiki, on the other hand, is another ball game entirely and not on our roadmap at the moment (but make no mistake, this would be a very cool feature).

SCM

...

Le 6 juil. 2022 à 15:01, Samuel Klein meta.sj@gmail.com a écrit :

Sounds like a lovely idea. Mapping to tiddlywiki would be very interesting.

What are the kiwix tools for compilation these days? WikiBrowse http://wiki.sugarlabs.org/go/Activities/Wikipedia/HowTo used to do something similar.

On Wed, Jul 6, 2022 at 6:09 AM Željko Blaće <zblace@mi2.hr mailto:zblace@mi2.hr> wrote: Dear fellows - I would like to offline host minimal set of Wiki pages (Wikipedias, Wikispore, Commons) and minimal set of media (images, video and audio) that all branch out from single Reasonator (or Portal.toolforge.org http://portal.toolforge.org/) search... ...and that keep global links beyond 2 degrees (so that one can use them to continue online).

For now I think of just downloading pages offline and hosting them as mobile wiki-to-static pages using wget with manual corrections.

I would love to do this for monuments and tiny libraries like this https://w.wiki/5QWn https://w.wiki/5QWn so maybe have library-like single-file wiki (like https://tiddlywiki.com https://tiddlywiki.com/ system) so that users could connect to it and live notices.

Anyone has idea if and how to do this better and package in the most elegant way? Anyone interested in collaborating?

Best Z. Blace

Offline-l mailing list -- offline-l@lists.wikimedia.org mailto:offline-l@lists.wikimedia.org To unsubscribe send an email to offline-l-leave@lists.wikimedia.org mailto:offline-l-leave@lists.wikimedia.org

-- Samuel Klein @metasj w:user:sj +1 617 529 4266 _______________________________________________ Offline-l mailing list -- offline-l@lists.wikimedia.org To unsubscribe send an email to offline-l-leave@lists.wikimedia.org

Emmanuel Engelhart

7 Jul 7 Jul

7:32 a.m.

Given a wiki and a list of article titles, we can make an offline snapshot easily. To me the question is: What kind of selections? Based on which approach? Using which data exactly?... and then probably find to apply it and get the list of article titles.

On 06.07.22 11:50, Željko Blaće wrote:

...

Dear fellows - I would like to offline host minimal set of Wiki pages (Wikipedias, Wikispore, Commons) and minimal set of media (images, video and audio) that all branch out from single Reasonator (or Portal.toolforge.org http://Portal.toolforge.org) search... ...and that keep global links beyond 2 degrees (so that one can use them to continue online).

For now I think of just downloading pages offline and hosting them as mobile wiki-to-static pages using wget with manual corrections.

I would love to do this for monuments and tiny libraries like this https://w.wiki/5QWn https://w.wiki/5QWn so maybe have library-like single-file wiki (like https://tiddlywiki.com https://tiddlywiki.com system) so that users could connect to it and live notices.

Anyone has idea if and how to do this better and package in the most elegant way? Anyone interested in collaborating?

Best Z. Blace

Offline-l mailing list -- offline-l@lists.wikimedia.org To unsubscribe send an email to offline-l-leave@lists.wikimedia.org

-- Kiwix - Wikipedia Offline & more * Web: https://kiwix.org/ * Twitter: https://twitter.com/KiwixOffline * Wiki: https://wiki.kiwix.org/

Željko Blaće

8:25 p.m.

Firstly, thank you all for your fast and useful inputs!

Pardon me for answering all together in a compiled way, with my comments below and inline yours...

On Thu, Jul 7, 2022 at 9:32 AM Emmanuel Engelhart kelson@kiwix.org wrote:

...

Given a wiki and a list of article titles, we can make an offline snapshot easily. To me the question is: What kind of selections? Based on which approach? Using which data exactly?... and then probably find to apply it and get the list of article titles.

I can easily say for a well covered monument it would be at least: - Wikipedia pages on the Monument, Author, Location, Time of making (ideally in at least 2 languages, possibly in two styles like EN & Simple) - Commons page as a Gallery of related images, audio, video...3D model? (ideally in higher and lower resolutions) - Wikispore page with non-encyclopedic info on location and/or contexts (ideally artistic, social and geographic focuses)

On Wed, Jul 6, 2022 at 5:16 PM Stephane Coillet-Matillon stephane@kiwix.org wrote:

...

Ok this sounds a lot like our Wikipedia-on-Demand project,

What is the way to follow it? Public repository?

...

ie enter a list of articles on one end and get the corresponding zim file on the other end. Release is planned for this Fall, so we’ll make sure to put out a call so people can test it.

Great! Happy to test.

...

Doing something that works cross-wiki, on the other hand, is another ball game entirely and not on our roadmap at the moment (but make no mistake, this would be a very cool feature).

Nice that you think so...happy to brainstorm more if there is good context for this.

...

SCM

Le 6 juil. 2022 à 15:01, Samuel Klein meta.sj@gmail.com a écrit :

Sounds like a lovely idea.

Thank you!

...

Mapping to tiddlywiki would be very interesting.

Yes I think we need to consider other ways of wiki making. Maybe fed.wiki.org could also be interesting to experiment with.

...

What are the kiwix tools for compilation these days? WikiBrowse http://wiki.sugarlabs.org/go/Activities/Wikipedia/HowTo used to do something similar.

Will look into it!

...

On Wed, Jul 6, 2022 at 6:09 AM Željko Blaće zblace@mi2.hr wrote:

...
Dear fellows - I would like to offline host minimal set of Wiki pages (Wikipedias, Wikispore, Commons) and minimal set of media (images, video and audio) that all branch out from single Reasonator (or Portal.toolforge.org http://portal.toolforge.org/) search... ...and that keep global links beyond 2 degrees (so that one can use them to continue online).

For now I think of just downloading pages offline and hosting them as mobile wiki-to-static pages using wget with manual corrections.

I would love to do this for monuments and tiny libraries like this https://w.wiki/5QWn so maybe have library-like single-file wiki (like https://tiddlywiki.com system) so that users could connect to it and live notices.

Anyone has idea if and how to do this better and package in the most elegant way? Anyone interested in collaborating?

Best Z. Blace

Emmanuel Engelhart

14 Jul 14 Jul

7:43 a.m.

On 07.07.22 22:25, Željko Blaće wrote:

...

Firstly, thank you all for your fast and useful inputs!

Pardon me for answering all together in a compiled way, with my comments below and inline yours...

On Thu, Jul 7, 2022 at 9:32 AM Emmanuel Engelhart <kelson@kiwix.org mailto:kelson@kiwix.org> wrote:
Given a wiki and a list of article titles, we can make an offline
snapshot easily. To me the question is: What kind of selections? Based
on which approach? Using which data exactly?... and then probably find
to apply it and get the list of article titles.
I can easily say for a well covered monument it would be at least:

Wikipedia pages on the Monument, Author, Location, Time of making

(ideally in at least 2 languages, possibly in two styles like EN & Simple)

Commons page as a Gallery of related images, audio, video...3D model?

(ideally in higher and lower resolutions)

Wikispore page with non-encyclopedic info on location and/or contexts

(ideally artistic, social and geographic focuses)

We can not - for now - create one snapshot with articles coming from multiple Mediawikis. This is an open challenge.

...

On Wed, Jul 6, 2022 at 5:16 PM Stephane Coillet-Matillon <stephane@kiwix.org mailto:stephane@kiwix.org> wrote:
Ok this sounds a lot like our Wikipedia-on-Demand project, 
What is the way to follow it? Public repository?

Project page https://meta.wikimedia.org/wiki/Kiwix/Wikipedia_on_demand.

Most of the work will happen (at least in a first stage) at https://github.com/openzim/wp1

...

ie enter a list of articles on one end and get the corresponding zim
file on the other end. Release is planned for this Fall, so we’ll
make sure to put out a call so people can test it.

Great! Happy to test.

Doing something that works cross-wiki, on the other hand, is another
ball game entirely and not on our roadmap at the moment (but make no
mistake, this would be a very cool feature).

Nice that you think so...happy to brainstorm more if there is good context for this.

If you are ready to invest a bit of time on this, I can send you an invitation to our Slack and put you in touch with the few people involved on this. Kelson

-- Kiwix - Wikipedia Offline & more * Web: https://kiwix.org/ * Twitter: https://twitter.com/KiwixOffline * Wiki: https://wiki.kiwix.org/

Željko Blaće

7:46 p.m.

On Thursday, July 14, 2022, Emmanuel Engelhart kelson@kiwix.org wrote:

...

On 07.07.22 22:25, Željko Blaće wrote:

...
Firstly, thank you all for your fast and useful inputs!

Pardon me for answering all together in a compiled way, with my comments below and inline yours...

On Thu, Jul 7, 2022 at 9:32 AM Emmanuel Engelhart <kelson@kiwix.org mailto:kelson@kiwix.org> wrote:
Given a wiki and a list of article titles, we can make an offline
snapshot easily. To me the question is: What kind of selections? Based
on which approach? Using which data exactly?... and then probably find
to apply it and get the list of article titles.
I can easily say for a well covered monument it would be at least:

Wikipedia pages on the Monument, Author, Location, Time of making

(ideally in at least 2 languages, possibly in two styles like EN & Simple)
...

Commons page as a Gallery of related images, audio, video...3D model?

(ideally in higher and lower resolutions)

Wikispore page with non-encyclopedic info on location and/or contexts

(ideally artistic, social and geographic focuses)

We can not - for now - create one snapshot with articles coming from multiple Mediawikis. This is an open challenge.

On Wed, Jul 6, 2022 at 5:16 PM Stephane Coillet-Matillon <

...
stephane@kiwix.org mailto:stephane@kiwix.org> wrote:
Ok this sounds a lot like our Wikipedia-on-Demand project,
What is the way to follow it? Public repository?
Project page https://meta.wikimedia.org/wiki/Kiwix/Wikipedia_on_demand.

Most of the work will happen (at least in a first stage) at https://github.com/openzim/wp1
ie enter a list of articles on one end and get the corresponding zim
...
file on the other end. Release is planned for this Fall, so we’ll
make sure to put out a call so people can test it.
Great! Happy to test.
Doing something that works cross-wiki, on the other hand, is another
ball game entirely and not on our roadmap at the moment (but make no
mistake, this would be a very cool feature).
Nice that you think so...happy to brainstorm more if there is good context for this.
If you are ready to invest a bit of time on this, I can send you an invitation to our Slack and put you in touch with the few people involved on this. Kelson

OK - please do. I dislike and avoid Slack, but can visit it in web... Finishing some projects soon, so I can contribute more in a Autumn, but can likely lurk and start picking up things before.

Best Z.

Asaf Bartov

11 Jul 11 Jul

8:37 p.m.

Yes, it sounds like the missing link here is a tool for creating the list of resources to offline. Z made one particular specification, but I suppose it could be made a little more general, and potentially even leverage some existing general-purpose pageset curation tools, such as PetScan. AFAIK, PetScan currently doesn't support the use-case of "get me N levels of pages linked from this first page (or from these P first pages)", but we can imagine (and advocate for) PetScan supporting it at some point.

Then, a PetScan query ID (which is enough to generate the page-set) can be an input to the Wikipedia-on-Demand tool, the problem is solved. (Well, almost: we'd still need to specify the logic for collecting related resources -- i.e. none/some/all images included in the pages, Wikidata items, etc.)

Asaf Bartov (he/him/his)

Senior Program Officer, Emerging Wikimedia Communities

Wikimedia Foundation https://wikimediafoundation.org/

Imagine a world in which every single human being can freely share in the sum of all knowledge. Help us make it a reality! https://donate.wikimedia.org

On Thu, Jul 7, 2022 at 10:32 AM Emmanuel Engelhart kelson@kiwix.org wrote:

...

Given a wiki and a list of article titles, we can make an offline snapshot easily. To me the question is: What kind of selections? Based on which approach? Using which data exactly?... and then probably find to apply it and get the list of article titles.

On 06.07.22 11:50, Željko Blaće wrote:

...
Dear fellows - I would like to offline host minimal set of Wiki pages (Wikipedias, Wikispore, Commons) and minimal set of media (images, video and audio) that all branch out from single Reasonator (or Portal.toolforge.org http://Portal.toolforge.org) search... ...and that keep global links beyond 2 degrees (so that one can use them to continue online).

For now I think of just downloading pages offline and hosting them as mobile wiki-to-static pages using wget with manual corrections.

I would love to do this for monuments and tiny libraries like this https://w.wiki/5QWn https://w.wiki/5QWn so maybe have library-like single-file wiki (like https://tiddlywiki.com https://tiddlywiki.com system) so that users could connect to it and live notices.

Anyone has idea if and how to do this better and package in the most elegant way? Anyone interested in collaborating?

Best Z. Blace

Offline-l mailing list -- offline-l@lists.wikimedia.org To unsubscribe send an email to offline-l-leave@lists.wikimedia.org

-- Kiwix - Wikipedia Offline & more

Web: https://kiwix.org/

Twitter: https://twitter.com/KiwixOffline

Wiki: https://wiki.kiwix.org/

Offline-l mailing list -- offline-l@lists.wikimedia.org To unsubscribe send an email to offline-l-leave@lists.wikimedia.org

Emmanuel Engelhart

14 Jul 14 Jul

7:39 a.m.

On 11.07.22 22:37, Asaf Bartov wrote:

...

Yes, it sounds like the missing link here is a tool for creating the list of resources to offline. Z made one particular specification, but I suppose it could be made a little more general, and potentially even leverage some existing general-purpose pageset curation tools, such as PetScan. AFAIK, PetScan currently doesn't support the use-case of "get me N levels of pages linked from this first page (or from these P first pages)", but we can imagine (and advocate for) PetScan supporting it at some point.

Then, a PetScan query ID (which is enough to generate the page-set) can be an input to the Wikipedia-on-Demand tool, the problem is solved. (Well, almost: we'd still need to specify the logic for collecting related resources -- i.e. none/some/all images included in the pages, Wikidata items, etc.)

At a high level, this is indeed the kind of challenge we face now. This is a lack of tool which has been identified already a long time ago at Kiwix. I believe we are now ready to move forward on this because the underlying software pieces are ready.

The overall strategy is to extend wp1.openzim.org (API) to allow to implement sophisticated selection modules. So far, how these modules will look like at the end is really open and all the ideas are welcome (please open tickets at https://github.com/openzim/wp1). Collaborating/Relying with/on PetScan is an idea which should be assessed.

Once a selection done, our Zimfarm infrastructure is ready to build the snapshots (ZIM files). We will probably have to build (a) dedicated frontend(s) to bring these two tools together in a user friendly manner.

This is the goal of the Wikipeda-on-Demand project (WMCH granted) https://meta.wikimedia.org/wiki/Kiwix/Wikipedia_on_demand we have started to work on. We have an other project (related to the war in Ukraine) in the pipe which should even extend this tool.

Kelson

-- Kiwix - Wikipedia Offline & more * Web: https://kiwix.org/ * Twitter: https://twitter.com/KiwixOffline * Wiki: https://wiki.kiwix.org/

747

Age (days ago)

755

Last active (days ago)

offline-l@lists.wikimedia.org

8 comments

5 participants

tags (0)

participants (5)

Asaf Bartov
Emmanuel Engelhart
Samuel Klein
Stephane Coillet-Matillon
Željko Blaće