Re: [Wikisource-l] Wikimedia Strategy

28 Mar 2017

Another thing I would be very happy to see in the future is a greater,
systematic collaboration with Internet Archive.
I'm convinced that it's a vital part of our ecosystem, because it makes easy
a lot of things that would otherwise have to be done by skilled users (like
creating a PDF/DjVu, running OCR, etc.).
When I explain Wikisource I always explain Internet Archive first,
teaching people to upload their files there, then into Commons/Wikisource
via the "IA Upload" tool.

This is why the Italian Wikisource community created a dedicated collection
on IA:
https://archive.org/details/itwikisource

To create a collection, you need at least 50 items, and then you can ask
Internet Archive to give you permission.
Right now, Alex Brollo is writing some scripts that will allow better
maintenance of the metadata;
we'll share them when they are ready.

If you create a collection, please tell us: we could even have a larger
"Wikisource" collection that contains all the language collections.

Maybe this is a bit OT for the strategy, but I think it suggests a way to
improve the collaboration between us and IA.

On Fri, Mar 24, 2017 at 10:50 AM, Andrea Zanni <zanni.andrea84(a)gmail.com>
wrote:

...
  Anyone else?
 It would be very good to know the gist of the discussions/opinions you are
 having in your local Wikisource.

 The Italian Wikisource for example is summing this up here:
 https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Sources/Italian_Wikisource_Village_pump

 For us, there is a bit of a disagreement about the idea and goal of being
 a "library" versus being a "typography": being a library is more focused on
 access and on services built upon texts (text analysis, text mining,
 searching, hyperlinking, annotation), while the "typography" is the
 transcribing/proofreading part, which needs a whole different level of
 tools and interface.

 Maybe you are having a similar discussion?
 Do you possibly see a "fork", in the future, of Wikisource in 2 different
 projects, or at least 2 different interfaces?

 Aubrey

 On Mon, Mar 20, 2017 at 10:54 PM, Andrea Zanni <zanni.andrea84(a)gmail.com>
 wrote:

 @Micru: of course, as you say, machine learning is the elephant in the
 room.
 I dream of something we could call "Wikisource as a platform":
 meaning an environment with structured data and workflows where you can
 have APIs
 and tools to interact with humans and machines, both for input and for
 output.
 We could have OCR software that learns from our human proofreaders, and
 ideally we could
 even have OCRs tailored for specific centuries or types of books.
 We could use machine learning to look for citations within books (for
 example other cited books or authors).¹
 This could greatly improve our library:
 on Internet Archive or Google Books we have millions of books that just
 wait for us to make them
 readable and accessible, and, of course, connect them to Wikipedia, to
 Wikidata, to other Wikisource books.

 IMHO, this is obviously important for GLAMs:
 we could be much more usable and easy for libraries, archives and museums
 that want to upload into Wikisource their texts and books, and make them
 part of our hyperlinked library.
 They could easily import into Wikisource, and export as well.
 Right now, this is impossible or at least very, very difficult.²

 I'm not sure that all these features could go in just one project, but
 it's probably worth trying.

 Aubrey

 [1] I remember I explored the idea with Amir, but I couldn't follow up.
 [2] To get all the data I needed from Wikisource books, I had to
 basically scrape the website.

 On Mon, Mar 20, 2017 at 8:14 PM, Pine W <wiki.pine(a)gmail.com> wrote:

 Glad to see this discussion. Pinging Alex Stinson for this discussion in
 case he has any insights to add from a GLAM perspective.

 Pine

 On Mon, Mar 20, 2017 at 7:48 AM, David Cuenca Tudela <dacuetu(a)gmail.com>
 wrote:

 On Sun, Mar 19, 2017 at 9:44 PM, Asaf Bartov <abartov(a)wikimedia.org>
 wrote:

> what might be the significant role our unique advantage might play in
> 15 years?
>

 There are some circumstantial aspects that might be relevant for
 Wikisource:
 - With the emergence of machine learning, do volunteers really need to
 spend so much time formatting? Or will we be able to use our data to train a
 system to do some pre-formatting for us?
 - With the existing flood of data, can we consider ws as a relevancy
 setter? If a document has been transcribed/imported into wikisource, is
 that enough to make the document relevant?
 - Considering that not all libraries might have the resources to
 develop their own platform, can Wikisource be used as a neutral platform by
 external agents as a complement to their own infrastructure?

 Regarding the 15-year time frame, it might be a good exercise to
 examine different scenarios. Yes, one could be to think big, to expect
 growth and a favorable environment. But what about the opposite? What if
 there are *fewer* people able to contribute?

 Cheers,
 Micru

 _______________________________________________
 Wikisource-l mailing list
 Wikisource-l(a)lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikisource-l

