After Wikimania, I took a trip to Toronto, where the Internet Archive has a large book scanning center. If you have worked on Wikisource, I think you know that many of the books there come from the Internet Archive, which provides scanned images and OCR text in the form of a DjVu file that can be proofread and presented in Wikisource.
I went there to learn more about how to scan books, but another possibility is to use their existing facilities, which currently are not used at full capacity.
The way the collaboration is set up, the room is provided by the University of Toronto, but the equipment and staff belong to the Internet Archive. The participating libraries around Canada decide which books to digitize and pay a small fee to the Internet Archive for scanning them.
This opens up an interesting opportunity. If we want a particular book to be digitized, and we can find it in the library catalogs of the University of Toronto, it might be possible (in theory, at least) for us to present this wish and perhaps provide the money needed, either directly from the Wikimedia Foundation, or its chapters. Canada is a country with many immigrants and the library system has books in many languages.
This brings me to the question: Which books would be the most important to scan, to help Wikipedia?
For the Swedish and Danish Wikipedia, I know these are the out-of-copyright encyclopedias from the early 20th century, which I already digitized in 2003 and 2008, respectively.
The Czech "Ottuv slovnik" has already been digitized by Google, as the Wikipedia article at http://cs.wikipedia.org/wiki/Ott%C5%AFv_slovn%C3%ADk_nau%C4%8Dn%C3%BD points out. But are the black-and-white scans good enough, or would the Czech community be interested in Internet Archive quality scans in color? If so, we have to figure out whether the Internet Archive will scan it for free, whether the UofT library will pay, whether the Czech chapter can pay (some chapters have money, but are not allowed to send donated money abroad, because of tax deductions), or whether perhaps the WMF can. I saw the Ottuv slovnik on the shelves, in the same building as the scanning stations. It's just waiting for somebody to piece the puzzle together.
But let's begin with a wish list of which books to scan. Of course they need to be out-of-copyright to fit in the Internet Archive, Wikimedia Commons, and in Wikisource. Perhaps illustrated works are more interesting than text?
This would be a GLAM + wiki cooperation that cuts across national borders.
On 07/20/12 6:50 PM, Lars Aronsson wrote:
After Wikimania, I took a trip to Toronto, where the Internet Archive has a large book scanning center. If you have worked on Wikisource, I think you know that many of the books there come from the Internet Archive, which provides scanned images and OCR text in the form of a DjVu file that can be proofread and presented in Wikisource.
I went there to learn more about how to scan books, but another possibility is to use their existing facilities, which currently are not used at full capacity.
Excellent side-trip on your part!!!
The way the collaboration is set up, the room is provided by the University of Toronto, but the equipment and staff belong to the Internet Archive. The participating libraries around Canada decide which books to digitize and pay a small fee to the Internet Archive for scanning them.
How small is small?
This opens up an interesting opportunity. If we want a particular book to be digitized, and we can find it in the library catalogs of the University of Toronto, it might be possible (in theory, at least) for us to present this wish and perhaps provide the money needed, either directly from the Wikimedia Foundation, or its chapters. Canada is a country with many immigrants and the library system has books in many languages.
In due course, this is certainly something that would interest Wikimedia Canada. Multiculturalism has for some years been a key element of social policy.
This brings me to the question: Which books would be the most important to scan, to help Wikipedia?
For the Swedish and Danish Wikipedia, I know these are the out-of-copyright encyclopedias from the early 20th century, which I already digitized in 2003 and 2008, respectively.
The Czech "Ottuv slovnik" has already been digitized by Google, as the Wikipedia article at http://cs.wikipedia.org/wiki/Ott%C5%AFv_slovn%C3%ADk_nau%C4%8Dn%C3%BD points out. But are the black-and-white scans good enough, or would the Czech community be interested in Internet Archive quality scans in color? If so, we have to figure out whether the Internet Archive will scan it for free, whether the UofT library will pay, whether the Czech chapter can pay (some chapters have money, but are not allowed to send donated money abroad, because of tax deductions), or whether perhaps the WMF can. I saw the Ottuv slovnik on the shelves, in the same building as the scanning stations. It's just waiting for somebody to piece the puzzle together.
But let's begin with a wish list of which books to scan. Of course they need to be out-of-copyright to fit in the Internet Archive, Wikimedia Commons, and in Wikisource. Perhaps illustrated works are more interesting than text?
This would be a GLAM + wiki cooperation that cuts across national borders.
Some countries that do not allow sending donated money abroad may look upon the matter differently if the purpose is to repatriate heritage that is no longer available in the home country.
Out-of-copyright where? Is the Internet Archive willing to scan books that are still protected by copyright in the United States without necessarily putting them online itself? When considered in terms of Canadian copyright law there is considerable material published between 1923 and 1961 that we could accept at Wikilivres.
My personal interest is in periodical publications, and these are more problematical than monographs in any country whose copyrights are based on the date of a person's death.
Illustrated works don't have any particular interest for me. In many cases the illustration appearing in a publication is the only available option. At one time I was looking at a series of 1890s articles in "McClure's Magazine" about the world's 100 most famous paintings, with black-and-white illustrations. What a difference it would make if these texts could be made available with an alternative modern colour version of the illustrations.
Ray
I'd like this to be scanned, to start with: http://go.utlib.ca/cat/1211661 ("L'illustrazione italiana"). It's a few dozen volumes; I've seen only one, which my grandma had and which looked helpful.
Nemo
Federico Leva (Nemo), 23/07/2012 20:41:
I'd like this to be scanned, to start with: http://go.utlib.ca/cat/1211661 ("L'illustrazione italiana"). It's a few dozen volumes; I've seen only one, which my grandma had and which looked helpful.
Lars, any news here? Should I contact them directly?
Nemo
Hey Lars,
On 20-Jul-2012, at 7:50 PM, Lars Aronsson wrote:
The way the collaboration is set up, the room is provided by the University of Toronto, but the equipment and staff belong to the Internet Archive. The participating libraries around Canada decide which books to digitize and pay a small fee to the Internet Archive for scanning them.
That sounds awesome! If the book is related to biodiversity, though, it might be easier to ask the Biodiversity Heritage Library to scan it for you -- they take requests (see http://biodiversitylibrary.org/Feedback.aspx), have access to multiple libraries across the US, support multiple languages, and all their content is mirrored to the Internet Archive. I don't know if they'd be as fast as the Internet Archive in Toronto, though! Full disclosure: I'm doing a project with the BHL over the summer.
This opens up an interesting opportunity. If we want a particular book to be digitized, and we can find it in the library catalogs of the University of Toronto, it might be possible (in theory, at least) for us to present this wish and perhaps provide the money needed, either directly from the Wikimedia Foundation, or its chapters. Canada is a country with many immigrants and the library system has books in many languages.
This brings me to the question: Which books would be the most important to scan, to help Wikipedia?
Here are some possibilities from the English Wikisource: http://en.wikisource.org/wiki/Wikisource:Requested_texts -- that page also has interwiki links to the other Wikisources.
All the best with your wishlist!
cheers, Gaurav
Hi Lars and Gaurav,
I would say that the first things to be scanned should be, as Gaurav said, the list of requested texts. I think this not necessarily because they are the most important, but because they are the most likely to be worked on and to become immediately useful.
Max Klein, Wikipedian in Residence, kleinm@oclc.org, +17074787023
-----Original Message-----
From: Gaurav Vaidya [mailto:gaurav@ggvaidya.com]
Sent: Sunday, July 22, 2012 12:51 PM
To: discussion list for Wikisource, the free library
Cc: Wikimedia Chapters cultural partners coordination
Subject: Re: [Wikisource-l] Which books to scan to support Wikipedia
wikisource-l@lists.wikimedia.org