Re: [Wikisource-l] [Commons-l] Image donations?

25 Jun 2006

Ryan Dabler wrote:

...
  On 6/16/06, SJ <2.718281828(a)gmail.com> wrote:

 Museums are good repositories of such information; also non-digitized
 archives.  For them digitization is an expense; if we can reliably
 offer this for free, many will be glad to release copyright in
 exchange for more usable access to their own materials.

 The Library of Congress has a sizable collection of materials that
 they want to distribute more broadly; it is indeed already PD or
 equivalent, but not digitized -- or more commonly, digitized somehow
 but not in many formats, not classified, not easily available.

 A commons-project to create form requests and a queue for processing
 inbound content would be useful.

 You could say the same about archived books that have no commercial
 value anymore.  The same analysis goes for processing book materials
 donated to wikisource; which requires image processing and OCR and
 should perhaps have a commons aspect (raw page images, raw ocr output
 files, images from within the book extracted from the raw page
 images), and a wikisource text aspect (text transcript, translations).
 And again ties to the book industry would be useful here.

 This kind of sounds like a Google Books sort of deal (well, the portion
 of GB which is actually public domain books).  People scan in books, we
 take the scans and present them for free to the world.  Am I right in
 the assessment?  I didn't quite understand what was being stated.

 Anyhow, I think such a proposal would be very exciting, especially if we
 took the scans and had a decent OCR program to convert it to text,
 proof it, and present it on Wikisource.  And of course, taking anything
 from the LoC would practically double (extremely conservative
 estimate--not sure how much they'd be willing to give) our current
 database.

There is no shortage of material that could or should be included.  A
very large proportion of the Google Books material is still not
available, even after the most conservative application of copyright
law.  US Government publications dating before 1923 are still only
available in snippets.  It could very well be a part of Google's agenda
to make these available only for a fee payable to them.  Copyright
notwithstanding, being a unique source of useful material can be a
lucrative venture for Google.  Big as the combined Wikimedia projects
may already be, we are still far from being able to provide adequate
competition to Google Books.

Taking "Scientific American" alone as an example, 16 pages a week for 77
years (1845-1922) yields over 64,000 pages (16 x 52 x 77 = 64,064), and
these are generally large 11" by 16" pages.  Even the most conservative
estimate of the amount of freely available material is staggering.  To
do it justice may require a co-operative effort of all organizations
interested in making this work freely available.

Ec
