Museums are good repositories of such information; also non-digitized archives. For them digitization is an expense; if we can reliably offer this for free, many will be glad to release copyright in exchange for more usable access to their own materials.
The Library of Congress has a sizable collection of materials that they want to distribute more broadly; it is indeed already PD or equivalent, but not digitized -- or more commonly, digitized somehow but not in many formats, not classified, not easily available.
A commons-project to create form requests and a queue for processing inbound content would be useful.
You could say the same about archived books that have no commercial value anymore. The same analysis goes for processing book materials donated to Wikisource, which requires image processing and OCR and should perhaps have a Commons aspect (raw page images, raw OCR output files, images from within the book extracted from the raw page images) and a Wikisource text aspect (text transcript, translations). And again, ties to the book industry would be useful here.
Finally, source texts that are educationally useful could generate a third set of materials: living wikibooks built on their foundation, updated and improved over time.
SJ <copying all 3 project lists>
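The split described above (a Commons aspect, a Wikisource text aspect, and wikibooks built on top) could be sketched roughly as a data structure. This is only an illustration: every class and field name below is hypothetical, and nothing here reflects an existing Commons or Wikisource tool.

```python
from dataclasses import dataclass, field


@dataclass
class CommonsAssets:
    """Media side of a donated book (hypothetical field names)."""
    raw_page_images: list[str] = field(default_factory=list)
    raw_ocr_files: list[str] = field(default_factory=list)
    extracted_illustrations: list[str] = field(default_factory=list)


@dataclass
class WikisourceTexts:
    """Text side of the same book."""
    transcript_pages: list[str] = field(default_factory=list)
    # Maps a language code to the title of the translated text.
    translations: dict[str, str] = field(default_factory=dict)


@dataclass
class DonatedBook:
    """One donated book, with its parts grouped by the project that hosts them."""
    title: str
    commons: CommonsAssets = field(default_factory=CommonsAssets)
    wikisource: WikisourceTexts = field(default_factory=WikisourceTexts)
    wikibooks_titles: list[str] = field(default_factory=list)

    def needs_ocr(self) -> bool:
        """True when there are more page scans than raw OCR output files."""
        return len(self.commons.raw_ocr_files) < len(self.commons.raw_page_images)


# Usage with made-up file names: two scanned pages, only one OCR'd so far.
book = DonatedBook(title="Example Almanac")
book.commons.raw_page_images += ["page_001.png", "page_002.png"]
book.commons.raw_ocr_files.append("page_001.txt")
print(book.needs_ocr())
```

Keeping the raw scans and OCR output separate from the proofread transcript means each project only has to track the parts it actually hosts.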
On 6/15/06, Magnus Manske magnus.manske@web.de wrote:
I was wondering if there is some kind of organized effort to ask photographers and image agencies for donations (read: GFDL- or CC-licensing) of images.
I am thinking especially of images that we cannot take ourselves; dead celebrities for example (and no, don't go grave-digging ;-)
There must be a huge amount of photos that have next to no commercial value anymore, because they are not good enough for a magazine cover, but would do well for documenting an encyclopedia article. Of course, we would prominently credit the source in the image description (which will be transcluded to every wikipedia that uses it), or even in the image title. Images could be watermarked, of course, and for larger amounts of photos, we'd create a category, gallery and all. Repeated mention (in a good light!) in a project of the wikimedia magnitude might be worth more than paid advertisement, for virtually no cost.
We could even offer a service: I'm sure some of us have (semi-)professional film scanners (I do). Deal goes like this: mail us your films (encyclopedia/commons-style only; not your family picnic ;-) and a note that releases them under GFDL/CC/PD/whatever, and we'll upload them in high-res on commons, where you can download them. Free film digitization!
With people on commons obviously interested in media, there must be some of us with ties to "the industry" who can initiate such contacts. "The Yorck Project" already donated a lot of PD images, as you might remember. If we can get just a few photographers/companies to release images as well, others might follow just to not lag behind.
Magnus _______________________________________________ Commons-l mailing list Commons-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/commons-l
It would be very exciting to get our hands on uncommon materials such as the Library of Congress must have. However, I am confused by your proposal that "A commons-project to create form requests and a queue for processing inbound content would be useful." What do you dislike about the current methods being used to put scanned images on Commons and attach them to the OCR text at Wikisource? I have never used these scripts myself, but judging from the results alone I am not sure how they could be improved on. Looking at the histories of the pages, it seems to work rather quickly as well. It would be very useful if you could give us details about what problems you see with the current system before we start talking of a new set-up.
As to educational materials for Wikibooks, Wikisource does contain a lot of things which would be useful as a base for updated textbooks. I do not know what anyone at Wikibooks is particularly interested in, but there is some room for collaboration, especially with the WikiJunior people IMHO. If anyone at those projects is interested in what we have, email me and I will gather a list for you of what I think might be useful. Sometime in the future I think it would be great if WikiJunior expanded into having a set of purely literary books available. Although it would not actually be adding new content to the world (except in translation perhaps), putting them together would require very little effort. Finding free content illustrations would probably be the most work of anything.
Birgitte SB
This kind of sounds like a Google Books sort of deal (well, the portion of GB which is actually public domain books). People scan in books, we take the scans and present them for free to the world. Am I right in the assessment? I didn't quite understand what was being stated.
Anyhow, I think such a proposal would be very exciting, especially if we took the scans and had a decent OCR program to convert them to text, proof it, and present it on Wikisource. And of course, taking anything from the LoC would practically double (extremely conservative estimate--not sure how much they'd be willing to give) our current database.
Z
There is no shortage of material that could or should be included. A very large proportion of the Google Books material is still not available, even after the most conservative application of copyright law. US Government publications dating before 1923 are still only available in snippets. It could very well be a part of their agenda to make these available only for a fee payable to them. Copyright notwithstanding, being a unique source of useful material can be a lucrative venture for Google. Big as the combined Wikimedia projects may already be, we are still far from being able to provide adequate competition to Google Books.
Taking "Scientific American" alone as an example, 16 pages a week for 77 years (1845-1922) yields over 64,000 pages, and these are generally large 11" by 16" pages. Even the most conservative estimate of the amount of freely available material is staggering. To do it justice may require a co-operative effort of all organizations interested in making this work freely available.
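The page count above can be checked with a few lines of arithmetic. This is a rough sketch: the 16 pages per issue and the 77 years come from the message, while the figure of 52 issues per year is an assumption (one issue every week, with no gaps or double issues).

```python
PAGES_PER_ISSUE = 16  # from the message
ISSUES_PER_YEAR = 52  # assumption: one issue per week, every week
YEARS = 77            # from the message (1845-1922)


def estimated_pages(pages_per_issue: int, issues_per_year: int, years: int) -> int:
    """Total pages for a periodical with a fixed issue size and frequency."""
    return pages_per_issue * issues_per_year * years


print(estimated_pages(PAGES_PER_ISSUE, ISSUES_PER_YEAR, YEARS))  # 64064
```

Under those assumptions the total is 64,064 pages, consistent with "over 64,000".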
Ec
We could even offer a service: I'm sure some of us have (semi-)professional film scanners (I do). Deal goes like this: mail us your films (encyclopedia/commons-style only; not your family picknick;-) and a note that releases them under GFDL/CC/PD/whatever, and we'll upload them in high-res on commons, where you can download them. Free film digitization!
On the subject of digitizing, you might be interested in the project Wikimedia Deutschland just completed: digitizing a 16th century book for Wikisource.
http://www.wikimedia.de/index.php?p=127
It might be an interesting thing to have the chapters and/or the Foundation vouch for the people proposing this kind of service. I believe not everyone is ready to give their pics, files or whatever to people they've never seen and who just say "I have a scanner, I can do this for you". There are practical issues to take into consideration, ranging from conservation of the documents to returning them to their owner.
Compiling a list of people interested in such a project, though, would indeed be a great idea, and commons is probably the place to start.
Delphine
wikisource-l@lists.wikimedia.org