[Foundation-l] Re: Hosting scans of the 1911 Britannica on Wikimedia
Robert Scott Horning
robert_horning at netzero.net
Wed Nov 9 17:41:36 UTC 2005
Anthony DiPierro wrote:
>On 11/9/05, Robert Scott Horning <robert_horning at netzero.net> wrote:
>
>
>>What??? Wikimedia Commons is the best place for images, and indeed
>>there have already been several scans of this encyclopedia that have
>>been put into Wikimedia projects. We don't need to use bit torrents
>>unless this is a move to do bit torrents for all Wikimedia projects
>>(perhaps a good idea but a separate discussion). There is also a
>>license tag that has been specifically established on commons just for
>>content from the 1911 Encyclopaedia Britannica because of the large
>>number of potential images that can come from this source. Look them up
>>right now with the associated categories at
>>http://commons.wikimedia.org/wiki/Template%3APD-Britannica
>>
>>
>
> Wikimedia Commons is the best place for images of text? If that's what
>you're saying, I disagree. I think maybe we were talking about two different
>things, though.
>
No, this is still the same issue. I'm not exactly sure where scanned
pages of historical text ought to go in this case. The images
themselves should be in Commons, and a temporary "Wikiproject" within
Commons to extract those images might find it useful to have the full
scanned pages available. Wikisource also has an image repository
independent of Commons, so that may be more appropriate, but that is
something that ought to be decided within the Wikisource community
itself. Figures and engravings do need to go to Commons.
>
>>One thing I see missing from this discussion is working in cooperation
>>with Distributed Proofreaders, which is not only transcribing the contents
>>of this encyclopedia into plain ASCII text (and XML markup as well), but
>>is also providing scans of the figures and images from within the
>>volumes and making them available with a public domain license. What
>>more do we want here? The slow going on that project with Distributed
>>Proofreaders is something that goes to show how large of a project it is.
>>
>>
>
> AFAIK Distributed Proofreaders hasn't released the raw images out to the
>public. If that's still the case, I'd say *that* is the reason for the slow
>going. The wiki process would be much more efficient.
>
>That trying to organize the content onto a Wiki has been difficult, yes.
>
It is not that difficult to get the raw image scans from Distributed
Proofreaders if you really want them. They are not of the best quality
(DP has other goals in mind) but they are usable for the purpose of
transcription of the text. I also fail to see how using a Wiki for
proofreading is going to be any better than what DP is doing. Indeed,
the DP standards for proofreading are much higher than anything on any
Wikimedia project, and once a text has gone through the proofreading
rounds at DP you can be generally assured of transcription accuracy that
is as good as, if not better than, any other transcription service,
professional or amateur. I have seen the efforts of the 1911
Encyclopaedia Britannica project on Wikisource, and the efforts to
improve textual fidelity for those articles have been absolutely
miserable; Distributed Proofreaders does a much, much better job.
All we are trying to do on Wikisource anyway is to add MediaWiki markup
and links into existing Wikimedia projects like Wikipedia and
Wiktionary where appropriate, as well as to link back to Wikisource for
historical reference.
The few articles where cleanup was attempted because the text did not
come from DP or Project Gutenberg sources have frankly been a joke and
have incredible errors in the transcription. Performing textual
transcriptions of historical documents is simply not something that
MediaWiki software is set up to deal with except on a very limited
basis. Marking up text (adding bold and italics) and adding hyperlinks
is a much more appropriate task and something MediaWiki software does
very well, which is the big strength of Wikisource as a project in general.
>
>
>>That is the real issue here, because you can copyright a scan of an
>>image. Weak copyright protection at best, but you can copyright the
>>scan itself which would in turn force you to have to find the original
>>materials and do the scan separately. In the case of the 1911
>>Encyclopaedia Britannica, however, that is much easier to do than some
>>other older works. Again, working with the Distributed Proofreaders on
>>something like this is going to make life much easier because they have
>>done the scans themselves and are granting explicitly the scanned images
>>and content into the public domain. It also avoids duplication of labor
>>with a huge project like this.
>>
>>--
>>Robert Scott Horning
>>
>>
>
> I thought scans of 2D public domain images were public domain. I've
>certainly read that on Wikipedia somewhere.
>
>Anthony
>
>
This is an area of copyright law that is still working its way through
the court system. Most notable is the assertion of copyright by museums
on classical artwork and by university library special collections
departments that have scanned images of historical works. The only way
they can assert copyright is to claim copyright on the scan or the image
of the artwork, not on the original material itself. For instance,
there is a huge collection of photographs from the University of
Michigan that you can look at here:
http://www.lib.umich.edu/spec-coll/labadie/labadie.html
The University of Michigan is asserting copyright on the whole
collection, even though many photographs in the collection were
physically made before 1923 and therefore would be considered in the
public domain through copyright expiration. Personally, I think there
are some interesting photos in this collection and I'd like to add them
to the Wikimedia Commons, but in this case because of the copyright
assertion I am very reluctant to do so. In this case, it is very
unlikely that I would gain access to the photos in the special
collections area of that library to do my own scans just for licensing
purposes. I am using this as just one example, but it is the kind of
case that all Wikimedia projects should keep in mind, and it shows the
problem with grabbing images at random and assuming that you have the
copyright authority to do with them as you please.
BTW, using Wikipedia as a scholarly reference is hardly a supporting
argument.
--
Robert Scott Horning