[Wikipedia-l] Central bibliography

Ray Saintonge saintonge at telus.net
Thu Sep 7 20:20:50 UTC 2006


Lars Aronsson wrote:

>Ray Saintonge wrote:
>  
>
>>I don't see where copyright is an issue with this.  The Library 
>>of Congress is an arm of the United States Congress whose 
>>primary purpose is to serve U. S. legislators.  That would put 
>>its work in the public domain.  Is there any reason to believe 
>>otherwise?
>>    
>>
>Why don't I see any downloadable dump of their entire database? 
>Providing that would be a great goal for the Wikimedia Foundation. 
>
I think that the answer may be quite innocent.  Until Wikimedia came 
along, who would have wanted the entire database?  If the demand didn't 
exist, they would have had no reason to make it available.

>Here we're freeing the encyclopedia, news reporting, pictures, and 
>why not the library catalog?  Just think about being able to 
>import it into MySQL or PostgreSQL on your own computer, and then 
>do things like "select count(*)" to find which people translated 
>the most works from Croatian to Hungarian, and make a [[List of 
>translators from Croatian to Hungarian]], so we can make sure we 
>have encyclopedia articles for the 50 most active ones.
>
Again, before such a task can be undertaken someone has to imagine that 
it can be done.  As long as the list had to be created manually, the task 
was for all practical purposes impossible.  There are surely many other 
databases that need freeing, and they could be just as free if someone 
else were doing the freeing.  If that other database allows you, at no 
cost, to search in such a way that you can find the information you want, 
is it not effectively free? 
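
For what it's worth, the query Lars has in mind is easy enough to sketch 
once the data sits in a local database.  What follows is only an 
illustration in Python, with SQLite standing in for MySQL or PostgreSQL.  
The table "translations", its columns, the language codes and the sample 
rows are all invented for the example and do not reflect any real catalog 
schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few made-up translation rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per translated work.
    conn.execute(
        "CREATE TABLE translations ("
        "title TEXT, translator TEXT, source_lang TEXT, target_lang TEXT)"
    )
    rows = [
        ("Work 1", "Translator A", "hrv", "hun"),
        ("Work 2", "Translator A", "hrv", "hun"),
        ("Work 3", "Translator B", "hrv", "hun"),
        ("Work 4", "Translator C", "eng", "hun"),  # different source language
    ]
    conn.executemany("INSERT INTO translations VALUES (?, ?, ?, ?)", rows)
    return conn


def top_translators(conn, source_lang, target_lang, limit=50):
    """Return (translator, work_count) pairs, most prolific first."""
    query = """
        SELECT translator, COUNT(*) AS works
        FROM translations
        WHERE source_lang = ? AND target_lang = ?
        GROUP BY translator
        ORDER BY works DESC, translator ASC
        LIMIT ?
    """
    return conn.execute(query, (source_lang, target_lang, limit)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for name, works in top_translators(conn, "hrv", "hun"):
        print(name, works)
```

The limit of 50 matches the "50 most active" translators mentioned above.  
Whether a real catalog records translators consistently enough for such a 
count to mean anything is a separate question.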

>Today I can download the LoC catalog one MARC record at a time 
>through a Z39.50 interface.  So far, I'm not aware of anyone who 
>copied the entire catalog this way and provided it for free 
>download.  If we had a copy, would the Wikimedia Foundation 
>provide it for download?  What does the legal counsel or 
>foundation board say?  Do we need written permission as a legal 
>security, or can we simply trust that these U.S. government data 
>are in the public domain?  Are they in fact U.S. government data, 
>or were they licensed from other sources, and under which terms?
>
While it's a good thing to investigate these questions more thoroughly, 
it would be pointless if the proposal were technically impossible.  I have 
been looking through http://www.loc.gov/z3950/agency/ where LC is 
indicated as the maintenance agency for Z39.50/ISO 23950.  Nowhere have 
I yet found any mention of copyright for the standard on the site.

This may cover the standard and formats, but what about the content of 
any particular entry?  I would venture to say that it is not 
copyrightable.  Copyright applies to the expression of information, and 
not the information itself.  If the form of expression is predictable, 
as in conforming to a public domain standard, the result would not be 
copyrightable. 

One of the greatest threats to open access is the belief that something 
is protected by copyright when it isn't.  Any fair use claim presumes 
that the material used is copyright protected in the first place.  If 
the underlying material is not protected, a fair use claim is redundant.

Things that I have looked at while trying to answer this:
    http://www.earlham.edu/~peters/fos/newsletter/03-02-06.htm#collateral
    http://www.loc.gov/standards/relreport.pdf
    http://www.dlib.org/dlib/march00/coyle/03coyle.html

>>Other libraries may have different views concerning their 
>>material, but how much of their material is not in the LoC 
>>catalogue?
>>    
>>
>While the LoC catalog is huge in the number of records, and 
>providing it for free download would be a great achievement, the 
>assumption that it could replace every other library catalog is 
>naive.  For the example above, the LoC rarely catalogs which 
>people translated between which languages.  That information (for 
>Croatian-Hungarian) is probably only in the catalog of Hungary's 
>national library.  For Hofstadter's famous "Gödel, Escher, Bach" 
>LoC only finds three hits for three English editions, but none of 
>this book's many translations to other languages.  The German 
>national bibliography shows 2 English editions, a dozen German 
>printings, and 1 each in Dutch, Danish, and Spanish.  The Dutch 
>Royal Library lists two English and five Dutch printings, but the 
>last one is documented as being the 9th printing, so the catalog 
>in fact only covers half of what's been published.  Many Dutch 
>Wikipedians are likely to own copies of the other printings, and 
>could provide the missing information if the database was Wikicat. 
>And these are only languages that are close to English and well 
>represented at the Library of Congress.
>
The Hofstadter example is a good one in that it warns us of the dangers 
of simplistic reduction.  Many of our online colleagues seem to be 
motivated by some desire to make tasks easier.  This is often done by 
ignoring embarrassing complexities.

>This takes us back to explaining the basics of library & 
>information science.  We should have a mailing list dedicated to 
>Wikicat and how to free the bibliography.
>
Perhaps, although I'm not sure we're ready for yet another mailing 
list.  Full scale freeing of bibliographies can easily lead us into what 
amounts to a Union Catalog of private holdings. 

Ec




