Re: [Wikipedia-l] Central bibliography

8 Sep 2006


      Lars Aronsson wrote:
...
Ray Saintonge wrote:
...
I don't see where copyright is an issue with this.  The Library 
of Congress is an arm of the United States Congress whose 
primary purpose is to serve U. S. legislators.  That would put 
its work in the public domain.  Is there any reason to believe 
otherwise?
Why don't I see any downloadable dump of their entire database? 
Providing that would be a great goal for the Wikimedia Foundation.
I think that the answer may be quite innocent.  Until Wikimedia came 
along who would have wanted the entire database?  If the demand didn't 
exist, they would have no reason to make it available.
...
Here we're freeing the encyclopedia, news reporting, pictures, and 
why not the library catalog?  Just think about being able to 
import it into MySQL or PostgreSQL on your own computer, and then 
do things like "select count(*)" to find which people translated 
most works from Croatian to Hungarian, and make a [[List of 
translators from Croatian to Hungarian]], so we can make sure we 
have encyclopedia articles for the 50 most active ones.
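
As a rough illustration of the kind of query described above, here is a minimal sketch in Python using the standard sqlite3 module in place of MySQL or PostgreSQL. Everything in it is an assumption made for the example: the table name "catalog", its columns, the language codes, and the sample rows are hypothetical. A real import of MARC records would need a much richer schema, and translator data may not be present in the records at all.

```python
import sqlite3


def top_translators(conn, source_lang, target_lang, limit=50):
    """Rank translators by number of cataloged works for one language pair.

    Assumes a hypothetical table `catalog` with one row per translated work.
    """
    query = """
        SELECT translator, COUNT(*) AS works
        FROM catalog
        WHERE source_lang = ? AND target_lang = ?
        GROUP BY translator
        ORDER BY works DESC, translator ASC
        LIMIT ?
    """
    return conn.execute(query, (source_lang, target_lang, limit)).fetchall()


# Build a tiny in-memory database with made-up rows (not real catalog data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalog ("
    "title TEXT, translator TEXT, source_lang TEXT, target_lang TEXT)"
)
conn.executemany(
    "INSERT INTO catalog VALUES (?, ?, ?, ?)",
    [
        ("Work A", "Translator X", "hrv", "hun"),
        ("Work B", "Translator X", "hrv", "hun"),
        ("Work C", "Translator Y", "hrv", "hun"),
        ("Work D", "Translator Y", "hrv", "ger"),
    ],
)
conn.commit()

# The 50 most active Croatian-to-Hungarian translators in the sample data.
ranking = top_translators(conn, "hrv", "hun")
for translator, works in ranking:
    print(f"{translator}: {works}")
```

The resulting ranking could then seed a page such as [[List of translators from Croatian to Hungarian]]; the same GROUP BY query should run largely unchanged on MySQL or PostgreSQL, apart from the parameter placeholder style of the driver used.
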
Again before such a task was undertaken someone had to imagine that it 
could be done.  As long as the list had to be created manually, the task 
was for all practical purposes impossible.  There are surely many other 
databases that need freeing, and they could be just as free if someone 
else were doing the freeing.  If that other database allows you, at no 
cost, to search in such a way that you can find the information you want, 
is it not effectively free?
...
Today I can download the LoC catalog one MARC record at a time 
through a Z39.50 interface.  So far, I'm not aware of anyone who 
copied the entire catalog this way and provided it for free 
download.  If we had a copy, would the Wikimedia Foundation 
provide it for download?  What does the legal counsel or 
foundation board say?  Do we need a written permission as a legal 
security, or can we simply trust that these U.S. government data 
are in the public domain?  Are they in fact U.S. government data, 
or were they licensed from other sources, and under which terms?
While it's a good thing to investigate these questions more thoroughly, 
it would be pointless if the proposal were technically impossible.  I have 
been looking through http://www.loc.gov/z3950/agency/ where LC is 
indicated as the maintenance agency for Z39.50/ISO 23950.  Nowhere have 
I yet found any mention of copyright for the standard on the site.
This may cover the standard and formats, but what about the content of 
any particular entry?  I would venture to say that it is not 
copyrightable.  Copyright applies to the expression of information, and 
not the information itself.  If the form of expression is predictable, 
as in conforming to a public domain standard, the result would not be 
copyrightable.
One of the greatest threats to open access is the belief that something 
is protected by copyright when it isn't.  Any fair use claim presumes 
that the material used is copyright protected in the first place.  If 
the underlying material is not protected, a fair use claim is redundant.
Things that I have looked at while trying to answer this:
    http://www.earlham.edu/~peters/fos/newsletter/03-02-06.htm#collateral
    http://www.loc.gov/standards/relreport.pdf
    http://www.dlib.org/dlib/march00/coyle/03coyle.html
...
...
Other libraries may have different views concerning their 
material, but how much of their material is not in the LoC 
catalogue?
While the LoC catalog is huge in the number of records, and 
providing it for free download would be a great achievement, the 
assumption that it could replace every other library catalog is 
naive.  For the example above, the LoC rarely catalogs which 
people translated between which languages.  That information (for 
Croatian-Hungarian) is probably only in the catalog of Hungary's 
national library.  For Hofstadter's famous "Gödel, Escher, Bach" 
LoC only finds three hits for three English editions, but none of 
this book's many translations to other languages.  The German 
national bibliography shows 2 English editions, a dozen German 
printings, and 1 each in Dutch, Danish, and Spanish.  The Dutch 
Royal Library lists two English and five Dutch printings, but the 
last one is documented as being the 9th printing, so the catalog 
in fact only covers half of what's been published.  Many Dutch 
Wikipedians are likely to own copies of the other printings, and 
could provide the missing information if the database was Wikicat. 
And these are only languages that are close to English and well 
represented at the Library of Congress.
The Hofstadter example is a good one in that it warns us of the dangers 
of simplistic reduction.  Many of our online colleagues seem to be 
motivated by some desire to make tasks easier.  This is often done by 
ignoring embarrassing complexities.
...
This takes us back to explaining the basics of library & 
information science.  We should have a mailing list dedicated to 
Wikicat and how to free the bibliography.
Perhaps, although I'm not sure we're ready for yet another mailing 
list.  Full scale freeing of bibliographies can easily lead us into what 
amounts to a Union Catalog of private holdings.
Ec
