Andrew Gray wrote:
2009/8/1 John Vandenberg:
On Sat, Aug 1, 2009 at 5:09 PM, Samuel Klein wrote:
Also... *A wiki for book metadata, with an entry for every published work, statistics about its use and siblings, and discussion about its usefulness as a citation (a collaboration with OpenLibrary, merging WikiCite ideas)
Why not just do this in the Wikisource project?
99% of "every published work" are free/libre. Only the last 70 years' worth of texts are restricted by copyright, so it doesn't make sense to build a different project for those works.
I think your estimate's a little off, sadly :-)
Firstly, copyright lasts more than the statutory seventy years, as a general rule - remember, authors don't conveniently die the moment they publish. If we discount the universal one-date cutoff in the US eighty years ago - itself a fast-receding anomaly - extant copyrights probably last about a hundred years from publication, on average.
But more critically, whilst a hundred years is a drop in the bucket of the time we've been writing texts, it's a very high proportion of the time we've been publishing them at this rate. Worldwide, book publication rates now are pushing two orders of magnitude higher than they were a century ago, and that was itself probably up an order of magnitude on the previous century. Before 1400, the rate of creation of texts that have survived probably wouldn't equal a year's output now.
I don't have the numbers to hand to be confident of this - and hopefully Open Library, as it grows, will help us draw a firmer conclusion - but I'd guess that at least half of the identifiable works ever conventionally published as monographs remain in copyright today. 70% wouldn't surprise me, and it's still a growing fraction.
Intuitively, I think your analysis is closer to reality, but, even so, that older 30% is more than enough to keep us all busy for a very long time. To appreciate the size of the task, consider the 1911 Encyclopædia Britannica. It is well in the public domain, and most articles there have a small number of sources which themselves would be in the public domain. Only a small portion of the 1911 EB project on Wikisource is complete to acceptable standards; we have virtually nothing from EB's sources; we also have virtually nothing from any other edition of the EB, even though everything up to the early 14th (pre-1946) is already in the public domain. Dealing with this alone is a huge task.
Having all this bibliography on Wikisource is conceivable, though properly not in the Wikisource of any one language; that would be consistent with my own original vision of Wikisource from the very first day. A good bibliographic survey of a work should reference all editions and all translations of a work. For an author, multiply this by the number of his works. Paradoxically, Wikisource, like Wikipedia and like many other mature projects, has made a virtue of obsessive minute accuracy and uniformity. While we all treasure accuracy, its pursuit can be subject to diminishing returns. A bigger Wikisource community could in theory overcome this, but the process of acculturation that goes on in mature wiki projects makes this unlikely.
Sam's reference to "book metadata" is itself an underestimate of the challenge. It doesn't even touch on journal articles, or other material too short to warrant the publication of a monograph.
Ec