Let's not forget the thousands of illustrations that we will now have access to as a result of this.
Tim Starling wrote:
Brian wrote:
For those who don't know, the 1911 Encyclopaedia Britannica is a famous public domain encyclopedia, advertised as the "sum of all human knowledge" in 1911.
Today I acquired a DVD containing scans of every page of the 1911 Britannica, along with index files for it all, organized by letter and page number. I've already talked with avar, TimStarling, and brion on IRC, and TimStarling specifically asked me to tell you all that he is "confident that the server requirements will be minimal." They would set up a domain name, generate some web pages automatically using the index files, and host the entire set of 29,700 files, totaling about 4 GB.
One more thing: these are black-and-white TIFFs, and there is discussion about whether they should be mass-converted to PNGs so that they are easy to view.
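To make the "generate some web pages automatically using the index files" step concrete, here is a rough sketch of how it might look. It is only an illustration: the layout of the index entries (letter, page number, scan filename), the sample file names, and the helper names `png_name` and `build_letter_pages` are all assumptions, not the actual format on the DVD or anything that has been agreed on. It assumes the TIFFs have already been converted to PNGs with the same base name; the conversion itself is not shown.

```python
from collections import defaultdict
from html import escape
from pathlib import PurePosixPath

# Hypothetical index entries: (letter, page number, scan filename).
# The real index files on the DVD may be laid out quite differently.
SAMPLE_INDEX = [
    ("A", 2, "A/0002.tif"),
    ("A", 1, "A/0001.tif"),
    ("B", 1, "B/0001.tif"),
]


def png_name(tif_path):
    """Map a .tif scan path to the .png path a converter would be expected to write."""
    return str(PurePosixPath(tif_path).with_suffix(".png"))


def build_letter_pages(index):
    """Return {letter: html}, one page per letter, linking each scan in page order."""
    by_letter = defaultdict(list)
    for letter, page, tif in index:
        by_letter[letter].append((page, tif))

    pages = {}
    for letter, entries in sorted(by_letter.items()):
        # Sort by page number so links appear in reading order.
        links = "\n".join(
            f'<li><a href="{escape(png_name(tif), quote=True)}">Page {page}</a></li>'
            for page, tif in sorted(entries)
        )
        pages[letter] = (
            f"<html><head><title>1911 Britannica: {escape(letter)}</title></head>\n"
            f"<body><h1>{escape(letter)}</h1>\n<ul>\n{links}\n</ul>\n</body></html>\n"
        )
    return pages


if __name__ == "__main__":
    for letter, html_text in build_letter_pages(SAMPLE_INDEX).items():
        print(f"--- {letter}.html ---")
        print(html_text)
```

Since the output is plain static HTML, serving it should need nothing beyond ordinary file hosting, which fits the expectation that server requirements will be minimal.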
A few notes on this. First, it seems that the guy who made the scans has no intention of claiming any rights to them. He seems to be interested in disseminating the material widely, for religious reasons. His webpage is here:
http://freierscientologe.netfirms.com/booksbritannica.htm
The CD/DVD sets are apparently quite rare; Brian was lucky to get his hands on one at a fairly cheap price.
There's the trademark issue -- Britannica may attempt to scare us with legal threats over this. A disclaimer on every HTML page declaring non-affiliation with Britannica would probably put us on sound legal footing, although I'd be willing to hear advice about this from people who are more knowledgeable. If the "LoveToKnow Free Online Encyclopedia" (1911encyclopedia.org) can host this content, then we should be able to find a way too. And we can do it without the abominable license restrictions and "copyright traps" scattered throughout the work to enforce them.
Wikipedia owes a lot to the 1911 edition -- we've copied many of its articles. A public, canonical copy will be a valuable tool to deal with LoveToKnow's frequent OCR errors, its incompleteness, and its specious legal threats against us based on our use of unspecified copyright material hidden in their doctored online copy. Hopefully the availability of page images will spur development of a complete and accurate OCR copy.
The only question in my mind is the domain: should this be under eb1911.wikipedia.org? We could make it visually distinct, to avoid confusion with Wikipedia itself. Or would eb1911.wikimedia.org be better? Or eb1911.wikisource.org?
-- Tim Starling
foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l