On Saturday 20 June 2009 18:29:24 Brian wrote:
This has reminded me to complain about Google Books. Google has the world's best OCR (by virtue of having the largest OCR'able dataset) and also has a mission to scan in all the public domain books they can get their hands on. They recently updated their interface to, as they put it, "make it easier to find our plain text versions of public domain books. If a book is available in full view, you can click the 'Plain text' button in the toolbar." Unfortunately, the only way I've found to download the full text of a public domain book from Google is to flip through the book a page at a time, copying the text to the clipboard.
Often, these books are available in the Million Books Project too.