On 11/08/2012 01:53 PM, Ole Palnatoke Andersen wrote:
The book has been digitized now. I can see it at http://www.kb.dk/e-mat/dod/130019427200.pdf
Looking at this file, pdfinfo says it was created by Finereader Recognition Server and that the page size is 595.44 × 841.68 pts. If "pts" means points of 1/72 inch, that is 8.27 × 11.69 inches, which is A4 size and clearly unrealistic for this book. I would guess that the pages of the physical book are roughly half of that, or 4-5 × 6-7 inches. The images are 1170 × 1873 pixels, so I would estimate the scanning resolution to be in the range 250 to 300 dpi. That's good enough.
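For anyone who wants to redo that arithmetic, here is a small Python sketch. The page size and pixel dimensions are the ones reported above; the 4-5 inch physical width is only my guess, and the function names are made up for this illustration.

```python
# Page size from pdfinfo is in points; this assumes 1 pt = 1/72 inch.
POINTS_PER_INCH = 72.0


def points_to_inches(points):
    """Convert a length in points to inches."""
    return points / POINTS_PER_INCH


def implied_dpi(pixels, inches):
    """Resolution implied by an image dimension and a physical length."""
    return pixels / inches


# Nominal page size stored in the PDF.
page_w_in = points_to_inches(595.44)
page_h_in = points_to_inches(841.68)
print(f"PDF page size: {page_w_in:.2f} x {page_h_in:.2f} inches")

# Resolution implied by the 1170 pixel image width, for each guessed
# physical page width (the guesses are assumptions, not measurements).
for guessed_width_in in (4.0, 5.0):
    dpi = implied_dpi(1170, guessed_width_in)
    print(f"page width {guessed_width_in} in -> {dpi:.1f} dpi")
```

If the nominal page size were the real one, the same image width would imply only about 140 dpi, which is why I don't trust the page size in the PDF.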
The included OCR text looks like this for a text page (page 20 of the PDF, paginated -12-):
ventede den, men meente jeg forstoed ncr- sten alt Norsk, fisen jeg t mine yngre Aar ogsaa havde varet her nogen tort Tltd, og da langt lcrttere kom til rette t daglig Samtale. Imidlertid blev jkg snart vaec at Adflilligheven beroede paa den Kster og Vesterlandske äisleÄcs heel store og
It's not bad that it read "forstoed" correctly, with the long "s" and the old spelling "oe". But on the second line "siden" was read as "fisen", which is wrong. Not a single "æ" is correct, which is odd for software that is supposed to recognize Danish, while "ä" erroneously appears on the last line of this excerpt. This suggests that it is really recognizing the German alphabet, albeit with a Danish dictionary.
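The same check can be done mechanically by counting Danish and German special letters in the OCR text. This is only a rough sketch: the two letter sets and the function name are my own choices, and the sample is just the short excerpt quoted above, so it proves nothing about the whole book.

```python
from collections import Counter

# The OCR excerpt quoted above, copied as-is (errors included).
ocr_excerpt = (
    "ventede den, men meente jeg forstoed ncr- sten alt Norsk, fisen jeg t "
    "mine yngre Aar ogsaa havde varet her nogen tort Tltd, og da langt "
    "lcrttere kom til rette t daglig Samtale. Imidlertid blev jkg snart vaec "
    "at Adflilligheven beroede paa den Kster og Vesterlandske äisleÄcs heel "
    "store og"
)

# Letter sets chosen for this illustration (an assumption, not a standard).
DANISH_LETTERS = "æøåÆØÅ"
GERMAN_LETTERS = "äöüÄÖÜß"


def count_letters(text, letters):
    """Return {letter: count} for those letters that occur in text."""
    counts = Counter(text)
    return {ch: counts[ch] for ch in letters if counts[ch] > 0}


danish_hits = count_letters(ocr_excerpt, DANISH_LETTERS)
german_hits = count_letters(ocr_excerpt, GERMAN_LETTERS)
print("Danish letters found:", danish_hits)
print("German letters found:", german_hits)
```

An empty result for the Danish set together with hits in the German set is what you would expect if the recognizer was working with a German character set.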
This is "the usual quality" for OCR of blackletter (Fraktur), and nothing radically better.
I uploaded the work to http://runeberg.org/glossnor/ with the OCR text provided. It's now ready for your proofreading.
wikisource-l@lists.wikimedia.org