Scan quality is excellent.

Yes, this is a very promising approach. My suggestion is to always get the scans as TIFF if possible (they are large, but USB sticks are large too...), to convert them into an image-only PDF (which is the simplest tool for this?), and to upload a copy to the Internet Archive, crediting both the library where the book was scanned AND the Wikisource contribution of scanning, merging the TIFFs, and uploading to IA.

Then the excellent OCR -> DjVu output produced by IA can be downloaded and uploaded to Commons. A good way to share everything, IMHO.

In the meantime: IA also produces an extremely interesting ABBYY.gz output; it is an XML file in which an incredible amount of interesting data is recorded for every scanned character. Here is an example for a random character from a random IA book:

<charParams l="1356" t="680" r="1544" b="884" wordStart="false" wordFromDictionary="true" wordNormal="true" wordNumeric="false" wordIdentifier="false" charConfidence="25" serifProbability="100" wordPenalty="0" meanStrokeWidth="347">G</charParams>

Something to explore in depth, IMHO; I presume that less than 1% of the usable ABBYY scan data is wrapped into the DjVu as its OCR layer.
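
To give an idea of how this per-character data could be explored, here is a minimal Python sketch using only the standard library. It parses the charParams example above and pulls out the bounding box and the confidence value. The function names, the dictionary keys and the threshold of 50 are my own choices for illustration, not anything defined by IA or ABBYY; I am also assuming that real files may wrap the elements in an XML namespace, so the code matches on the local tag name only.

```python
import gzip
import xml.etree.ElementTree as ET

# The sample element quoted above, copied verbatim.
SAMPLE = (
    '<charParams l="1356" t="680" r="1544" b="884" wordStart="false" '
    'wordFromDictionary="true" wordNormal="true" wordNumeric="false" '
    'wordIdentifier="false" charConfidence="25" serifProbability="100" '
    'wordPenalty="0" meanStrokeWidth="347">G</charParams>'
)


def parse_char_params(elem):
    """Turn one charParams element into a plain dict (keys are illustrative)."""
    a = elem.attrib
    conf = a.get("charConfidence")
    return {
        "char": elem.text,
        # left, top, right, bottom as given in the attributes
        "box": tuple(int(a[k]) for k in ("l", "t", "r", "b")),
        "confidence": int(conf) if conf is not None else None,
        "from_dictionary": a.get("wordFromDictionary") == "true",
    }


def is_char_params(elem):
    # Compare the local name only, in case the file uses an XML namespace.
    return elem.tag.rsplit("}", 1)[-1] == "charParams"


def iter_chars(root):
    """Yield a dict for every charParams element below (or equal to) root."""
    for elem in root.iter():
        if is_char_params(elem):
            yield parse_char_params(elem)


def iter_chars_from_gz(path):
    """Stream a gzipped ABBYY file; 'path' is a hypothetical local file name."""
    with gzip.open(path, "rb") as fh:
        for _event, elem in ET.iterparse(fh, events=("end",)):
            if is_char_params(elem):
                yield parse_char_params(elem)
                elem.clear()  # drop the element's content to limit memory use


def low_confidence(chars, threshold=50):
    """Characters whose confidence is below an arbitrary threshold."""
    return [c for c in chars
            if c["confidence"] is not None and c["confidence"] < threshold]


chars = list(iter_chars(ET.fromstring(SAMPLE)))
suspects = low_confidence(chars)
```

With the sample element, `chars` holds a single entry for "G", and `suspects` contains it too, because its charConfidence of 25 is below the threshold chosen here. Whether a low value really means a doubtful character is something to check against real pages.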


2013/6/13 Lars Aronsson <>
Some research libraries in Stockholm (at archives and
museums) have put up book scanners that the public
can use. They have the same function as a public
copier, but you get your copies on a USB stick rather
than on paper.

This opens an interesting opportunity for Wikisource and
similar volunteer book scanning projects. Instead of
buying expensive equipment, experimenting with
cameras and lighting, or building your own scanner,
you can just visit such a library. I guess you can even
bring your own book and scan it there, instead of just
using the library's books. (Of course you still need to
consider copyright. That goes without saying.)

Wikimedia Sverige, the Swedish chapter of the WMF,
started a wiki page to document some experience
from this kind of use, in Swedish of course,

Here is an example of a book that was scanned this way,
(Ironically, it is the annual report for 1897 of the museum
where it was scanned. They have the scanner standing in
their own library, but they have not scanned their own

Are you familiar with anything similar? Any other pages
that we should link to?

  Lars Aronsson (

  Wikimedia Sverige - stöd fri kunskap -

  Project Runeberg - free Nordic literature -

Wikisource-l mailing list