2008/10/2 David Gerard dgerard@gmail.com:
2008/10/2 Andrew Gray shimgray@gmail.com:
An Irish study gives the neat - but high! - estimate that digitising two years' worth of a single newspaper title would take ten staff about one year, at a cost of ~300,000 EUR, of which 10% would be capital investment. I suspect software automation could reduce this heavily. www.askaboutireland.ie/resources/OCR_DigitisationAndTranscriptionOfNewspapers.pdf
How much would just raw scans cost?
I suspect on the order of $1-2 a page - the same as the US scans are getting - with the conversion and OCR run as a batch job on the whole set. Goodness only knows who's doing the indexing...
That is the outsourced price, anyway - the Irish project seems to be entirely in-house, which would presumably explain the high costs.