[WikiEN-l] Wikipedia reaches 3 millionth article
Charles Matthews
charles.r.matthews at ntlworld.com
Wed Aug 19 14:53:42 UTC 2009
Carcharoth wrote:
> How do Google Books and libraries and Project Gutenberg and others do
> mass scanning and OCR of books? Do they use lots of money and funding
> to pay lots of people to do lots of scanning on lots of machines, or
> do they automate it in some way?
>
Google apparently pays peanuts and they certainly didn't automate in the
past - I spend an unconscionable amount of time getting round bad Google
scans, very many of which have parts of the page obscured by a person's
hand. I'm stunned that they don't ask for repeat scans of some unusable
pages. (They may have been on a learning curve.)
Charles