[WikiEN-l] Wikipedia reaches 3 millionth article

Charles Matthews charles.r.matthews at ntlworld.com
Wed Aug 19 14:53:42 UTC 2009


Carcharoth wrote:
> How do Google Books and libraries and Project Gutenberg and others do
> mass scanning and OCR of books? Do they use lots of money and funding
> to pay lots of people to do lots of scanning on lots of machines, or
> do they automate it in some way?
>   
Google apparently pays peanuts and they certainly didn't automate in the 
past - I spend an unconscionable amount of time getting round bad Google 
scans, very many of which have parts of the page obscured by a person's 
hand. I'm stunned that they don't ask for repeat scans of some unusable 
pages. (They may have been on a learning curve.)

Charles



