Brian wrote:
2009/6/23 Samuel Klein meta.sj@gmail.com
Yes, but my understanding is that while Google provided part of the MBP data and scans, its continued updates to the OCR since then are not being shared. I would be glad to learn this was not the case...
The dataset you need to train an OCR system to be as good as theirs is the raw images plus the plain text. They aren't making it easy to get either of those things :( They have presumably improved the software in other ways as well.
WTF GOOG?
Well, when your shorthand uses their stock ticker symbol, your argument has already been co-opted.
--Michael Snow