[Foundation-l] Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship
Michael Snow
wikipedia at verizon.net
Tue Jun 23 17:44:56 UTC 2009
Brian wrote:
> 2009/6/23 Samuel Klein <meta.sj at gmail.com>
>
>> Yes, but my understanding is that while google provided part of the mbp data
>> and scans, its continued updates to ocr since then are not being shared. I
>> would be glad to learn this was not the case...
>>
> The dataset you need to train an OCR system to be as good as theirs is the
> raw images and the plain text. They aren't making it easy to get either of
> those things :( They have presumably improved the software in other ways as
> well..
>
> WTF GOOG?
>
Well, when your shorthand uses their stock ticker symbol, your argument
has already been coopted.
--Michael Snow