[Wikisource-l] Vote for Google OCR-Wikisource integration in 2015 community wishlist

1 Dec 2015


Hi,
For a long time, the Indic-language Wikisource projects depended
entirely on manual proofreading, which cost a great deal of time and
energy. Recently Google released OCR software for more than 20 Indic
languages, along with other Asian languages. This software is far
more accurate than the previous OCRs, but it has many limitations.
Uploading the same large file twice (once for Google OCR and once to
Commons) is not an easy solution for most contributors, as Internet
connections in India are very slow. If we develop a tool that can
feed the PDF or DjVu files already uploaded to Commons directly to
Google OCR, the second upload can be avoided.
This was proposed in the 2015 Community Wishlist. Now that voting on
the wishlist has started, the proposal needs your support. Please
follow the link:
https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Wikisource#To...
FYI, this proposal was also accepted as a highest-priority need at the
2015 Wikisource Conference in Vienna.
(https://etherpad.wikimedia.org/p/wscon2015needs)
Regards
-- 
Bodhisattwa Mandal
Administrator, Bengali Wikipedia

''Imagine a world in which every single person on the planet is given
free access to the sum of all human knowledge.''
