Hi,
On 11 August 2016 at 15:10, Mardetanha mardetanha.wiki@gmail.com wrote:
best person would be Bodhisattwa Mandal
Mardetanha
Ah, I doubt that. I have never contributed to Hindi Wikisource (which is still hosted within the multilingual Wikisource).
On Thu, Aug 11, 2016 at 2:10 AM, John Mark Vandenberg jayvdb@gmail.com wrote:
---------- Forwarded message ----------
From: Lane Rasberry lane@bluerasberry.com
Date: Thu, Aug 11, 2016 at 4:38 AM
Subject: [Wikimediaindia-l] seeking help with Hindi projects in Wikisource...
To: Wikimedia India Community list wikimediaindia-l@lists.wikimedia.org
Hello,
Can anyone here refer me to someone who is active in making Hindi-language contributions to Wikisource? I wish to meet someone with experience in that language and project. Otherwise, can anyone suggest to me which Indic languages in Wikisource seem to be most active?
I don't personally know anyone who contributes to Hindi Wikisource, but User:Sfic may be the person you are looking for. His username shows up in the recent contribution history, so he is active now. But as I said, I don't know him personally.
Is anyone able to recommend OCR software for converting scanned Hindi-language documents to digital text? Does anyone know anything about in-Wikisource support for OCR in Hindi? Does it exist? Is there documentation?
Thanks for anything anyone can share.
Yes, I can make a recommendation here.
For the majority of Indic languages, including Hindi, Google OCR [1] is the only available option so far. We have tested and used it for Sanskrit Wikisource, and it gives good results. As both languages use the same Devanagari script, it should work for Hindi too.
The other obvious option is to train Tesseract OCR [2] for Hindi, but that will take time. There is also existing trained data [3] dating from Aug 2014. I don't know how good its output is.
Also, ABBYY does not support Hindi [4].
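If someone wants to try the existing Hindi trained data [3] before investing in training, here is a minimal sketch in Python. It assumes the tesseract command-line tool is installed and that hin.traineddata has been placed in its tessdata directory. The function names and the file names are my own, chosen only for illustration. The last helper gives a rough sanity check by measuring how much of the output falls in the Devanagari Unicode block (U+0900 to U+097F); it does not measure accuracy.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang="hin"):
    """Return the argument list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", image_path, output_base, "-l", lang]


def run_ocr(image_path, output_base, lang="hin"):
    """Run Tesseract and return the recognised text, or None if it is not installed."""
    if shutil.which("tesseract") is None:
        return None
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".txt" to the output base name.
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()


def devanagari_ratio(text):
    """Fraction of non-whitespace characters in the Devanagari block (U+0900-U+097F)."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if "\u0900" <= c <= "\u097f") / len(chars)
```

For example, run_ocr("page.png", "page") would write page.txt next to the scan (page.png is a hypothetical file name), and a low devanagari_ratio on the result would suggest the wrong language data was used or the scan quality is poor.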
[1] https://support.google.com/drive/answer/176692?hl=en
[2] https://github.com/tesseract-ocr/tesseract
[3] https://github.com/tesseract-ocr/tessdata/blob/master/hin.traineddata
[4] https://www.abbyy.com/support/finereader/12/rl/
I hope this helps,
Regards,