Hi Sankarshan,
Thank you for the prompt initiative after our talk at the Kolkata Book Fair.
The Bengali Wikipedia community (Wikisource) is ready to do anything
except coding to crack this OCR issue. As you all know, this will not
only help us; it has been a long-awaited wish.
Regards,
Jayanta
On Monday, February 3, 2014, Sankarshan Mukhopadhyay <sankarshan.mukhopadhyay(a)gmail.com> wrote:
Hi Rabindra,
Thank you for writing in.
I am replying as a top-post because I have copied in the mailing list
we use to discuss project ideas (the subscription interface should be
available from
<http://lists.ankur.org.in/listinfo.cgi/project-ideas-ankur.org.in>).
I have also added Jayanta Nath to the list. I met Jayanta yesterday
(after a suitably long period of interactions over email) and we
ended up chatting about the usual: "how to crack this OCR issue in a
manner that helps the Bengali Wikipedia community and, especially,
Wikisource".
I am glad to note that you have taken a look at Abhishek's existing
work. Have you been able to reach out to him and discuss in some level
of detail the current state of the work? The voting piece is somewhat
based on the concept that a larger number of users of the system can
help train the system for a higher degree of accuracy.
ankur.org.in will be putting in an application as a mentoring
organization. However, acceptance into GSoC 2014 is always subject to
[1] a good set of project ideas and [2] reasonable success in the
previous year, among other things. So, there is a period of waiting
before one gets to know about being selected as a mentoring
organization, and only thereafter begins the process of selecting
strong applications from students.
I would recommend that you spend this time catching up with Abhishek
and also Jayanta in order to be able to understand a real-life
utilization of your project (should ankur.org.in be selected and you
be accepted as a student).
/sankarshan
On Mon, Feb 3, 2014 at 12:56 PM, Rabindra Rakshit <rovir2r(a)gmail.com> wrote:
I (Rabindra Rakshit) am interested in applying for GSoC 2014 and would
like to know if Ankur India is applying as a mentoring organization
this year as well.
I am currently pursuing my B.Tech in Computer Science (CSE) at the
College of Engineering and Management, Kolaghat, and, being born a
Bengali, would love to see my language flourish in the open source
community.
I am particularly interested in the project on improving information
retrieval methods for OCR data sets consisting of Indic scripts (Info
Rescue). I had a look at Abhishek Gupta's work plan; the final voting
system, in a general (abstract) form, is yet to be implemented.
I don't have direct experience with OCR, but I do have experience
working with information retrieval systems. In fact, right now I am
working on Consensus Sequence Segmentation, an unsupervised text
segmentation algorithm that relies entirely on statistical
relationships among alphabets in the input sequence to detect the
locations of word boundaries. I have attached a document describing
our work, which is still in progress.
Link:
http://arxiv.org/abs/1308.3839
--
sankarshan mukhopadhyay
<https://twitter.com/#!/sankarshan>