Re: [Wikitech-l] GSOC 2012 - Text Processing and Data Mining (John Erling Blad) - Wikitech-l

31 Mar 2012

Thank you very much for your feedback, Jeblad.
I will immediately look into how this can best be implemented by extending
the MediaWiki API.
Kindly let me know your thoughts on my other ideas so that I can shape my
proposal well.
The mentor for the ideas I am interested in is Oren Bochman, but I couldn't
find him on IRC.
I would love to interact with him or any other mentor and discuss my ideas
in detail.
I am reachable at
Email    : karthikprasad008@gmail.com
Skype ID : prasadkarthik
Facebook : facebook.com/prasadkarthik
Google+  : gplus.to/karthikprasad
Twitter  : twitter.com/_karthikprasad
Date: Sat, 31 Mar 2012 12:05:00 +0200
...
From: John Erling Blad jeblad@gmail.com
To: Wikimedia developers wikitech-l@lists.wikimedia.org
Subject: Re: [Wikitech-l] GSOC 2012 - Text Processing and Data Mining
Message-ID:
       <CAJcMX2=Pm-fCm4Dg33uwfcMYhy1RJ4HTE-gPD2mJBzuGzcd7wQ@mail.gmail.com
...
Content-Type: text/plain; charset=windows-1252
Your point (a) "Implementing a wikiSumarizer widget which will give the
summary of the page being read by the user" could be extremely useful for
hover/help-bubble functionality, where bubbles with short explanations
are created within external articles. Such functionality implies creating an
extension to the MediaWiki API.
Jeblad
On Sat, Mar 31, 2012 at 11:09 AM, karthik prasad <karthikprasad008@gmail.com> wrote:
Hello,
I am Karthik from India, currently pursuing the 3rd year of a Bachelor's in
Computer Science and Engineering at PESIT, Bangalore.
I am interested in some of the projects proposed for Google Summer of Code
2012 and would love to work on them and contribute to the open-source world.
I am very drawn to Text Processing and Data Mining. I have taken a
course in Natural Language Processing. I am currently working on
a project, "Automatic Essay Grader": a system that automatically grades
English essays based on Spelling, Grammar and Structure, Coherence,
Frequent phrases and Vocabulary as weighted parameters. It is realized by
implementing a self-designed algorithm that studies the "relation graph" of
the words of the essay.
I have also worked on "Sentiment Analysis on Web": extraction of reviews
about a gadget from tech-review forums and analysis of the sentiments of the
reviews, thus predicting the sentiment/opinion associated with that gadget
and generating an appropriate rating on a scale of 10.
The following projects mentioned on MediaWiki's ideas page caught my
eye:

1) Wikipedia Corpus Tools
2) Lucene Lemma Analyzers based on Morphology Extraction from Wikipedia Text
3) Lucene Automatic Query Expansion from Wikipedia Text
4) Translation spellchecking
Apart from the above projects, I also have the following ideas, which I
feel will be of great help if implemented.
a) Implementing a wikiSumarizer widget which will give the summary of the
page being read by the user.
b) An automatic coherence analyser which would make it easy to find out if
the article on a given page talks about the same topic.
c) Details Aggregator for page.
I would be grateful if you could kindly let me know about the specific
requirements of the projects and about your thoughts on my ideas so that I
can suitably write a proposal.
Eagerly waiting for your response.
Thanking you.
Best Regards,
Karthik.
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l