2014-02-28 11:09 GMT+02:00 Roman Zaynetdinov romanznet@gmail.com:
From which source should the data be gathered?
Wiktionary is the best candidate: it is open source and it has a large database. It is also well suited to growing the project by adding different languages.
It's not obvious why you have reached this conclusion.
1) There are many Wiktionaries, and they do not all work the same way or have the same content.
2) The Wiktionary data is relatively free-form text, so it is hard to parse to find the relevant bits.
3) Dozens of people have mined Wiktionary already. It would make sense to check whether they have made the resulting databases available.
4) There are many other sources of data, some of them also open, which may have better coverage, or coverage of specialist areas where the Wiktionaries are lacking.
5) I expect that the best results will be achieved by using multiple data sources.
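To make the last point concrete, here is a minimal sketch in Python of a lookup that consults several sources and reports which ones know the word. The source names and entries below are made up for illustration; they are not real datasets.

```python
def lookup(word, sources):
    """Return (source_name, definition) pairs from every source that has the word."""
    results = []
    for name, entries in sources:
        definition = entries.get(word)
        if definition is not None:
            results.append((name, definition))
    return results


# Hypothetical sources: a general dictionary plus a speciality glossary.
SOURCES = [
    ("wiktionary_dump", {"cat": "a small domesticated feline"}),
    ("medical_glossary", {"stent": "a tube placed in a vessel to keep it open"}),
]

print(lookup("stent", SOURCES))
```

A word missing from the general source can still be found in the speciality one, which is the case a single-source design cannot cover.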
Growth opportunities
I am living in Finland right now and I don't know Finnish as well as I should to understand the locals. This project could therefore be expanded by adding support for more languages, to help people like me read, learn and understand texts in foreign languages.
I hope you have enjoyed your stay here. I do not know how much Finnish you have learned, but after a while it should be obvious that just searching for the exact string the user clicked or selected will not work, because of the agglutinative nature of the language. I advocate features that work in all languages (or at least in many :). If you implement this for English only at first, it is likely that you will have to rewrite it to support other languages.
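As a rough illustration of the exact-match problem, here is a small Python sketch. The dictionary and the suffix list are toy assumptions made up for this example and are nowhere near a complete description of Finnish.

```python
# Hypothetical toy dictionary keyed by base form.
DICTIONARY = {
    "talo": "house",
    "auto": "car",
}

# A handful of Finnish case endings; illustrative only, far from complete.
SUFFIXES = ["ssa", "sta", "lla", "lta", "lle", "n"]


def exact_lookup(word):
    """Look up the string exactly as the user selected it."""
    return DICTIONARY.get(word.lower())


def lookup_with_suffix_stripping(word):
    """Try the exact form first, then retry with one known ending removed."""
    word = word.lower()
    if word in DICTIONARY:
        return DICTIONARY[word]
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            if stem in DICTIONARY:
                return DICTIONARY[stem]
    return None


print(exact_lookup("talossa"))
print(lookup_with_suffix_stripping("talossa"))
```

Even this fallback is too naive for real text: in "kadulla" (from "katu", street) the stem itself changes, so stripping "lla" leaves "kadu", which is not the base form. Handling that properly needs real morphological analysis, which is why the design should not assume exact matching from the start.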
-Niklas