Dear all,
As some of you may already know, I have been working on a tool to visualize the etymological tree of words using data extracted from Wiktionary.
This work involved developing software that extracts information from Wiktionary's textual etymology sections using regular expressions and a context-free grammar.
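For readers unfamiliar with this kind of extraction, here is a minimal sketch of the regular-expression step only. It is an illustration, not the tool's actual code: the pattern, the function name `extract_templates`, and the sample string are all hypothetical, and the context-free grammar stage is not shown.

```python
import re

# Hypothetical pattern: matches flat templates such as {{m|la|aqua}}.
# Nested templates are deliberately not handled in this sketch.
TEMPLATE = re.compile(r"\{\{([^{}|]+)((?:\|[^{}|]*)*)\}\}")


def extract_templates(wikitext):
    """Return a list of (template_name, [arguments]) found in the text."""
    found = []
    for match in TEMPLATE.finditer(wikitext):
        name = match.group(1).strip()
        # group(2) looks like "|la|aqua"; drop the empty piece before the first "|".
        args = [arg.strip() for arg in match.group(2).split("|")[1:]]
        found.append((name, args))
    return found


# Made-up etymology fragment, used only for illustration.
sample = "From {{etyl|la|en}} {{m|la|aqua}}."
print(extract_templates(sample))
```

A real etymology section would need many more cases (nested templates, free text between templates), which is where a grammar becomes useful.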
A first version of the tool can be tested at
http://tools.wmflabs.org/etytree/etymology/resources/html/index.html
I have also set up a SPARQL endpoint at http://etytree-virtuoso.wmflabs.org/
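As a rough sketch of how the endpoint could be queried over HTTP, the snippet below builds a GET request URL following the standard SPARQL protocol (`query` parameter). The `/sparql` path is an assumption based on the usual Virtuoso default and may differ on this server; the query is a generic one that makes no assumption about the vocabulary used in the data.

```python
from urllib.parse import urlencode

# Assumption: the service is exposed at Virtuoso's usual /sparql path.
ENDPOINT = "http://etytree-virtuoso.wmflabs.org/sparql"

# Generic query: fetch any ten triples, whatever vocabulary the data uses.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_url(query, endpoint=ENDPOINT):
    """Return a SPARQL-protocol GET URL for the given query string."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(QUERY)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the result format can be requested through the `Accept` header.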
I only had six months to work on this project, and I am now asking for a renewal. The main aspect I want to improve is the visualization, which currently uses graphs instead of trees (given the nature of the current data, trees could not be used). The data extraction method also needs to be tailored to specific languages that use special structures and are currently extracted incorrectly.
Please leave your feedback in the endorsement section (at the end of the page) of the renewal proposal
https://meta.wikimedia.org/wiki/Grants:IEG/A_graphical_and_interactive_etymo...
as only projects that have enough support from the community will be funded.
Best,
Ester
wiki-research-l@lists.wikimedia.org