Dear all,
As some of you may already know, I have been working on a tool to visualize the etymological tree of words using data extracted from the English Wiktionary.
This work involved developing software that extracts information from Wiktionary's textual etymology sections (and from other sections as well) using advanced natural language processing techniques. The data used by the software is synchronized with the latest release of the English Wiktionary dump.
A first version of the tool can be tested at
http://tools.wmflabs.org/etytree/etymology/resources/html/index.html
I had only six months to work on this project, and I am now asking for a renewal. The main aspect I want to improve is the visualization, which currently uses graphs instead of trees (given the nature of the current data, trees could not be used). More work is also needed to refine the extraction method, as some relationships are still extracted incorrectly. For more details on the proposed improvements, see the page linked above.
This project is a first step towards the creation of a large database of etymological relationships that linguists and etymology enthusiasts can use in many ways.
As only projects that have enough support from the community will be funded, please leave your feedback in the endorsement section (end of page) of the grant renewal:
https://meta.wikimedia.org/wiki/Grants:IEG/A_graphical_and_interactive_etymo...
I am looking forward to your feedback on the grant page. Thanks a lot!
Best,
Ester