Dear all,
As some of you may already know, I have been working on a tool to
visualize the etymological tree of words using data extracted from
the English Wiktionary.
This work involved developing software that extracts information
from Wiktionary's textual etymology sections (and other sections as
well) using advanced natural language processing techniques. The data
used by the software is synchronized with the latest release of the
English Wiktionary dump.
A first version of the tool can be tested at
http://tools.wmflabs.org/etytree/etymology/resources/html/index.html
I only had six months to work on this project and I am now asking for
a renewal. The main aspect I want to improve is the visualization,
which currently uses graphs instead of trees (given the nature of the
current data, trees could not be used). More work is also needed to
refine the extraction method, as some relationships are still
extracted incorrectly. For more details on the proposed improvements,
see the page linked above.
This project is a first step towards creating a large database of
etymological relationships that linguists and etymology enthusiasts
can use in many ways.
As only projects that have enough support from the community will be
funded, please leave your feedback in the endorsement section (end of
page) of the grant renewal:
https://meta.wikimedia.org/wiki/Grants:IEG/A_graphical_and_interactive_etym…
I am looking forward to your feedback on the grant page.
Thanks a lot!
Best,
Ester