Hello,
I'm a computer science researcher at the University of Avignon, in France. I recently developed software that automatically and quickly extracts from a UTF-8 text all the (longest) terms that belong to a large set of terms. The term extractor runs as a server, and I tested it successfully with a thesaurus made of the page titles of fr.wikipedia.org, en.wikipedia.org and es.wikipedia.org, i.e. 9,387,079 distinct terms composed of 4,496,195 distinct words. You are invited to test my demonstration at: http://dev.termwatch.es/~jourlin/demo.php The source code can be found on GitHub (use, redistribution and modification under the terms of the GNU General Public License v3): https://github.com/jourlin/FELTS
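To make the idea of extracting the longest matching terms concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions (whitespace tokenization, case-sensitive matching, a word-level trie held in memory) and is not the FELTS implementation; the names END, build_trie and extract_longest_terms are hypothetical.

```python
# Hypothetical sketch of longest-match term extraction with a word-level trie.
# Assumptions: whitespace tokenization, case-sensitive matching.

END = None  # marker key: the path from the root to this node is a complete term


def build_trie(terms):
    """Build a nested-dict trie keyed by words."""
    root = {}
    for term in terms:
        node = root
        for word in term.split():
            node = node.setdefault(word, {})
        node[END] = True
    return root


def extract_longest_terms(text, trie):
    """Scan left to right, keeping the longest term starting at each position."""
    words = text.split()
    found = []
    i = 0
    while i < len(words):
        node = trie
        longest = 0  # length in words of the longest term starting at i
        j = i
        while j < len(words) and words[j] in node:
            node = node[words[j]]
            j += 1
            if END in node:
                longest = j - i
        if longest:
            found.append(" ".join(words[i:i + longest]))
            i += longest  # continue after the matched term
        else:
            i += 1
    return found


if __name__ == "__main__":
    trie = build_trie(["New York", "New York City", "York"])
    print(extract_longest_terms("I visited New York City yesterday", trie))
    # prints ['New York City']
```

With this layout, the cost of a lookup depends on the length of the text and of the longest term rather than on the number of terms in the thesaurus, which is one plausible reason a structure of this kind suits a very large term set.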
I guessed that it could be of some interest for the development of MediaWiki, but I would very much appreciate any feedback before I look further into that question.
Best regards,
Pierre Jourlin.
You are invited to test my demonstration at : http://dev.termwatch.es/~jourlin/demo.php
Sorry for this omission: the username is master and the password is sitnosh.
Hi Pierre,
You are invited to test my demonstration at : http://dev.termwatch.es/~jourlin/demo.php
I get this message:
A username and password are being requested by http://dev.termwatch.es. The site says: "Par invitation seuleument / By invitation only / Solo con autentificacion"
Lars Aronsson <lars <at> aronsson.se> writes:
Hi Pierre,
You are invited to test my demonstration at : http://dev.termwatch.es/~jourlin/demo.php
I get this message:
A username and password are being requested by http://dev.termwatch.es. The site says: "Par invitation seuleument / By invitation only / Solo con autentificacion"
Sorry about that. Use master as the username and sitnosh as the password. Thanks for your time!
wikitech-l@lists.wikimedia.org