Ooh, excellent! Thanks Nemo!
On 5 September 2015 at 12:33, Federico Leva (Nemo) nemowiki@gmail.com wrote:
Oliver Keyes, 05/09/2015 02:24:
Well, we have the implementation of Kolkus's algorithm in Java - although it's a training-based model so it'll need a known dataset to run off.
Niklas made a dataset for one of the available language detectors, using millions of translatewiki.net documents in hundreds of languages: https://github.com/nemobis/LanguageDetector/commit/05040b7ec14b0c261fb6462a1... Cf. http://laxstrom.name/blag/2015/03/09/iwclul-33-conversations-and-ideas/
Nemo
Wikimedia-search mailing list Wikimedia-search@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikimedia-search