Oliver Keyes, 05/09/2015 02:24:
Well, we have the implementation of Kolkus's algorithm in Java - although it's a training-based model so it'll need a known dataset to run off.
Niklas made a dataset for one of the available language detectors, using several million translatewiki.net documents in hundreds of languages: https://github.com/nemobis/LanguageDetector/commit/05040b7ec14b0c261fb6462a1... Cf. http://laxstrom.name/blag/2015/03/09/iwclul-33-conversations-and-ideas/
Nemo