Re: [Wikimedia-search] Analysis of ElasticSearch language detection plugin against enwiki zero-results queries

5 Sep 2015


Oliver Keyes, 05/09/2015 02:24:
> ...
> Well, we have the implementation of Kolkus's algorithm in Java -
> although it's a training-based model so it'll need a known dataset to
> run off.

Niklas made a dataset for one of the available language detectors, using
some millions of translatewiki.net documents in hundreds of languages:
https://github.com/nemobis/LanguageDetector/commit/05040b7ec14b0c261fb6462a1...
Cf. http://laxstrom.name/blag/2015/03/09/iwclul-33-conversations-and-ideas/
Nemo
