There is a new paper out about "Using the Tsetlin Machine to Learn Human-Interpretable Rules for High-Accuracy Text Categorization with Medical Applications" [1], or in our context "…High-Accuracy Text Categorization of unsourced statements".

Their results on text categorization are quite promising. I've been wondering why they get such good results, and I suspect it either has to do with implicit regularization (a kind of dropout) or with other effects that become important when you start comparing really good results. One is that learning binary weights uses less information entropy than learning weights with higher quantization (more bits), so with limited training data more of the available information goes into learning the actual rules. Another possibility is that the learning algorithm finds better minima (actually maxima) than the other algorithms, i.e. it finds stable solutions, that is, the real optimum. A third possibility is that learning is faster because there is no backpropagation, and thus it is more stable and converges faster.
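To make the entropy argument a bit more concrete, here is a rough back-of-the-envelope sketch in TypeScript; the parameter count is made up purely for illustration:

    // Rough illustration of the capacity argument: with the same number of
    // parameters, binary include/exclude weights carry at most 1 bit each,
    // while k-bit quantized weights carry up to k bits each.
    const parameters = 10000;            // assumed model size, illustration only
    const bitsBinary = parameters * 1;   // one include/exclude decision per literal
    const bitsFloat32 = parameters * 32; // a 32-bit weight per connection
    console.log(`binary weights:  ${bitsBinary} bits of capacity`);
    console.log(`float32 weights: ${bitsFloat32} bits of capacity`);
    // With limited training data the smaller hypothesis space acts like a
    // regularizer, so more of the data goes into fixing the rules themselves.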

The generated rules are much easier to handle in the user's own browser, so instead of relying on a central server the text categorization (classification) can be done client-side. That will make the interaction more responsive.
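As a minimal sketch of what that could look like, assuming the learned clauses have been exported as lists of token literals (the export format here is my own assumption, not something from the paper):

    type Literal = { token: string; negated: boolean };
    type Clause = { literals: Literal[]; polarity: 1 | -1 };

    // Classify a statement in the browser by evaluating the exported clauses.
    function classify(text: string, clauses: Clause[]): boolean {
      const present = new Set(text.toLowerCase().split(/\W+/));
      let votes = 0;
      for (const clause of clauses) {
        // A clause fires only if every one of its literals is satisfied.
        const fires = clause.literals.every(l =>
          l.negated ? !present.has(l.token) : present.has(l.token));
        if (fires) votes += clause.polarity;
      }
      // A positive vote sum assigns the statement to the class.
      return votes > 0;
    }

    // Toy example: flagging a statement as possibly unsourced.
    const toyClauses: Clause[] = [
      { literals: [{ token: "reportedly", negated: false }], polarity: 1 },
      { literals: [{ token: "cite", negated: true }], polarity: 1 },
    ];
    console.log(classify("The film reportedly earned millions.", toyClauses));

Since a clause set like this is just data, it could ship to the browser as JSON and be evaluated without any round trip to a server.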

In my opinion this is a neural network. The generated rules can be reformulated as disjunctive normal forms, and then it is more obvious: there are binary weights, weight multiplication is done with and-operators, and summation is done by or-operators.
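For what it is worth, here is how I picture that reformulation, with one binary include-weight per literal, AND standing in for the multiplication and OR for the summation; this is my own toy rewrite, not code from the paper:

    // Each clause keeps one binary weight per literal: 1 = include, 0 = exclude.
    type DnfClause = { include: number[]; includeNegated: number[] };

    // "Weight multiplication": an AND over the included literals of a clause.
    function clauseOutput(x: number[], c: DnfClause): number {
      for (let i = 0; i < x.length; i++) {
        if (c.include[i] === 1 && x[i] === 0) return 0;        // literal x_i fails
        if (c.includeNegated[i] === 1 && x[i] === 1) return 0; // literal NOT x_i fails
      }
      return 1;
    }

    // "Summation": an OR over the clause outputs, i.e. the DNF is true
    // if at least one conjunction is true.
    function dnf(x: number[], clauses: DnfClause[]): number {
      return clauses.some(c => clauseOutput(x, c) === 1) ? 1 : 0;
    }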

There is more background in the paper "The Tsetlin Machine - A Game Theoretic Bandit Driven Approach to Optimal Pattern Recognition with Propositional Logic" [2].

[2] https://arxiv.org/abs/1804.01508

John Erling Blad
/jeblad