How did you determine the list?
Just an idea: wouldn't zimwriter be the place to generate such a list automatically, either during indexing or in a separate pass? Additional parameters could specify the maximum size of the list and the minimum number of occurrences of a word, or maybe the minimum percentage of articles the word occurs in, before it gets marked as trivial.
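To make the idea concrete, here is a rough Python sketch of such a selection pass. It is not zimwriter code: the function name `trivial_words`, its parameter names, and the input format (one list of words per article) are all made up for illustration, and it applies both thresholds together, which is only one possible reading of the proposal.

```python
from collections import Counter


def trivial_words(articles, max_size=100, min_count=2, min_article_fraction=0.5):
    """Pick candidate trivial words (hypothetical helper, not part of zimwriter).

    articles: iterable of word lists, one list per article (assumed format).
    """
    total = Counter()     # total occurrences of each word
    doc_freq = Counter()  # number of articles each word appears in
    n_articles = 0
    for words in articles:
        n_articles += 1
        total.update(words)
        doc_freq.update(set(words))  # count each word once per article
    if n_articles == 0:
        return []
    # Keep words that pass both thresholds.
    candidates = [
        w for w, c in total.items()
        if c >= min_count and doc_freq[w] / n_articles >= min_article_fraction
    ]
    # Most frequent first; ties broken alphabetically for a stable result.
    candidates.sort(key=lambda w: (-total[w], w))
    return candidates[:max_size]


articles = [
    "the cat sat on the mat".split(),
    "the dog ate the bone".split(),
    "a bird sang on the roof".split(),
]
print(trivial_words(articles, max_size=2, min_count=2, min_article_fraction=0.6))
```

In this toy input only "the" and "on" occur in at least 60% of the articles, so they are the only candidates. A real pass would need sensible defaults for the thresholds, and the result would probably still want a manual look.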
I formerly had the word list in a database. From there I just counted the number of occurrences of each word and sorted the list by that count. This is a simple SQL statement. Then I reviewed the list manually and decided which words to skip.
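For illustration, the counting step might look roughly like the following, using Python's built-in sqlite3 module. The table layout `words(article_id, word)` and the sample rows are assumptions made up for this example; the actual database schema is not shown here.

```python
import sqlite3

# In-memory database with a hypothetical schema: one row per word occurrence.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (article_id INTEGER, word TEXT)")
rows = [
    (1, "the"), (1, "zim"),
    (2, "the"), (2, "index"),
    (3, "the"), (3, "zim"),
]
conn.executemany("INSERT INTO words VALUES (?, ?)", rows)

# Count occurrences per word and sort by frequency, most common first.
ranked = conn.execute(
    "SELECT word, COUNT(*) AS n FROM words GROUP BY word ORDER BY n DESC, word"
).fetchall()

for word, n in ranked:
    print(word, n)

conn.close()
```

The sorted output is then the list to go through by hand.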
It would certainly be possible to determine the list automatically. It would just take significant additional processing power. It may be worth trying. The list could be extracted quite easily from the resulting zim index file.
I like that idea. I will try that.
Tommi