Hi all,
What encoding is used for article data, metadata, category data, etc., and for the title and url strings in the directory? I could not find any documentation on this.
The simplest option for zim viewers would be to define that everything is encoded in UTF-8. This should work with all languages. The other option would be to define the encoding either for the complete zim file (e.g. in the metadata) or on a per-article basis (an html tag in the header). In that case it would make sense to restrict the possible encodings to a small subset, as otherwise readers would not be compatible with all zim files. If per-article encoding is to be supported, it would also be necessary to specify the encoding of the directory entries separately. The disadvantage of this approach is the higher complexity for the reader, in particular with per-article encoding. Furthermore, the definition itself becomes more complex (for example, it needs to be defined which encoding applies if none is specified in an article or in the metadata).
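To illustrate how little a reader would need under the UTF-8-only option, here is a minimal Python sketch. It assumes the url and title in a directory entry are stored as zero-terminated byte strings one after the other; the function name and the sample bytes are made up for this example and are not taken from any existing reader. Scanning for the terminator is safe with UTF-8, because a zero byte only ever encodes U+0000.

```python
from typing import Tuple


def read_zero_terminated_utf8(buf: bytes, offset: int = 0) -> Tuple[str, int]:
    """Decode one zero-terminated UTF-8 string starting at `offset`.

    Returns the decoded text and the offset just past the terminator.
    Hypothetical helper, assuming zero-terminated strings in the entry.
    """
    end = buf.index(b"\x00", offset)        # ValueError if no terminator is found
    text = buf[offset:end].decode("utf-8")  # strict: UnicodeDecodeError on invalid bytes
    return text, end + 1


# Made-up entry bytes: a url followed by a title, both zero-terminated.
entry = "Zürich".encode("utf-8") + b"\x00" + "チューリッヒ".encode("utf-8") + b"\x00"

url, pos = read_zero_terminated_utf8(entry, 0)
title, pos = read_zero_terminated_utf8(entry, pos)
print(url, title)
```

With a per-file or per-article encoding, the same function would need the encoding name as an extra parameter, plus a defined default for the case where none is given.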
I'd prefer to simply define that everything is UTF-8, but I am not sure whether this has drawbacks I am not aware of. In any case, I think it is very important that we define something about encoding, because otherwise we cannot reliably support zim files in all languages.
Best regards, Christian