I am processing some small text files which come from a number of sources and most of them do not have any indication of the encoding used.
Notepad on Windows is able to guess the encoding and display the files correctly.
How do I do the same in Nim?
For example, looking at some of the files in hex, some have quotation marks stored as the single byte 92 (hex).
For those files I can convert to UTF-8 using something like:
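import std/encodings
# With only a destination given, the source encoding defaults to CP1252.
var encodingConverter = encodings.open("utf-8")
for tmp in txtFileName.lines:
  let line = convert(encodingConverter, tmp)
encodingConverter.close()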
But other files use the byte sequence E2 80 99 for the same character, which is already UTF-8 and doesn't need any conversion, so the above code mangles the result for those files.
One option would be to wrap this library:
https://github.com/Joungkyun/libchardet
Or I could look at the algorithms it implements and rewrite them in Nim. I did not find an existing Nim equivalent in a cursory search of the Nimble directory.
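Short of full detection, the two cases described above could perhaps be handled with a simple check: if the bytes are already valid UTF-8, leave them alone, otherwise assume a single-byte fallback encoding. This is only a minimal sketch, not a replacement for libchardet. The proc name `toUtf8` and its `fallback` parameter are hypothetical, and CP1252 as the fallback is an assumption based on the 92 byte seen in my files.

```nim
import std/[encodings, unicode, strutils]

proc toUtf8(raw: string, fallback = "CP1252"): string =
  ## Hypothetical helper: returns `raw` as UTF-8.
  ## Assumes the input is either UTF-8 or the `fallback` encoding.
  const bom = "\xEF\xBB\xBF"
  if raw.startsWith(bom):
    # A UTF-8 byte order mark settles the question; strip it.
    return raw.substr(bom.len)
  if validateUtf8(raw) == -1:
    # -1 means no invalid byte was found, so keep the text as-is.
    # Pure ASCII also ends up here, which is harmless.
    return raw
  # Not valid UTF-8: treat it as the fallback encoding (an assumption).
  convert(raw, "UTF-8", fallback)
```

Since the files are small, it seems easier to read each one whole with `toUtf8(readFile(txtFileName))` and split into lines afterwards, so that the decision is made once per file rather than per line. A file in some other single-byte encoding would still be misread as CP1252, which is where a real detector would be needed.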