Hello all, I'm starting to learn Nim and appreciating it so far. I wonder if the problem reported here is the reason why my code is not working. My native language is Brazilian Portuguese, and I'm trying to convert a string from UTF-8 to code page 850 for display inside the Windows command prompt.
import encodings
var test = convert("Testando acentuação", "850", "UTF-8")
echo test
The string appears changed, but it still displays incorrectly. Is the previous version 0.14 also affected?
No, and the upcoming 0.15.2 is not affected either. :-)
OK, is 0.15.2 only a few days away? If so, I'm going to wait.
Many thanks.
My native code page is 1251 (Cyrillic). When the console code page is set to 1251 (setConsoleOutputCP(1251)), output to the console works just fine, but when I print Cyrillic text with code page 65001, the first two letters are broken. I did a little investigating and found that write(f: File, c: cstring) works fine, but write(f: File, s: string) prints broken output. Interestingly, the cstring version uses c_fputs, while the string version uses c_fwrite.
A small workaround: in the file <NIMFOLDER>\lib\system\sysio.nim I changed proc write(f: File, s: string) (line 77) to:
proc write(f: File, s: string) = discard c_fputs(cstring(s), f)
# if writeBuffer(f, cstring(s), s.len) != s.len:
# raiseEIO("cannot write string to file")
Voilà! All output is now correct.
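For anyone who would rather not patch the standard library, the same idea can be sketched in user code. This assumes the cstring overload of write behaves as reported above; the helper name writeViaCstring is hypothetical and not part of Nim. Note that a cstring stops at the first NUL byte, so strings with embedded zero bytes would be truncated.

```nim
# Hypothetical helper (not part of the Nim standard library):
# route a Nim string through the cstring overload of write,
# which the post above reports as printing correctly under code page 65001.
proc writeViaCstring(f: File, s: string) =
  # Caveat: a cstring ends at the first NUL byte, so a string
  # containing '\0' would be cut short here.
  f.write(cstring(s))

writeViaCstring(stdout, "Здравей, свят!\n")
```

This keeps sysio.nim untouched, at the cost of having to call the helper explicitly instead of echo or the string overload of write.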
I like the idea of a default UTF-8 code page; frankly, all Nim strings are UTF-8 anyway... can we just throw away that fwrite?
fwrite works perfectly with local code pages (anything other than 65001). fwrite works perfectly with files. fwrite produces incorrect output only when non-ASCII characters are written to stdout with code page 65001.
Example:
proc getConsoleOutputCP(): cint {. importc: "GetConsoleOutputCP", stdcall, dynlib: "kernel32" .}
proc setConsoleOutputCP(codepage: cint): cint {. stdcall, dynlib: "kernel32", importc: "SetConsoleOutputCP", discardable .}
let originalOutCP = getConsoleOutputCP()
setConsoleOutputCP(65001)
echo "Hello, ", "world! ", "Best regards from Nim!"
echo "Здравей, ", "свят! ", "Поздравява те Nim" # same in Bulgarian
echo "Hallo Welt, liebe Gr", "üße von Nim" # same in German
echo ""
write(stdout, "Hello, ", "world! ", "Best regards from Nim!\n")
write(stdout, "Здравей, ", "свят! ", "Поздравява те Nim\n") # same in Bulgarian
write(stdout, "Hallo Welt, liebe Gr", "üße von Nim\n") # same in German
setConsoleOutputCP(originalOutCP)
Output: (Nim 0.15.2)
Hello, world! Best regards from Nim!
Здравей, свят! Поздравява те Nim
Hallo Welt, liebe Grüße von Nim
Hello, world! Best regards from Nim!
��дравей, ��вят! ��оздравява те Nim
Hallo Welt, liebe Gr��ße von Nim
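A side note on the output above: every broken letter shows up as exactly two replacement characters. That matches the fact that each of those letters takes two bytes in UTF-8, which suggests the bytes reach the console but are not decoded as a pair. The short check below only confirms the byte counts; it does not show where in the C runtime the split occurs.

```nim
import unicode

# The letters that came out broken in the write(stdout, ...) output above.
# The source file must be saved as UTF-8 for these literals to be correct.
for s in ["З", "с", "П", "ü"]:
  doAssert s.runeLen == 1   # one Unicode character...
  doAssert s.len == 2       # ...stored as two UTF-8 bytes
  echo s, ": ", s.len, " bytes, ", s.runeLen, " character"
```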
Personally, I dislike that particular behavior. The C runtime that ships with Windows is meant for internal use.
https://blogs.msdn.microsoft.com/oldnewthing/20140411-00/?p=1273