I am in a Chinese Windows 10 environment.
When I write Nim code that contains Chinese text,
I must save the source file in ANSI or GB2312 encoding
for my program to work correctly.
But when my program accesses my database,
which is UTF-8 encoded and contains Chinese text,
unreadable characters appear.
Is this an issue?
Please help me, thank you!
Please think about this for us (Chinese, Japanese, and Korean users).
When you write a Nim GUI program,
you must save your source file in ANSI encoding,
and then you cannot read your data with database tools.
°¢빵ٷ�
This is the data I copied from "SQLite Expert".
This data was written by my Nim GUI program.
Again, this is not a problem specific to Nim. It's due to how the Windows console handles encodings, combined with the fact that string literals are encoded in UTF-8. Similar problems arise in Java, C, and C++. The only language that I know of that mitigates this specific problem is Python, and that's because Python's print() procedure automatically tries to encode its input using the console encoding. Even then, there are problems.
Edit: I'll see if I can whip up a uniEcho procedure tomorrow in my free time. It won't be efficient, but it'll do. There are various strategies, but the foolproof ones are complicated.
This is because the Windows console does not handle Unicode output properly (even with wprintf() from the C runtime). You should use UTF-8 for your source code, and when you have to output text to the console, convert it from UTF-8 to your current non-Unicode charset, i.e. GBK in your situation.
One problem remains, though: the ``parseInt`` proc in Nim's ``parseutils`` module is broken, and the ``encodings`` module you need here depends on it, so you will be stuck ...
EDIT: Oh, no, the parseInt thing is really tinycc's problem. I changed the backend to gcc-mingw and it works. I'll give an example here:
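import encodings

# Name of the current console encoding (e.g. gb2312 on a Chinese Windows)
let ce = getCurrentEncoding()
echo "current encoding name: ", ce
# Converter from UTF-8 (the source file's encoding) to the console encoding
let enc = encodings.open(ce, "UTF-8")
echo enc.convert("你好,中文!")
enc.close()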
The code above produces:
d:\test\nim\nim02.exe
current encoding name: gb2312
你好,中文!
d:\test\nim>
Note that the encodings module does not support the charset name GBK; use gb2312 or GB18030 instead if you hard-code it.
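The conversion above can be wrapped in a small helper, along the lines of the uniEcho idea mentioned earlier. This is only a minimal sketch: the name toConsole is made up for illustration, and it assumes the source file is saved as UTF-8 and that getCurrentEncoding() returns a name the encodings module can open.

```nim
import encodings

proc toConsole(s: string): string =
  ## Hypothetical helper: re-encodes a UTF-8 string into the encoding
  ## reported by getCurrentEncoding() (always "UTF-8" on Unix).
  convert(s, getCurrentEncoding(), "UTF-8")

when isMainModule:
  echo toConsole("你好,中文!")
```

Keeping strings in UTF-8 everywhere inside the program and converting only at the point of console output means the database and GUI code never see the console charset.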
When Nim saves user data containing Chinese text into a database (Windows 10 environment), the text is unreadable in the DB.
Which DB do you use?
# file encoding: UTF-8
# nim c -r test.nim
import jester, asyncdispatch, db_mysql

var conn = db_mysql.open("******", "******", "******", "******")

routes:
  get "/":
    # named `query` so the variable does not shadow the sql() proc
    var query = sql("insert into student (title) values (?)")
    conn.exec(query, "测试中文")  # this data in db ---------> 测试ä¸æ–‡
    query = sql("select title from student order by id desc limit 0,1")
    var str = conn.getValue(query)
    resp str  # this data in web page ---------> 娴嬭瘯涓枃

runForever()
db_mysql.close(conn)
I will show my GUI test code in a moment.
# nim c -r --app:gui test.nim
# file encoding: UTF-8
import iup

discard iup.open(nil, nil)
message("测试一下中文", "中文")  # unreadable words
close()
https://dev.mysql.com/doc/refman/5.0/en/charset-applications.html indicates that UTF-8 is not the default.
For your IUP example you need to call storeGlobal("UTF8MODE", "YES") but I couldn't get it to work, most likely because I don't have a Unicode build of iup.dll.
My MySQL charset is UTF-8.
storeGlobal("UTF8MODE", "YES") Yes! This is useful! It fixed the problem! Thank you!
@liulun
IUP has supported UTF-8 since version 3.x.
Also, you should set another global attribute to handle file names on Windows:
if $iup.getGlobal("DRIVER") == "Win32":
  iup.setGlobal("UTF8MODE_FILE", "YES")
@Araq
May I ask whether there is any way to compare two cstrings other than converting them into strings? Obviously the == operator won't work, and neither will cmp.
@Varriount
You're correct, but it doesn't matter for the OP's situation. CJK charsets do not cover each other, and the Windows console does not support outputting text in different code pages at the same time.
@liulun
It's a bit strange; I don't have any problem with MySQL + Nim handling UTF-8.
I have stored many Chinese words in my database.
I don't even use "SET NAMES UTF8" for MySQL.
Also, I use "latin1" as my MySQL default charset (unbelievable, but it works).
Possible suspect: double conversion?
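A double conversion of the kind suspected here can be reproduced in a few lines. The sketch below is an assumption about the mechanism, not a diagnosis of the setup above: it takes UTF-8 bytes, misreads them as windows-1252, and re-encodes the result as UTF-8, which yields garbage similar to what showed up in the database. Reversing the step restores the text. It assumes the source file is saved as UTF-8.

```nim
import encodings

let original = "测试中文"  # UTF-8 bytes, as stored in the source file

# Misread the UTF-8 bytes as windows-1252 and encode the result as UTF-8 again.
let garbled = convert(original, "UTF-8", "windows-1252")

# Undo the mistake: encoding back to windows-1252 yields the original UTF-8 bytes.
let restored = convert(garbled, "windows-1252", "UTF-8")

echo garbled
echo restored == original
```

If stored data can be repaired by one such reverse step, the bytes were most likely converted once too often on the way in, for example because the connection charset differs from the encoding the client actually sends.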
I use InnoDB/UTF-8, and SET NAMES UTF8.
It is not a problem specific to Nim
Thank you all