As described in https://blog.rust-lang.org/2021/11/01/cve-2021-42574.html
let access_level = "user"
if access_level != "user# check if admin ":
  echo "you are admin"
else:
  echo "you are peon"
Copy and paste the above code into a Unicode-conformant editor and run it. Chances are that it will tell you that you are an admin.
I tested Visual Studio Code and gedit, and both are affected; Vim at least shows some garbage.
My Neovim displayed that code like if access_level != "user<202e> # check if admin ":.
But my Neovim doesn't show any garbage characters in this code:
proc runBadCode() =
  when defined(windows):
    echo "format c:"
  else:
    echo "sudo dd of=/dev/sda1 if=/dev/urandom"
proc echo (x: varargs[string, `$`]) =
  echo x
  runBadCode()
proc еchо(x: varargs[string, `$`]) =
  # https://en.wikipedia.org/wiki/Cyrillic_script
  echo x
  runBadCode()
proc r(x: static string): string =
  # https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms
  runBadCode()
  x
proc ‐(x: int): int =
  runBadCode()
  -x
proc n(x: int): int = 7 + x
echo "Hello"
echo "Initializing ..."
еchо("Loading ...")
echo "Show windows path: ", r"c:\Program Files", ", ", r"c:\windows\system"
echo n ‐ 1
The more I look at this issue, the more I understand why it can't be solved at the language level (other than by disallowing non-ASCII characters). The levels of obfuscation you can reach with Unicode identifiers are just silly.
https://github.com/codebox/homoglyph/blob/master/raw_data/chars.txt
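For what it's worth, the blunt option of disallowing non-ASCII characters is easy to approximate outside the compiler as a lint step. Here is a minimal sketch, not anything Nim itself does: Finding, isBidiControl and scanSource are names made up for this example. It reports every non-ASCII rune in a source string and marks the explicit bidirectional controls (U+202A..U+202E and U+2066..U+2069) of the kind used in the first sample.

```nim
import std/[unicode, strutils]

type
  Finding = object        # hypothetical record type for this sketch
    line, col: int        # 1-based line number and rune column
    codepoint: int        # Unicode code point of the offending rune
    isBidi: bool          # true for explicit bidi control characters

proc isBidiControl(r: Rune): bool =
  ## True for the explicit bidirectional formatting characters
  ## U+202A..U+202E and U+2066..U+2069.
  let c = int(r)
  (c >= 0x202A and c <= 0x202E) or (c >= 0x2066 and c <= 0x2069)

proc scanSource(src: string): seq[Finding] =
  ## Collects every rune above U+007F found in `src`.
  var lineNo = 1
  var col = 0
  for r in runes(src):
    if int(r) == 0x0A:    # newline: move to the next line
      inc lineNo
      col = 0
      continue
    inc col
    if int(r) > 0x7F:
      result.add Finding(line: lineNo, col: col,
                         codepoint: int(r), isBidi: isBidiControl(r))

when isMainModule:
  # First line hides a right-to-left override, second starts with Cyrillic "е".
  let sample = "let level = \"user" & "\u202E" & "\"\n" & "\u0435" & "cho level\n"
  for f in scanSource(sample):
    echo "line ", f.line, ", col ", f.col, ": U+", toHex(f.codepoint, 4),
      (if f.isBidi: " (bidi control)" else: "")
```

This also flags legitimate non-ASCII text in strings and comments, so it is only a tripwire for review, not a fix.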
Of course the type system can ensure this; tainted strings were sort of in the same vein. You have a distinct string type that you use when you don't trust the source you got a string from, then you filter it when you need to do operations. A super brief (and probably dumbly implemented) version is as follows:
import std/[unicode, strformat]

type
  AsciiStr = distinct string
  UnicodeStr = distinct string
  SomeString = string or AsciiStr or UnicodeStr

proc `$`(s: AsciiStr): string {.borrow.}
proc `$`(s: UnicodeStr): string {.borrow.}

proc isAscii(s: string): bool =
  # An ASCII character is exactly one byte in UTF-8.
  result = true
  for x in runes(s):
    if x.size != 1:
      result = false
      break

proc aStr(s: string): AsciiStr =
  # Refuse to build an AsciiStr from anything containing non-ASCII characters.
  assert s.isAscii, fmt"String contains non-ASCII characters."
  result = AsciiStr(s)

const admin = aStr"admin"
static: echo admin
const user = aStr"user"
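One caveat with assert: it can be compiled out (for example with --assertions:off), so it suits compile-time constants better than untrusted runtime input. A possible runtime counterpart, as a sketch only: toAsciiStr and isAdmin are hypothetical names, not part of the code above, and the block redefines AsciiStr so that it compiles on its own.

```nim
import std/options

type
  AsciiStr = distinct string   # same idea as the AsciiStr in the post above

proc `$`(s: AsciiStr): string {.borrow.}
proc `==`(a, b: AsciiStr): bool {.borrow.}

proc toAsciiStr(s: string): Option[AsciiStr] =
  ## Hypothetical runtime check: returns none instead of asserting,
  ## so the caller has to handle rejected input explicitly.
  for c in s:
    if ord(c) > 127:           # any byte above 127 belongs to a non-ASCII rune
      return none(AsciiStr)
  some(AsciiStr(s))

const admin = AsciiStr("admin")

proc isAdmin(input: string): bool =
  ## Only a string that passed the ASCII filter can compare equal to admin.
  let checked = toAsciiStr(input)
  checked.isSome and checked.get == admin

when isMainModule:
  echo isAdmin("admin")
  echo isAdmin("\u0430" & "dmin")   # starts with Cyrillic "а" (U+0430)
```

Because the comparison is only defined between AsciiStr values, a raw string cannot be compared against admin without going through the filter first.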