Hi, I'd appreciate help understanding why the below example fails. Thanks for looking!
# test.nim
proc main =
  let str = "hello"
  var sptr = string.create(str.len)
  #copyMem(sptr, unsafeAddr str, str.len) # this works
  #sptr[] = str # this also works
  # but let's try manual copy
  for i in 0 ..< str.len:
    sptr[][i] = str[i]
  echo sptr[]
when isMainModule:
  main()
Output
Error: unhandled exception: index out of bounds, the container is empty [IndexError]
Nim 1.2.0, nim c test.nim
I didn't troubleshoot the code, but from a quick look, the pointer is basically empty and uninitialized.
Kinda like doing nil[1] = 'e'
@juancarlospaco Thanks. Yes, that's the error message. But I'm using create, which the docs state:
create(): The block is initialized with all bytes containing zero,
so it is somewhat safer than createU.
and I can definitely use copyMem with the allocated memory region. I'm still at a loss.
Maybe manually allocated strings like this can only be achieved using UncheckedArray[char]?
# test.nim
proc main =
  let str = "hello"
  var sptr = UncheckedArray[char].create(str.len)
  for i in 0 ..< str.len:
    sptr[][i] = str[i]
  echo sptr[]
when isMainModule:
  main()
Maybe manually allocated strings like this can only be achieved using UncheckedArray[char]?
It depends on your use case. newString() also supports taking a length if all you want is to create a pre-allocated buffer.
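A minimal sketch of the newString() route, assuming all you need is an ordinary managed string with room for the characters. The proc name manualCopy is made up for illustration:

```nim
proc manualCopy(src: string): string =
  ## Copies `src` character by character into a pre-sized string.
  result = newString(src.len)   # length is src.len, so indices 0 ..< src.len are valid
  for i in 0 ..< src.len:
    result[i] = src[i]

echo manualCopy("hello")
```

Because newString sets the length up front, the indexed writes stay in bounds, which is exactly what the failing example was missing.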
When you allocate the string using create, it is initialized with zeroes. So its capacity and its length are zero, as if it had been assigned "". Then, if you assign the whole string at once, it works, but not if you assign each element individually.
But this works:
# test.nim
proc main =
  let str = "hello"
  var sptr = string.create(str.len)
  #copyMem(sptr, unsafeAddr str, str.len) # this works
  #sptr[] = str # this also works
  # but let's try manual copy
  for i in 0 ..< str.len:
    sptr[].add(str[i])
  echo sptr[]
when isMainModule:
  main()
and this also works:
# test.nim
proc main =
  let str = "hello"
  var sptr = string.create(str.len)
  #copyMem(sptr, unsafeAddr str, str.len) # this works
  #sptr[] = str # this also works
  # but let's try manual copy
  sptr[].setLen(str.len)
  for i in 0 ..< str.len:
    sptr[][i] = str[i]
  echo sptr[]
when isMainModule:
  main()
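To see the zero length directly, here is a small sketch that reads the length of a freshly created string slot before and after setLen. The proc names zeroedLen and lenAfterSetLen are made up for illustration, and the sketch allocates a single slot with create():

```nim
proc zeroedLen(): int =
  ## Length of a string slot fresh from `create`, before any setLen/add.
  let sptr = string.create()   # one zero-initialized string slot
  result = sptr[].len
  dealloc(sptr)

proc lenAfterSetLen(n: int): int =
  ## Length of the same kind of slot once setLen has made room.
  let sptr = string.create()
  sptr[].setLen(n)
  result = sptr[].len
  reset(sptr[])                # drop the string's buffer before freeing the slot
  dealloc(sptr)

echo zeroedLen()
echo lenAfterSetLen(5)
```

The first proc should report an empty string, which is why indexing into it trips the bounds check.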
The documentation for create (cited above) can be a bit misleading. I tried to read it in context, but could not find it.
Where or how can I find create.string in the nim-docs?
I think I found it here:
https://nim-lang.org/docs/system.html#create%2Ctypedesc
Reading this, I think that the example code is wrong anyway, because sizeof(string) is 8 (not 1).
I haven't used create, but from reading the documentation it is clear that the parameter size, whose default value is 1, is the number of elements to allocate. That is, for a string, 8 bytes for a pointer and, under the hood, 8 bytes for the length, 8 bytes for the capacity, and 0 bytes for the actual content, as the capacity is zero.
So your example is wrong. You have indeed allocated memory for str.len strings, that is, 5 strings. And, as I said in my previous comment, each string's capacity is zero, so there is no room to write directly into them. You have to either make room using setLen or use add.
The corrected code will be, for instance:
# test.nim
proc main =
  let str = "hello"
  var sptr = string.create() # Allocate one string.
  sptr[].setLen(str.len) # Make room to store the chars.
  for i in 0 ..< str.len:
    sptr[][i] = str[i]
  echo sptr[]
when isMainModule:
  main()
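If the pattern is needed in more than one place, it could be wrapped up roughly like this. This is only a sketch: heapCopy and release are made-up names, and the cleanup (reset followed by dealloc) is my assumption about how to release both the string's buffer and the slot itself, not something the docs spell out for this case:

```nim
proc heapCopy(src: string): ptr string =
  ## Allocates ONE string slot and copies `src` into it char by char.
  result = string.create()        # size defaults to 1 element
  result[].setLen(src.len)        # make room for the characters
  for i in 0 ..< src.len:
    result[][i] = src[i]

proc release(p: ptr string) =
  ## Frees what heapCopy allocated.
  reset(p[])                      # drop the string's own buffer first
  dealloc(p)                      # then free the slot itself

let p = heapCopy("hello")
echo p[]
release(p)
```

Since the slot is manually allocated, nothing frees it automatically, so every heapCopy needs a matching release.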
Fantastic explanation; Thank you so much for taking the time to explain this and look into the issue. I had forgotten that, of course, strings are smart objects that check their length, and have a setLen() for cases like this. :facepalm:
Humorously, I wrote an entire tool predicated on my faulty knowledge, have been using it in production, and only discovered there were issues when one day I tried to compile with the arc GC, which told me I was doing something naughty with memory. :p
Yes, this can become a security problem in some cases to inject unwanted behaviours in existing code.
Programmer Alice uses 2 libraries A and B written by two different authors. A provides a general proc foo[X](x: X) that Alice uses in her code. She tests her program and releases version 1.
Later on, B's author adds a more specific proc foo(u: int) in module B. When Alice prepares version 2 of her code, the new foo is now called through overloading, even though she did not touch the original part of the code that was well tested when version 1 was released...
Moral: always run the full test suite, with complete coverage...
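The scenario can be mimicked in a single file. This is only a sketch: the two foo procs stand in for libraries A and B, and the returned strings are made up so the chosen overload is visible:

```nim
# Stand-in for library A: a generic proc that accepts anything.
proc foo[X](x: X): string = "A.foo (generic)"

# Alice's version 1: only the generic proc is visible at this point.
doAssert foo(42) == "A.foo (generic)"

# Stand-in for library B's later release: a more specific overload.
proc foo(u: int): string = "B.foo (int)"

# Alice's version 2: the same call text now resolves to the new overload,
# while calls with other types still reach the generic proc.
doAssert foo(42) == "B.foo (int)"
doAssert foo("text") == "A.foo (generic)"
```

The call site foo(42) is textually identical in both versions. Only the set of visible overloads changed.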
Replying to myself...
In order for Alice to prevent such a situation, she could force herself to use the full module prefix in proc calls, with module-qualified access.
from A import nil
from B import nil
That way, she can control her use of A and B features at the expense of more typing...
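A runnable sketch of the same idea, using the standard strutils module in place of the hypothetical A and B. The proc shout is made up for illustration:

```nim
from strutils import nil

proc shout(s: string): string =
  ## Every call into strutils has to name the module explicitly.
  strutils.toUpperAscii(s)

doAssert shout("abc") == "ABC"

# The unqualified name is not in scope, so it cannot be picked up
# by accident through overload resolution.
doAssert not compiles(toUpperAscii("abc"))
```

With this import style, a newly added proc in another module cannot silently take over an existing call, because each call names the module it targets.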
Yes, this can become a security problem in some cases to inject unwanted behaviours in existing code.
Before considering this a "security concern", consider the fact that if I wanted to do anything harmful I could just write this into my module:
static:
  discard staticExec("rm **/*.*")
and then importing is enough to delete some files (if I got the syntax correctly :) )
If you're including other people's code in your project you just have to trust them in the same way you trust it to work.
Disclaimer: Based mostly on my C++ experience with this kind of problem, did not run into it with Nim and can't test this second.
If you're including other people's code in your project you just have to trust them in the same way you trust it to work.
Linus Torvalds has said more than once that "security bugs are not special, they are just bugs that have security implications" (paraphrasing from memory), and I think he's right - the reason we (in general) care more about them is that rather than producing a wrong answer or no answer at all, they allow a determined malicious actor to willfully do disproportional damage.
To put this in concrete terms, instead of foo(x), assume it's draw(x): Library A deals with drawing on a canvas, and Library B deals with drawing money from a bank account. For whatever reason, Library A's draw[T](x:T) is a template that can draw anything that can be converted to a string, and Library B's previous version did NOT have a draw(x:money) (it only had a transfer(x:money) proc), but now it does.
There is no malicious intent on the part of any library author (indeed, the library authors don't know each other or the user or the main program that uses those libraries), but a malicious user can cause money to be drawn from the account by triggering an action that needs to draw some money value on the screen.
This example is a bit contrived and colorful (and based on one of Stroustrup's), but not so far-fetched. You could have an execute(x:T) template that does a well defined, well secured thing -- and then another library adds execute(x:string) that shells out and executes a command line. Or -- much harder to catch -- the new proc has the same functionality as the old one it overrides, but is implemented in a different way that overall creates a TOCTOU or race condition or otherwise harms integrity.
I didn't have time to try these in Nim (maybe most cases already have a warning), but I have encountered something similar (not security related, just a plain old bug) with C++ templates, which is just one of the reasons I've avoided C++ for more than a decade.
It should be possible to warn that "import B specializes template from A but they are not related" or something like that, I think - e.g. if both a concrete and a template definition match at a call site, then the template definition must be known at the concrete definition's site or something like that.