nimforum mirror - Tips on how to avoid Nim pointer instability bugs?

JohnLuck [2022-10-25T17:00:07+02:00]

I am building a compiler in Nim, so I am iterating over complex tree structures of ref objects. I keep running into "quantum bug" issues where simply adding 'echo "hello"' on a line will trigger or prevent a crash, or where crashes happen nondeterministically. Given the complex nature of the code, it is already hard to debug, but now it is just impossible (I can't reduce examples, can't print-debug, and step debugging doesn't work on Mac).

I think this is the 4th time I have run into these bugs in the last 2 months, and I spent 8 hours on it today without a fix. To be honest it is super depressing and I am considering abandoning the whole project and Nim altogether. But I do wonder if there is a set of best practices to avoid this sort of thing?

mratsim [2022-10-25T17:22:49+02:00]

We have complex tree structures of ref objects in:

our P2P networking stack with Futures flying left and right.

our internal representation of the Ethereum blockchain

our wire format (based on Merkle Trees)

We have thousands of nodes running 24/7 for weeks to months on end in https://github.com/status-im/nimbus-eth2 and we have never hit that specific kind of heisenbug.

You'd have to give us more. It sounds like you're accessing uninitialized memory?

exelotl [2022-10-25T19:47:54+02:00]

Keep in mind that the "unsafe" features of the language are: ptr, addr and cast (plus certain pragmas such as {.noinit.}, {.cursor.}, {.union.}, {.checks:off.}, etc.). And of course, interfacing with C libraries, and other "I know what I'm doing" things such as explicitly calling GC_ref/GC_unref.

If you and your dependencies aren't doing any of these things then you shouldn't be running into such mysterious crashes. You'll have to give more info.

You could try compiling with --mm:orc -d:useMalloc and use Valgrind to search for memory issues (though I don't have experience with doing this myself).

JohnLuck [2022-10-25T20:19:22+02:00]

I know of one simple pointer instability issue that has not been fixed for years: https://play.nim-lang.org/#ix=4e7m

I am aware of that one so I avoid it. However, in my case I am iterating over ref object graphs and modifying their children, parents or even grandparents, so if the simplest pointer instability issues are ignored, it seems likely that there could be issues with this too.

Moreover, I am not messing with uninitialized memory at all.

I know that these issues do not affect most use cases, I probably just see them a lot because of the things I am working on.

Anyway, having had a moment to clear my head, I think I will ditch ref objects for indexes. It will be a bit of a hassle to refactor and will make the code a bit harder to read, but nim being as flexible as it is I think I might be able to come up with a decent solution.
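A minimal sketch of what "indexes instead of ref objects" could look like, assuming all nodes live in one seq and refer to each other by position. Every name here (NodeId, Node, Tree, addNode, reparent, noNode) is hypothetical and not taken from the actual project:

```nim
# Hypothetical index-based tree: nodes are stored by value in one seq and
# linked through NodeId indices instead of ref objects.
type
  NodeId = distinct int          # index into Tree.nodes

  Node = object
    label: string
    parent: NodeId
    children: seq[NodeId]

  Tree = object
    nodes: seq[Node]             # the single owner of every node

const noNode = NodeId(-1)        # sentinel meaning "no parent"

proc `==`(a, b: NodeId): bool {.borrow.}

proc addNode(t: var Tree; label: string; parent = noNode): NodeId =
  ## Appends a node and returns its index. The seq may reallocate here,
  ## but NodeIds handed out earlier stay valid: they are positions, not addresses.
  result = NodeId(t.nodes.len)
  t.nodes.add Node(label: label, parent: parent)
  if parent != noNode:
    t.nodes[parent.int].children.add result

proc reparent(t: var Tree; child, newParent: NodeId) =
  ## Moves `child` under `newParent`, re-indexing into t.nodes at every step
  ## rather than holding on to a node across a mutation.
  let old = t.nodes[child.int].parent
  if old != noNode:
    let pos = t.nodes[old.int].children.find(child)
    if pos >= 0:
      t.nodes[old.int].children.delete(pos)
  t.nodes[child.int].parent = newParent
  t.nodes[newParent.int].children.add child

var t: Tree
let root = t.addNode("root")
let a = t.addNode("a", root)
let b = t.addNode("b", root)
let c = t.addNode("c", a)
t.reparent(c, b)
doAssert t.nodes[c.int].parent == b
```

The trade-off is the one mentioned above: every access goes through t.nodes[id.int], which is noisier to read, but no variable ever holds a location that a later mutation can invalidate.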

Araq [2022-10-25T20:30:21+02:00]

> I know of one simple pointer instability issue that has not been fixed for years: https://play.nim-lang.org/#ix=26ap

That link doesn't work for me.

I also don't know of any pointer instability bugs. Maybe you use addr into a sequence that can grow or shrink?
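To illustrate the hazard being asked about, here is a small sketch under the assumption that the seq grows after an address into it has been taken. The names (s, before, after, i) are illustrative only, and whether the buffer actually moves depends on the allocator, so no output is claimed:

```nim
# Sketch only: `s`, `before`, `after`, and `i` are illustrative names.
var s = @[1, 2, 3]

# A raw address into the seq's current buffer, kept as a number so that
# nothing here ever dereferences a possibly stale pointer.
let before = cast[uint](addr s[0])

# Growing the seq may move its buffer to a new allocation.
for x in 0 ..< 100_000:
  s.add x

let after = cast[uint](addr s[0])
echo "buffer moved: ", before != after

# An index stays meaningful across growth, because it is resolved
# against the current buffer at the moment of use.
let i = 0
s[i] = 99
doAssert s[0] == 99
```

If the buffer did move, a ptr saved before the loop would point at freed memory, which is the kind of bug that can appear or vanish when unrelated lines are added.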

JohnLuck [2022-10-25T20:35:54+02:00]

Sorry I just updated the link: https://play.nim-lang.org/#ix=4e7A

Take a look, I think it explains it better than I can with words.

japplegame [2022-10-25T20:53:19+02:00]

It's definitely not a "pointer instability" issue and is not an issue at all. This is the expected behavior.

You should not work with raw pointers in this manner.

JohnLuck [2022-10-25T21:02:46+02:00]

The raw pointers are only for showing what happens under the hood. Here is the same example where I removed the raw pointers that confused you: https://play.nim-lang.org/#ix=4e7K

JohnLuck [2022-10-26T07:10:36+02:00]

Here is another pointer stability bug: https://play.nim-lang.org/#ix=4e9w

I guess there is no easy fix for this type of issue in a compiler in general, outside of another indirection or a borrow checker.

japplegame [2022-10-26T07:55:14+02:00]

> The raw pointers are only for showing what happens under the hood.

Oops. I'm sorry, you're right, it's not about raw pointers.

jrfondren [2022-10-27T20:47:59+02:00]

Rust's borrow checker isn't the issue with this example. Consider:


fn force_realloc(vec: &mut Vec<i32>) -> i32 {
    for i in 0 .. 100_000_000 {
        vec.push(i);
    }
    return 1234;
}
fn main() {
    let mut vec = Vec::new();
    vec.push(0);
    println!("{:?}", unsafe { std::mem::transmute::<&i32, usize>(&vec[0]) });
    vec[0] = force_realloc(&mut vec);
    println!("{:?}", unsafe { std::mem::transmute::<&i32, usize>(&vec[0]) });
    println!("{:?}", vec[0]);
    println!("{:?}", vec.len())
}

vec[0]'s address changes. Rust just doesn't consider it until after the function returns.

japplegame [2022-10-27T22:14:07+02:00]

I was talking about the second example: https://play.nim-lang.org/#ix=4e9w

sls1005 [2022-10-28T05:05:18+02:00]

In my opinion, if the use of a pointer is proven not to be unsafe, it's not a problem, because everyone knows that. But if it's something that should have been safe (like var T), then it's a problem, because everyone uses it carelessly, thinking they're on the safe side.

Araq [2022-10-28T08:48:14+02:00]

> But if it's something that should have been safe (like var T), then it's a problem, because everyone uses it carelessly, thinking they're on the safe side.

Correct, the rules should patch the memory-safe subset of Nim.

japplegame [2022-10-28T11:32:29+02:00]

Another example of weird code generating a weird result:

type Foo = ref object
  flag: bool

var test = Foo()

proc setFlag(flag: var bool) =
  test = Foo()
  flag = true

setFlag(test.flag)
assert(test.flag, "how could this happen?")

exelotl [2022-10-28T13:50:38+02:00]

Are there existing issues on github to keep track of these cases? You should definitely create them if not.

DeletedUser [2022-10-28T14:40:21+02:00]

This gives an assertion error on https://play.nim-lang.org/#ix=4ejt, unless you are using the development version. If you change Foo to object instead of ref object, then the assertion passes as expected.

If it persists, I'm guessing it might be a C compiler issue. But I'm not sure.

demotomohiro [2022-10-28T14:52:21+02:00]

Related Github issue: https://github.com/nim-lang/Nim/issues/18683

adrianv [2022-10-30T13:49:31+01:00]

IMHO that's not on the same level as the original problems. You will get the same result in any language that has the concept of var parameters when you are working with a pointer type. The result is as expected. The problem is a side effect in the code flow. Here is the same program in Pascal.


program Hello;
{$mode Delphi}
{$ASSERTIONS ON}

type
    TTest = record
        flag: Boolean;
    end;
    PTest = ^TTest;

var test: PTest;

function newTest(): PTest;
begin
    new(Result);
    Result^.flag := false;
end;

procedure setFlag(var flag: Boolean);
begin
  test:= newTest();
  flag := true;
end;

begin
  writeln ('Hello World');
  test:= newTest();
  setFlag(test^.flag);
  assert(test^.flag, 'how could this happen?');
end.
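
One way to make the side effect visible in the Nim version is to name the object that the var parameter points into. This is only a sketch: old is a hypothetical extra variable, and the rest mirrors the setFlag example earlier in the thread.

```nim
type Foo = ref object
  flag: bool

var test = Foo()

proc setFlag(flag: var bool) =
  test = Foo()   # rebinds the global to a brand-new object
  flag = true    # writes to the location that was passed in

# `old` (hypothetical name) pins down which object the argument lives in
# and keeps that object alive for the duration of the call.
let old = test
setFlag(old.flag)

doAssert old.flag        # the write landed in the first object
doAssert not test.flag   # the replacement object was never touched
```

Written this way, it is explicit that the flag is set on the object that existed at call time, not on the one test refers to afterwards.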

Mirror of forum.nim-lang.org

9549 :: Tips on how to avoid Nim pointer instability bugs?