Hi all!
So - I always wondered whether I really understood memory management in Nim. I had some surprising moments in the past and was never sure how to access variables when using threads, and I probably wrote a lot of bad code or used unnecessary structures. The documentation said that each Nim thread has its own heap, but I usually had no idea how to handle each heap or how to create workarounds. The new memory model "arc/orc" didn't simplify this either: it is just documented to work differently and faster, with a shared heap, but trying to use arc/orc sometimes resulted in strange crashes that I didn't have before... So I created some sample code - just to try things out - and I wanted to share it with you.
Just some early observations - correct me if I'm wrong:
And here is the sample code
import std/os
import std/strformat

type
  TestObj* = object
    name*: string

proc getAddr(p: TestObj | pointer): string =
  let intVal = cast[uint64](p.unsafeaddr)
  return $intVal # 0x & $(intVal.toHex()) causes early crash

proc getAddr(p: ref TestObj | ptr TestObj): string =
  let intVal = cast[uint64](p[].unsafeaddr)
  return $intVal

proc `=destroy`*(x: var TestObj) =
  echo fmt"destroying '{x.name}' {getAddr(x.unsafeaddr)} ({getAddr(x)})"

proc newObj(name: string): ref TestObj =
  new result
  result.name = name
  echo fmt"creating new obj '{result.name}' {getAddr(result)} ({getAddr(result[])})"

when compileOption("threads"):
  proc threadFunc[T](someObj: T) =
    echo fmt"starting thread with '{someObj.name}' "
    sleep(1_000)
    echo fmt"Accessing '{someObj.name}' {getAddr(someObj)} from thread"

proc main() =
  echo "x will not be deleted and can be accessed safely as pointer"
  echo "y is a ref and could be traced"
  # echo "z is referenced as ptr and access is therefore dangerous"
  echo ""
  let x = newObj("x")
  GC_ref(x) # avoid deletion of x
  let y = newObj("y")
  # let z = newObj("z")
  when compileOption("threads"):
    var thread1: Thread[ptr TestObj]
    var thread2: Thread[ref TestObj]
    var thread3: Thread[ptr TestObj]
    createThread(thread1, threadFunc[ptr TestObj], x[].addr)
    createThread(thread2, threadFunc[ref TestObj], y)
    # createThread(thread3, threadFunc[ptr TestObj], z[].addr)
  sleep(200)
  echo "end of main"
  # GC_unref(x)

when isMainModule:
  main()
  echo "all scope objects destroyed"
  GC_fullCollect()
  sleep(3_000) # wait for threads to finish - not using thread join
Output of nim c -r -d:release --threads:on --gc:orc .\gcref.nim (causes an access violation on Linux):
x will not be deleted and can be accessed safely as pointer
y is a ref and could be traced
creating new obj 'x' 10485840 (6552720)
creating new obj 'y' 10485872 (6552720)
starting thread with 'x'
starting thread with 'y'
end of main
destroying 'y' 10485872 (6552736)
all scope objects destroyed
Accessing 'x' 10485840 from thread
Accessing '' 10485872 from thread
Output of nim c -r -d:release --threads:on .\gcref.nim (no access violation, as the referenced object is different):
x will not be deleted and can be accessed safely as pointer
y is a ref and could be traced
creating new obj 'x' 10481744 (6552464)
creating new obj 'y' 10481776 (6552464)
starting thread with 'x'
starting thread with 'y'
end of main
all scope objects destroyed
destroying 'y' 10481776 (6552656)
Accessing 'y' 17367120 from thread
Accessing 'x' 10481744 from thread
The rule is the same as in other multithreaded languages without a multithreading-aware GC:
Ensure the lifetime of whatever you access.
From this you can derive a couple of rules:
Now, the old and the new memory management (deferred refcounting vs ARC, automatic refcounting) do not change those access principles, but ARC significantly improves sharing.
In many cases it is significantly more efficient, maintainable and debuggable to have a unique owner of an object or piece of data and to dispatch all transformation steps of this object to several procs or services. Those may or may not be on separate threads. This is called a producer/consumer architecture (or actor model in some cases, or microservices if done at whole-machine scale for some reason). The main advantages are:
However, there is no longer a notion of an ancestor thread that can wait for its child thread to stop processing.
What's the issue? You need to pass the data from one thread to the other. Actually you don't have to copy it; you can be much faster by passing "ownership", i.e. the pointer (handle) to the data. Still, if there is GC-ed memory behind it, it needs to be collected once you are done, and no ancestor thread can be relied on. Hence, in the old memory management model:
This is not the case anymore with ARC (or Boehm, or if the data doesn't use GC-ed memory).
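For illustration, a minimal sketch of such an ownership hand-off (hypothetical names, compiled with --threads:on): the payload is allocated on the shared heap, its pointer is sent through a channel, and the receiving thread frees it once it is done, so no ancestor thread has to collect anything.

type Msg = object
  payload: int

var chan: Channel[ptr Msg]    # a plain pointer: no GC-ed memory crosses threads

proc consumer() {.thread.} =
  let m = chan.recv()         # ownership is transferred together with the pointer
  echo "consumer got ", m.payload
  freeShared(m)               # the consumer, not the producer, frees it

proc main() =
  var th: Thread[void]
  chan.open()
  createThread(th, consumer)
  let m = createShared(Msg)   # allocated on the shared heap
  m.payload = 42
  chan.send(m)                # the producer gives up ownership here
  joinThread(th)
  chan.close()

main()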
there is a guarantee that the multithreaded section ends before the ref object is collected by the ancestor thread.
I assumed that too, but this doesn't seem to be correct when using arc/orc. The thread isn't preventing the ref from being collected. My application crashes when I use arc/orc. Currently, it seems that I need to manually prevent the ref from being collected.
Ideally, createThread would also pass the ownership somehow to the new thread, so the ref doesn't get deleted if the thread survives.
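A minimal sketch of the manual workaround I mean (hypothetical names, compiled with --threads:on --gc:orc; this assumes GC_ref/GC_unref simply bump the reference count under arc/orc):

type Conf = ref object
  name: string

proc worker(c: Conf) {.thread.} =
  echo "worker sees ", c.name
  GC_unref(c)                  # give back the extra count taken before createThread

proc main() =
  var th: Thread[Conf]
  let c = Conf(name: "shared cfg")
  GC_ref(c)                    # extra count, so the thread's copy stays valid
  createThread(th, worker, c)
  joinThread(th)               # ARC refcounts are not atomic, so this is only
                               # safe because main joins before dropping its
                               # own reference

main()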
Use channels, not pointers
The thread isn't preventing the ref from being collected. My application crashes when I use arc/orc.
Use channels + ref object or atomic refcounting, not ARC.
Use channels, not pointers
This isn't about feeding incoming data to threads. I agree that new data that is relevant for multithreading should be passed via channels. But this is, for example, about a general cfg that is read from a file before any thread starts and which doesn't change - or actually the channel itself as a ptr. Even the doc recommends passing Channels as ptr to share them between threads: "Channels cannot be passed between threads. Use globals or pass them by ptr."
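For reference, a minimal sketch of what that doc note describes (hypothetical names, compiled with --threads:on): the Channel lives in a global, and the worker only gets its address:

var chan: Channel[string]

proc worker(c: ptr Channel[string]) {.thread.} =
  echo "worker received: ", c[].recv()     # access only through the ptr param

proc main() =
  var th: Thread[ptr Channel[string]]
  chan.open()
  createThread(th, worker, chan.addr)      # share the channel by address
  chan.send("cfg loaded before any thread started")
  joinThread(th)
  chan.close()

main()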
Use channels + ref object or atomic refcounting, not ARC
So - you are telling me not to use arc when using threads? I thought that arc could become the next default GC.
Would it make sense to create some ref-counted datatype that can be shared more easily between threads, similar to shared_ptr in C++? Not sure if this is a good idea, as even shared_ptrs in C++ have their traps that require using atomic_shared_ptr in some situations. Just thinking out loud.
But this is, for example, about a general cfg that is read from a file before any thread starts and which doesn't change
In that case it is indeed fine: the address is that of a global, which is guaranteed to survive until the whole program exits.
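Concretely, something along these lines (hypothetical names; the config holds no GC-ed memory and is filled before any thread is spawned):

type Config = object
  logLevel: int
  maxConns: int

var cfg: Config                             # a global outlives every thread

proc worker(c: ptr Config) {.thread.} =
  echo "worker sees maxConns = ", c.maxConns

proc main() =
  cfg = Config(logLevel: 2, maxConns: 64)   # set up before spawning anything
  var th: Thread[ptr Config]
  createThread(th, worker, cfg.addr)
  joinThread(th)

main()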
So - you are telling me not to use arc when using threads? I thought that arc could become the next default GC.
No. ARC (ref object) solves sharing memory so that it can be collected by any thread. When you have shared ownership, you need an atomic reference counter, or guarantees by design that the object won't be collected (the ancestor threads are the owners and are responsible for creation and deletion), or fancier memory management techniques like hazard pointers, epoch-based reclamation or quiescent-state-based reclamation.
ARC (ref object) is still correct for any object that does not have joint ownership between 2 or more threads.
Would it make sense to create some ref-counted datatype that can be shared more easily between threads, similar to shared_ptr in C++? Not sure if this is a good idea, as even shared_ptrs in C++ have their traps that require using atomic_shared_ptr in some situations. Just thinking out loud.
C++ shared pointers have the same issue as ARC: they are not threadsafe and are only usable when, at any point in time, only a single thread can access and mutate those objects, which is guaranteed if you use a Communicating Sequential Processes architecture (aka producer/consumer or channel-based architecture).
A threadsafe shared smartpointer is mentioned in my very first reply.
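To make that concrete, here is a rough, untested sketch of what an atomically refcounted shared pointer could look like (hypothetical SharedRc type, not the implementation referenced above): the count lives next to the value on the shared heap and is updated via std/atomics, so any thread may safely drop the last reference.

import std/atomics

type
  SharedCell[T] = object
    rc: Atomic[int]
    value: T

  SharedRc*[T] = object
    cell: ptr SharedCell[T]

proc newSharedRc*[T](value: sink T): SharedRc[T] =
  ## Allocate the cell on the shared heap; the creator holds count 1.
  result.cell = createShared(SharedCell[T])
  result.cell.rc.store(1)
  result.cell.value = value

proc `=destroy`*[T](s: var SharedRc[T]) =
  if s.cell != nil:
    # Whoever drops the count from 1 to 0 frees the cell, on whatever thread.
    if s.cell.rc.fetchSub(1, moAcquireRelease) == 1:
      `=destroy`(s.cell.value)
      freeShared(s.cell)

proc `=copy`*[T](dst: var SharedRc[T], src: SharedRc[T]) =
  if dst.cell == src.cell: return
  if src.cell != nil:
    discard src.cell.rc.fetchAdd(1, moRelaxed)
  `=destroy`(dst)
  dst.cell = src.cell

proc get*[T](s: SharedRc[T]): T =
  s.cell.value

when isMainModule:
  let a = newSharedRc("hello")
  let b = a          # the count is bumped atomically
  echo b.get()       # the last owner frees the cell

Note that this only makes the ownership bookkeeping threadsafe; mutating the wrapped value from several threads still needs its own synchronization, exactly as with C++ shared_ptr.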