Nim has a new garbage collector called Orc (enabled with --gc:orc). It’s a reference counting mechanism with cycle detection. The most important feature of --gc:orc is much better support for threads, achieved by sharing the heap between them.
Now you can just pass deeply nested ref objects between threads and it all works. My threading needs are pretty pedestrian. I basically have a work queue with several worker threads and I need work done. I need to pass large nested objects to the workers, and the workers produce large nested data back. The old way to do that is with channels, but channels copy their data. Copying data can actually be better and faster with “share nothing” concurrency, but it’s really bad for my use case of passing around large nested structures. Another way was to use pointers, but then I was basically writing C with manual allocations and deallocations, not Nim! This is why the new --gc:orc works so much better for me.
You still need to use and understand locks. But it’s not that bad. I just use two locks, one for the input queue and one for the output queue. The threads acquire and release them quickly, holding each lock for as short a time as possible. No thread holds more than one lock at a time.
See my threaded work example here: https://gist.github.com/treeform/3e8c3be53b2999d709dadc2bc2b4e097 (Feedback on how to make it better welcome.)
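Independent of the gist above, here is a minimal sketch of the two-lock layout described in this post. The `Job` type, the queue names and the worker count are all made up for illustration, and the build flags in the first comment are an assumption about the setup. Each worker takes one lock at a time, holds it only for the queue operation, and does the actual work outside any lock.

```nim
# Build (assumed flags): nim c --threads:on --gc:orc workqueue.nim
import std/locks

type
  Job = ref object            # hypothetical nested payload
    id: int
    parts: seq[string]

var
  inLock, outLock: Lock
  inQueue, outQueue: seq[Job] # reachable from all threads via the shared heap

proc worker() {.thread.} =
  while true:
    var job: Job
    # Globals holding GC'd memory need an explicit gcsafe override.
    {.cast(gcsafe).}:
      withLock inLock:
        if inQueue.len > 0:
          job = inQueue.pop() # take ownership while holding the lock
    if job == nil:
      break                   # input queue drained, worker exits
    job.parts.add "processed" # work happens outside any lock
    {.cast(gcsafe).}:
      withLock outLock:
        outQueue.add move(job) # hand ownership back, keep no local ref

initLock inLock
initLock outLock
for i in 0 ..< 8:
  inQueue.add Job(id: i, parts: @["raw"])

var workers: array[4, Thread[void]]
for w in workers.mitems:
  createThread(w, worker)
joinThreads workers
```

To keep the sketch short, the queue is filled before the workers start, so an empty queue means "done". A long-running version would need a separate shutdown signal.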
Before, creating objects and passing them between threads was a big issue. The default garbage collector (--gc:refc) gives each thread its own heap. With the old model, objects allocated on one thread had to be deallocated on the same thread. This restriction is gone now!
Another big difference is that it’s more deterministic and supports destructors. The compiler can also infer where the frees will happen and optimize away many allocations and deallocations with move semantics (similar to Rust). Sadly it can’t optimize all of them away, which is why reference counting exists. The cycle detector will also try to find garbage cycles and free them.
This means I do not have to change the way I write code. I don’t have to mark my code in any special way and I don’t really have to worry about cycles. The new Orc GC is simply better.
This makes the new garbage collector (--gc:orc) a joy to use.
(If there are any factual errors about the GC let me know.)
Cool! I'm about to embark on multithreading, now that I've gotten my async networking code working on a single thread. Trying to switch over to gc:orc but having a few problems.
Now you can just pass deeply nested ref objects between threads and it all works.
Is it really that simple? Because as @araq has stated, ARC's retain/release are not atomic. That implies to me that a ref object can never be used concurrently on multiple threads.
So I think by "pass" you mean "move" — the way you've described your code, it sounds like the work queues need to use move semantics, so the "push" operation takes an object as a sink parameter. Is that accurate?
And furthermore, the object you're moving to another thread can't refer to any other objects that have references on the current thread ... this sounds like something that could accidentally lead to difficult-to-discover race conditions if one is not careful!
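For concreteness, the kind of sink-parameter push being asked about could look like the sketch below. `WorkQueue`, `Node` and `push` are hypothetical names for illustration, not taken from the gist, and the sketch is single-threaded on purpose: it only shows the ownership handover, not the locking.

```nim
# Assumed build: nim c --gc:orc sinkpush.nim
type
  Node = ref object           # hypothetical nested payload
    value: int
    next: Node

  WorkQueue = object          # hypothetical queue type
    items: seq[Node]

proc push(q: var WorkQueue, item: sink Node) =
  # `item` is a sink parameter and this is its last use,
  # so the reference can be moved into the seq instead of copied.
  q.items.add item

var q: WorkQueue
var n = Node(value: 1, next: Node(value: 2))
q.push(move n)                # explicit move: the caller gives up `n`

doAssert n == nil             # the sender no longer holds the object
doAssert q.items[0].next.value == 2
```

Note that `move` only clears the one variable it is applied to. Any other reference to the node or its sub-objects that is still held on the sending thread is untouched, which is exactly the hazard raised above.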
That is correct but we'll have an isIsolated runtime check for that. There is also a plan for ensuring this at compile-time via an isolated: block. Seems entirely within reach thanks to Nim's effect system.
ARC is definitely here to stay, but the whole story hasn't been told yet.
That is correct but we'll have an isIsolated runtime check for that. There is also a plan for ensuring this at compile-time via an isolated: block. Seems entirely within reach thanks to Nim's effect system.
Will isIsolated support backpointers? A strict iso object contains only iso pointers ("owned" references). But then only very limited objects can be built. E.g. a doubly linked list would fail (a specific problem/flaw within Rust). A DAG would be impossible as well. However, with backpointers (an object can have only one backptr, so a backptr is "unique" by definition) it would be possible to reach any part of a list or a tree, up to the root.
A strict iso object doesn't require refcounting (there is nothing to count...).
That said, I am still pondering Pony's capabilities and their "viewpoint adaptation". IMHO, they have to introduce an isv capability, an iso-visitor, that overcomes the limitations of iso. If an iso gets sent to another thread, the companion isv gets destroyed. This basically unlinks any connection of the iso with the sending thread. The receiving thread has to (re)build its own isv, if needed. To make this safe, the thread ID should be part of the iso. An isv establishing a new link to the iso checks the "foreign" ID and sets the new thread ID. After that, the iso can be used in any way in the new owning thread.
iso objects can be created, sent, and destroyed easily. In particular, this might be interesting for time-critical applications with extremely limited resources, e.g. on the microcontroller level.
Yes, I "move" the ownership of objects from the main thread to the worker threads, and I "move" the ownership back from the worker thread to the main thread. Once an object moves away from the thread it was created on, I will not touch the object or its internal sub-objects on that thread. Nim does not have any notion of ownership, so I have to do that manually.
I don't expect two threads reading and writing the same objects without locking to work.
Does that make sense? Please check out the code.
Will the new isolated: block help with making sure I don't touch objects or sub-objects without locking them first?
How will the iso property be checked dynamically? Well, the object's refcounts will be checked; they all have to be zero (or one, depending on how we count). Now, the owned ref comes into play.
Nah. I mean, you could do it this way, but there is a much better way: You traverse the subgraph. In doing so you count the edges (= E) and sum the RC fields (= S). A graph is sendable to a different thread if and only if S = E + 1.
You traverse the subgraph. In doing so you count the edges (= E) and sum the RC fields (= S). A graph is isolated (sendable to a different thread) if and only if S = E + 1.
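To make the S = E + 1 rule concrete, here is a toy model of the traversal. It does not touch Nim's real object headers: `Obj`, its explicit `rc` field, `link` and `isIsolatedSketch` are all hypothetical stand-ins, and the root starts with `rc = 1` to represent the single reference that will be sent.

```nim
import std/sets

type
  Obj = ref object            # toy model, not Nim's real object layout
    rc: int                   # how many references point at this object
    kids: seq[Obj]            # outgoing edges

proc link(parent, child: Obj) =
  ## Adds an edge inside the graph and counts it on the target.
  parent.kids.add child
  inc child.rc

proc isIsolatedSketch(root: Obj): bool =
  var seen = initHashSet[pointer]()
  var stack = @[root]
  var edges = 0               # E: edges found while traversing
  var rcSum = 0               # S: sum of the RC fields of visited objects
  seen.incl cast[pointer](root)
  while stack.len > 0:
    let cur = stack.pop()
    rcSum += cur.rc
    for k in cur.kids:
      inc edges
      if not seen.containsOrIncl(cast[pointer](k)):
        stack.add k
  # Every count is explained by an internal edge, plus the one root reference.
  result = rcSum == edges + 1

let root = Obj(rc: 1)         # the single reference we intend to send
let a = Obj()
let b = Obj()
link(root, a)
link(root, b)
link(a, b)                    # shared child: a DAG, not a tree
doAssert isIsolatedSketch(root)      # S = 1 + 1 + 2 = 4, E = 3

inc b.rc                      # simulate a reference to b held outside the graph
doAssert not isIsolatedSketch(root)  # S = 5, E = 3
```

Any reference coming from outside the subgraph raises S without raising E, so the equality fails. That also covers the case discussed next, where the outside reference belongs to a dead cycle that has not been collected yet.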
Clever! But what if there's an orphaned object cycle elsewhere that has a reference to an object in the subgraph? I've read that cycles are only cleaned up once in a while, so there's a time window where a dead cycle could still exist at the same time that I'm trying to make a cross-process call.
I suppose when you detect a non-isolated subgraph, you could first force a cycle collection and then retry the isolation check, to see if it was a false positive. Is that the plan?
Is that the plan?
Almost. We need to run the cycle collector before a send already to ensure the thread local "cycle candidates" list is empty (note that it is always empty after a cycle collection), otherwise it would interfere with multi-threading. Alternatively we can make the list global and protected via a lock (or implement it as lockfree queue...).
There are many other options, we can also restrict the sending to .acyclic types.
Or we require that "orphaned objects" that would be misdetected as external roots be cleaned up manually. That is a good idea anyhow, as it ensures the programmer is still aware of the topology:
proc process(x: Node) =
  use(x.left)
  # likely invalid:
  spawn process(x.right)
  # better: extract it
  spawn process(move x.right)