I've been using ORC ever since it arrived in the stable release channel, and generally have had few problems, but every once in a while I run into some severe bug that causes a crash, usually via SIGSEGV, and usually related to multithreading. Given this is not the default memory management strategy at the moment, it can't be expected to be as stable as refc, but I've run into issues frequently enough to be unsure of it.
Since ORC will become the default memory management strategy in Nim 2.0, I'm wondering how much focus there is on it, and what the consensus is. I know that there are a lot of projects right now that crash under ORC, including Prologue with threads on (I think the same is true for Jester), which already will cause loads of issues when Nim 2.0 is released.
I also find that ORC's cycle collector sometimes seems to need some manual help to be called, and I get better memory usage when manually calling GC_runOrc periodically. In general, I'm pretty clueless as to how it actually behaves, and what I should do to coax it into collecting more reliably. There is very little documentation on its behavior.
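For illustration, here is a minimal sketch of the kind of periodic call I mean; handleOneRequest and the 1000-iteration interval are placeholders, not recommendations:

proc handleOneRequest() =
  discard # placeholder for real work that allocates ref objects

proc serveForever() =
  var iterations = 0
  while true:
    handleOneRequest()
    inc iterations
    if iterations mod 1000 == 0:
      when defined(gcOrc): # only defined when compiling with --mm:orc
        GC_runOrc() # force a cycle-collector pass instead of waiting for the internal threshold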
This post isn't to complain, but as someone who isn't a compiler engineer, I'd like to hear what other people think about the state of ORC and its readiness for use, especially in the context of multithreading and general stability.
I have found that ORC + multithreading do not mix in the way one might expect, imo. The mental model from Java, Go, C#, etc. is not transferable for those who expect it to be. It took me a long time to really internalize that, coming from that background myself.
Possibly you already know all of this but I'll put it here in case others are interested or if maybe some could correct my understanding.
It is not correct or safe to share a ref object between multiple threads. Doing so will result in broken behavior under ARC, even if you manually manage the ref count, since that cannot be made bulletproof: the count is neither atomic nor lockable. Under ORC it is easy to create crashes, since IIRC the cycle check list is thread-local. Marking ref objects as {.acyclic.} will resolve the ORC thread-local cycle check list issue, but you then still need to be careful not to create cycles, and you still cannot share references across threads safely.
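For reference, the pragma goes on the type definition; a minimal sketch (Node is a made-up type):

type
  Node {.acyclic.} = ref object
    # acyclic promises that values of this type never form reference cycles,
    # so ORC does not add them to its (thread-local) cycle candidate list.
    # Keeping that promise is entirely up to the programmer.
    next: Node
    value: int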
Assuming a ref object is only ever used / referenced from one thread and carefully moved across threads, ORC and multithreading can work without issue in my experience. I do not have issues, crashes etc in Mummy in my usage. I was quite careful with how I managed the one or two ref types I have, only using them internally and just manually managing memory for shared resources.
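To make that concrete, here is a minimal sketch (not taken from Mummy, all names made up): ref objects stay private to the thread that created them, while the shared resource lives in manually managed shared memory guarded by a lock.

import std/locks

type
  Shared = object # plain object, no refs inside
    lock: Lock
    counter: int

proc worker(shared: ptr Shared) {.thread.} =
  # Any ref objects created here are owned by this thread only.
  var localState = new(seq[int])
  for i in 0 ..< 1000:
    localState[].add i
    withLock shared.lock:
      inc shared.counter # the shared data is only touched under the lock

proc main() =
  let shared = createShared(Shared) # manually managed shared resource
  initLock shared.lock
  var threads: array[4, Thread[ptr Shared]]
  for t in threads.mitems:
    createThread(t, worker, shared)
  joinThreads(threads)
  echo shared.counter
  deinitLock shared.lock
  freeShared(shared)

main()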
While the care needed with ref objects in a multithreaded setting is different from other languages, the shared heap is still such a huge improvement over refc that I'm very happy with what is now possible.
Making the refcounting atomic isn't that hard (there are people who use an unofficial atomic ARC mode) but the cycle collector is quite tricky to multi-thread.
I'm more interested though in ensuring that plain old locking works with ref types. It's quite unexplored territory but it makes sense: You leave out the atomic instructions for a single refcount field but you protect it and other RCs at the same time with a plain lock. Single threaded case remains fast and the multi-threaded case has a higher chance of actually being correct.
It is not correct or safe to share a ref object between multiple threads
Isn't that what channels are for?
Regarding documentation about ORC, there's the official announcement blog post plus this guest entry.
For more details on the technical and practical side of things, you can also refer to this documentation (Disclaimer: I authored it). As of now, it still applies to Nim for the most part.
The key points of interest about ORC are:
As a consequence of the collector removing outgoing edges (i.e. ref locations) through which reference cycles are possible, one needs to watch out for the following:
type Cyclic = object
  a: ref Cyclic

...

proc `=destroy`(x: var Cyclic) =
  if x.a != nil: # check the type is initialized
    # ^^ this won't work if `x` is destroyed by the cycle collector
    ...
As @guzba mentioned, if one is very careful with ref types and ensures that whole subgraphs are only owned by a single thread at a time, it's possible to use ORC when multi-threading. The threshold is (currently) a non-atomic and unguarded global, however, so performing operations relevant to the cycle collector (copying, sinking, or destroying a ref through which reference cycles are possible) in multiple threads leads to, strictly speaking, undefined behaviour.
It's important to note that there is a long-standing (at least two years old) bug in the Nim compiler's cyclic type detection logic that causes all compound and seq types not explicitly marked with acyclic to be treated as cyclic. In other words, types like ref array[1, int], ref (int, int), and ref seq[int] are currently all considered relevant to the cycle collector.
Finally, since the automatic reference counting used for ARC and ORC is built upon the lifetime-tracking-hook mechanism, it is also affected by bugs and issues with the latter.
It's important to note that there is a long-standing (at least two years old) bug in the Nim compiler's cyclic type detection logic that causes all compound and seq types not explicitly marked with acyclic to be treated as cyclic.
That's news to me. I mean, I've seen a recent bug report about it and I'm working on it but it's not 2 years old. :-)
The threshold is (currently) a non-atomic and unguarded global ...
Good catch. This should be thread local.
That's news to me. I mean, I've seen a recent bug report about it and I'm working on it but it's not 2 years old. :-)
The problematic lines, 395-397 in compiler/types.nim, were introduced by this PR, which was merged on May 12th 2021, almost two years ago.
So, with that out of the way, the way we approach production readiness is:
So, what can help this process, from the point of view of someone that has a lot of Nim code and coders around?
Fortunately, many of these things are well underway, in particular on the tooling front. If I have a wish for 2.2, it's that it happens 3 months after 2.0. There is a trivial way to get there: make time-based releases instead of feature-based releases. If a feature isn't ready for the release, it gets cut from it and that's the end of the story, with no compromises: there will be another release on a predictable date, so the feature gets another chance soon, without having to compromise on quality just to ship it.
It's a bit late, but it does seem like the stdlib's async library will need some TLC to work well with ORC. Futures cause all sorts of chaotic cycles. Ideally the core async library will be ARC compatible itself, which would make it more deterministic.
Also, using ORC on embedded is tricky since the cycle collector doesn't run often enough. There used to be a "debug mode" which ran the cycle collector more often, but I couldn't find it the last time I looked. That may be an important property to tune in the future as well.
Atomic refcounting works, however, because the access to heap memory is protected by atomic operations even when the ref variable goes out of scope, and thus prevents races.
It's very important to note that this only protects against races on the atomic count, not data races in general.
In my opinion, relying on just atomic refs is an easy way to lure developers into a false sense of security, as is seen in Golang. Essentially you have one of: read-only data (atomic shared-ptr wrappers), movable data with single ownership (Isolate[T]), or some sort of locking mechanism around the data in "atomic chunks" (locks).
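For the "movable data with single ownership" option, here is a minimal single-threaded sketch of the std/isolation API (the stdlib spells the type Isolated[T]; Payload is a made-up type and no actual thread hand-off is shown):

import std/isolation

type Payload = ref object
  data: seq[int]

# A freshly constructed value has no other aliases, so isolate() accepts it
# and wraps it as the sole reference to the object.
var iso: Isolated[Payload] = isolate(Payload(data: @[1, 2, 3]))

# The receiving thread would call extract to take back unique ownership;
# here it happens in the same thread just to show the API.
let payload = extract iso
echo payload.data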
On that topic, @arnetheduck, is Chronos non-cyclic / ARC capable now? I recall reading that it was, but I would be curious if there have been updates in that world, especially since it looks like there was some big refactoring in Chronos recently.
chronos has been acyclic for a good while for memory efficiency reasons - ie even with refc, it is a lot better to avoid cycles... as to arc, we don't test that specifically and I don't think we ever will.
I imagine that in some future when orc is (more) stable we'll start using / supporting that, but ARC looks like a niche compromise that will never work quite well because developers will keep shooting themselves in the foot with it and then complain it hurts.
Thanks! To be clear I don’t think ARC should be officially supported. Rather as you mention avoiding cycles is more efficient.
Though I do plan to try and use Chronos with ARC on embedded someday. That’s a niche case though and requires devs to design carefully.
Hmmm, it possibly could be useful in audio or robotics where real-time networking matters.
It's a bit late, but it does seem like the stdlib's async library will need some TLC to work well with ORC. Futures cause all sorts of chaotic cycles. Ideally the core async library will be ARC compatible itself, which would make it more deterministic.
I imagine in the coming months that there will be quite a reckoning when it comes to projects depending on asyncdispatch. I haven't tested them on 2.0.0 yet, but last time I checked, things running on Jester and Prologue crash under ORC. We shall see. I'm hopeful that the bugs get ironed out quickly now that it's the default, though.
@Araq I think Jester has fixed this issue since last time I checked on it, which is good. Prologue still has issues with ORC and threads, and they all still have leaking issues with asyncdispatch (asyncnet?), but I think things are looking better than they were before this release.
Jester doesn't leak because it avoids using asyncdispatch at all if it can (and any allocation that it can avoid for that matter). I'll withhold further judgements until I get the chance to re-test things on Nim 2.0.0, since it looks like things may be more stable than they used to be.