"Somebody" should implement this paper:
https://dl.acm.org/doi/pdf/10.1145/3299706.3210569 or maybe this one: https://dl.acm.org/doi/pdf/10.1145/3453483.3454060
FRC is super interesting as it has direct support for Nim's var x {.cursor.}: ref T idea, making these memory safe as well 🙂. FRC can also be extended with a cycle collector component, which would give us mm:sharedOrc. For a concurrent cycle collector the best blueprint can still be found in https://pages.cs.wisc.edu/~cymen/misc/interests/Bacon01Concurrent.pdf
Our team of core developers does not have time to work on any of this, but the algorithms can be implemented in Nim "userland" code via custom hooks on top of Nim's unsafe ptr T construct.
Once a library solution has reached maturity, we can port it to Nim's core, giving us mm:sharedOrc. We should elaborate in the spec that ref can have "delayed" semantics; in other words, ref destructors can be scheduled "later". This solves many problems at the same time: recursive destructors that overflow the stack are effectively prevented, performance improves, and in the real world nobody has a linked list of File objects that need deterministic destruction. Deterministic destruction is awesome for seq, which can be large, and for File/Channel/Sockets, but for ref it's neither necessary nor desirable. YMMV, of course.
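To make "userland code via custom hooks on top of ptr T" concrete, here is a minimal sketch of an atomically reference-counted shared pointer built with =destroy/=copy hooks. All names (SharedRef, Payload, newSharedRef, get) are made up for illustration; a real implementation of the papers above would add biasing/deferral on top of this skeleton.

```nim
# Sketch only: a plain atomic-RC shared pointer in userland Nim,
# built on ptr T plus custom hooks. Names are illustrative.
import std/atomics

type
  Payload[T] = object
    rc: Atomic[int]     # shared reference count
    value: T

  SharedRef*[T] = object
    p: ptr Payload[T]

proc `=destroy`*[T](s: SharedRef[T]) =
  if s.p != nil:
    # fetchSub returns the previous value; the last owner frees.
    if fetchSub(s.p.rc, 1, moAcquireRelease) == 1:
      `=destroy`(s.p.value)
      deallocShared(s.p)

proc `=copy`*[T](dest: var SharedRef[T], src: SharedRef[T]) =
  if dest.p == src.p: return
  if src.p != nil:
    discard fetchAdd(src.p.rc, 1, moRelaxed)
  `=destroy`(dest)
  dest.p = src.p

proc newSharedRef*[T](value: sink T): SharedRef[T] =
  result.p = cast[ptr Payload[T]](allocShared0(sizeof(Payload[T])))
  store(result.p.rc, 1, moRelaxed)
  result.p.value = value

proc get*[T](s: SharedRef[T]): T =
  s.p.value
```

This is the naive baseline the FRC paper improves on (by biasing counts to an owning thread); the point here is only that the whole mechanism is expressible with ordinary =hooks, no compiler support needed.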
By that, do you mean =hooks on an object, like shared pointers, or just using raw pointers by themselves?
Ok, true, but =hooks are easy to do.
I've been reading the FRC paper. Definitely intriguing, although it starts getting pretty complicated. The built-in concurrent read support, possibly combined with {.cursor.}, seems awesome. Seems like a good fit for a shared-orc design overall.
"Needed" is a strong word, it is not "needed" but it fills a void. arc : atomicArc corresponds to orc : sharedOrc.
Practically speaking with sharedOrc we could remove the concept of "isolation" and "gc safety" from Nim. I know you all like "isolation" but I find it lacking and it's hardly used out there, afaict.
Could you (or someone else) expand on that - and that'd be a great blog post as well - on what/why isolation is lacking? What problem do --mm:atomicArc / --mm:sharedOrc solve, and what do they cost at run-time?
Imagine a webservice that reads a .cfg file at startup. The configuration is readonly after startup and contains a JsonNode part. Can you send an "isolated" configuration to your threads? No, it's shared. You can send a copy, sure. But that is not as convenient and it fails to work when you have lots of data (graphic resources? large language models?). And even if the data is isolated maybe it contains a ref to a configuration section that you want to share? So it's "partially isolated" where the isolated parts are written to and the non-isolated parts are only read from.
All these scenarios suddenly start to work with mm:atomicArc and are somewhat painful with mm:arc/orc. But with atomicArc you don't know if you introduced cycles, so people request tooling support to find cycles. Fair enough, but with sharedOrc avoiding cycles becomes a performance improvement rather than a memory leak bugfix.
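The config scenario could look like this under a hypothetical --mm:atomicArc. This is a sketch only: the Config type and worker setup are made up, and today's compiler modes reject passing a ref to a thread like this without isolation or copying.

```nim
# Sketch only: assumes a hypothetical --mm:atomicArc where copying a
# `ref` across threads bumps an atomic refcount. `Config` and the
# worker setup are made up for illustration; this does not compile
# under current memory management modes.
import std/json

type Config = ref object
  port: int
  extra: JsonNode      # nested ref data, read-only after startup

proc worker(cfg: Config) {.thread.} =
  # Threads only read the shared configuration; no copy, no isolate.
  echo "worker sees port ", cfg.port, " and ", cfg.extra["name"].getStr

proc main =
  let cfg = Config(port: 8080, extra: %*{"name": "demo"})
  var threads: array[4, Thread[Config]]
  for t in threads.mitems:
    createThread(t, worker, cfg)  # fine with atomic RC; rejected today
  joinThreads(threads)

main()
```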
In addition to what araq wrote, consider the following scenarios:
You get user input via a GUI widget and want to send it to another thread for some computation. You are not the code that generated the data, so you cannot isolate what comes out of the widget; for all you know, the widget is still retaining a reference. That is, unless you copy. At which point, why are you using channels etc. that require you to isolate, when you can just copy in the first place?
Therefore, in all scenarios where the data you need to send comes from something you do not control, isolation won't help you. And those scenarios occur a lot.
Further, even if you control the data, the ergonomics are pretty rough. You can't do a computation with the data first for something unrelated and then isolate it for sending across threads (at least I couldn't figure that one out), which is also a scenario you run into quite often.
Lastly, isolation doesn't stay contained. Isolate is fundamentally something that only matters for the technical problem domain of pushing data across threads. But given the way you need to use it at the spawn site of whatever data you want to send, you now have it littered all throughout your codebase, and you can't really encapsulate it in a single module that cares about cross-thread data transfer, because the isolation type must be everywhere and you need to think about it everywhere you use that data.
These three reasons were ultimately why my multithreading experiments will just rely on copying. It's just simpler and less of a headache.
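For readers who haven't used it: here is a minimal sketch of std/isolation (a real module) showing both the working case and the kind of rejection described above. The Msg type and the "widget" framing are made up for illustration.

```nim
# Sketch using the real std/isolation module; `Msg` is illustrative.
import std/isolation

type Msg = ref object
  data: string

# OK: constructing the value in place is provably unique, so it isolates.
var iso = isolate(Msg(data: "fresh"))

var m = Msg(data: "from widget")
# If `m` is still in use afterwards (as data coming out of a widget
# typically is), the compiler rejects isolating it, because `m` might
# be aliased:
#   var iso2 = isolate(m)   # error: expression cannot be isolated
echo m.data

let got = extract(iso)     # take the value back out on the other side
echo got.data
```

The commented-out line is exactly the pain point above: data you merely received, rather than constructed, usually cannot be proven unique.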
Well, it depends on the overhead they introduce and how good the tooling (sanitizers, custom annotations for static tools?) becomes at detecting data races.
Swift, for example, started with what is basically mm:atomicArc and later introduced isolation too, to prevent data races at compile time: https://www.swift.org/blog/swift-5.10-released/#data-race-safety-in-swift-510