Sigils now has much better threading support! There's also support for std/async. Chronos support would be trivial to add as well.
Note: there are still some rough edges and a couple of known gotchas. They should be rare, though, and there are ways to avoid them.
Below is a complete example of combining async with slots and signals.
import std/[unittest, asyncdispatch, times, strutils]
import sigils
import sigils/threadAsyncs

type
  SomeAction* = ref object of Agent
    value: int
  Counter* = ref object of Agent
    value: int

proc valueChanged*(tp: SomeAction, val: int) {.signal.}
proc updated*(tp: Counter, final: int) {.signal.}

## -------------------------------------------------------- ##

let start = epochTime()

proc ticker(self: Counter) {.async.} =
  ## This simple procedure will echo out "tick" three times with 100ms
  ## between each tick. We use it to visualise the time between other
  ## procedures.
  for i in 1..3:
    await sleepAsync(100)
    echo "tick ",
      i*100, "ms ",
      split($((epochTime() - start)*1000), '.')[0], "ms (real)"
  emit self.updated(epochTime().toInt())

proc setValue*(self: Counter, value: int) {.slot.} =
  echo "setValue! ", value, " (th:", getThreadId(), ")"
  if self.value != value:
    self.value = value
    asyncCheck ticker(self)

proc completed*(self: SomeAction, final: int) {.slot.} =
  echo "Action done! final: ", final, " (th:", getThreadId(), ")"
  self.value = final

proc value*(self: Counter): int =
  self.value

suite "threaded agent slots":
  teardown:
    GC_fullCollect()

  test "sigil object thread runner":
    var
      a = SomeAction.new()
      b = Counter.new()
    echo "thread runner!", " (th:", getThreadId(), ")"
    echo "obj a: ", a.unsafeWeakRef

    let thread = newSigilAsyncThread()
    thread.start()
    startLocalThread()

    let bp: AgentProxy[Counter] = b.moveToThread(thread)

    connect(a, valueChanged, bp, setValue)
    connect(bp, updated, a, SomeAction.completed())

    emit a.valueChanged(314)

    let ct = getCurrentSigilThread()
    ct.poll()
Can we draw any conclusions from this about what Nim 3 could do better with multi-threading and concurrency?
Definitely. There have been some themes which keep coming up that result in rough edge cases. I'm not sure how best to phrase them, but I'll try to elucidate some thoughts.
One general theme I'm finding is needing better ways to control "borrows". Both lent and cursor are very helpful in this regard, but their semantics are a bit unclear to me and depend on the Nim version(s). They also don't work for arguments, e.g. with var args.
If you could protect a ref (or an object containing refs), you could build something akin to Rust's Arc<RwLock>. It's possible to make that now, except that references can be "leaked" too readily. So there's no real way to ensure or enforce something like this at compile time:
proc example(value: var Bar, sharedData: SharedPtr[RwLock[Foo]]) =
  withShared sharedData as foo:
    value.doSomething(foo) # foo can be "captured", resulting in possible errant RCs
The problem is that doSomething can make a copy of Foo (if it's a ref object) and store it in value or in a global. Though even if the compiler could statically prevent new refs from being made, that might preclude a lot of libraries from being used. Still, being able to mark proc arguments as "does not capture refs" would help.
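To make that escape concrete, here is a minimal sketch; Foo, Bar, and doSomething are hypothetical stand-ins from the example above, not Sigils or stdlib APIs:

```nim
type
  Foo = ref object        # hypothetical shared data type
    data: int
  Bar = object
    stash: Foo            # somewhere the callee can squirrel the ref away

var escaped: Foo          # a global works just as well for leaking

proc doSomething(value: var Bar, foo: Foo) =
  # Perfectly legal Nim: copying the ref bumps its RC, and the new
  # references outlive whatever lock was guarding `foo`.
  value.stash = foo
  escaped = foo

var
  bar = Bar()
  foo = Foo(data: 1)
doSomething(bar, foo)     # imagine `foo` came from a withShared block
assert bar.stash == foo   # the ref escaped into `bar`
assert escaped == foo     # ... and into a global
```

Nothing here is unsafe by Nim's rules; that is exactly the point made above: there is currently no way to forbid it at the call site.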
Hmmm, one thought I just had: this could be done with a runtime isIsolated check, which would essentially run an ORC cycle-detection pass on the Foo object and check that it's still self-contained with one unique pointer. I saw something to that effect recently in the threading library, regarding running an ORC cycle check in send, IIRC.
I now have a design that is better than Malebolgia (I think...) but as usual I'm clueless about concurrency.
Nice! It'd be cool to see what you have in mind.
I was also thinking recently about the Bacon paper and the other paper you mentioned on threaded cycle collection (I forget the name). That second paper was complicated and required shared refs alongside local refs, but I had a good thought on how to implement the algorithm more naturally in Nim by building on how ARC is designed. At some point I'll have to try implementing it.
BTW, what drove my need for Sigils, rather than using Malebolgia, taskpools, or even async, is the need to handle the kinds of events used in GUIs. Each GUI platform serves user events via platform-specific APIs. Unfortunately, those don't integrate well with the (poll/select) kernel APIs needed for async interop. That's partly why Figuro went with signals/slots rather than async.
Likewise, structured concurrency like in Malebolgia or Taskpools is great for computational parallelism, but it doesn't integrate well with IO- or event-based concurrency unless you can build the entire program around it.
The signal/slot paradigm actually ended up interfacing rather nicely (IMO) with async running on another thread. You can't make an async slot, but that's intentional. Slots are marked as {.nimcall.} to avoid closure allocations in the UI, since closures == allocations == slow UIs or lots of RAM.
> Slots are marked as {.nimcall.} to avoid closure allocations in the UI since closures == allocations == slow UIs or lots of RAM.
Nah, a UI's speed in 2024 will be good even with the oh so terrible "closure allocations". ;-)
> Likewise structured concurrency like in Malebolgia or Taskpools are great for computational parallelism, but don't integrate well with IO or event based concurrency unless you can build the entire program around them.
Agreed.
> don't integrate well with IO or event based concurrency unless you can build the entire program around them.
Typically, both threads and event-based UI loops have thread-safe hooks to make the two paradigms interoperate, i.e. with threads you have condition variables (or, one abstraction level higher, channels) which can be used to plug into the other systems.
We use that for chronos+taskpools, chronos+golang scheduler, chronos+Qt, and all other combinations thereof, and the end result is typically a small glue layer around which the rest of the two worlds can be built in isolation. Of course, this requires applying different mentalities depending on where in the codebase you are, but the interop glue itself is actually pretty trivial.
Another way to put this is that any framework that introduces threads in some shape or form should have these hooks exposed such that it can be integrated with everyone else, and everyone will be happy :)
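As a rough illustration of such a hook (not the actual chronos glue described above, just a sketch using Nim's built-in thread-safe Channel; compile with --threads:on):

```nim
import std/os

var uiEvents: Channel[string]   # the thread-safe hook between the two worlds
uiEvents.open()

proc worker() {.thread.} =
  # The "other" world (a task pool, an async loop, ...) only needs this
  # one thread-safe entry point to talk to the event loop.
  uiEvents.send("work done")

var t: Thread[void]
createThread(t, worker)

# The event-loop side polls the hook non-blockingly on each iteration,
# much like a GUI loop would between paint events.
var received = ""
while received.len == 0:
  let (ok, msg) = uiEvents.tryRecv()
  if ok:
    received = msg
  else:
    sleep(1)

joinThread(t)
uiEvents.close()
assert received == "work done"
```

The glue really is just this small: each world keeps its own scheduling model, and the channel (or condition variable) is the only shared surface.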
> Nah, a UI's speed in 2024 will be good even with the oh so terrible "closure allocations". ;-)
Heresy!! Plus I want to be able to run it on my embedded devices one day. I'm tired of Linux :P
Though yeah, it's probably not really needed and could be relaxed in the future. Though closures encourage other problems with capturing state in unintended ways.
> Heresy!! Plus I want to be able to run it on my embedded devices one day. I'm tired of Linux
I know you're not serious, but the problem is that too many people are serious about these things. The idea that you need an OS just to get a "heap", but that the stack doesn't need one, is far too widespread to let these things slip through. So once again: there is nothing special about a heap; you practically always have one. In the worst case you can use a global variable to get a heap:
var maHeap: array[16 * 1024, byte]
allocatorManageForMeAsHeap(addr maHeap, 16 * 1024)
No OS required; the space is set up by the toolchain (linker sections), just like it sets up the stack frame for you.
> Though closures encourage other problems with capturing state in unintended ways.
Yeah, that's a good point.
> Though closures encourage other problems with capturing state in unintended ways.
I often wish we had c++'s level of control over what gets captured and how, with nothing captured by default.
> I often wish we had c++'s level of control over what gets captured and how, with nothing captured by default.
Me too, but an explicit {.nimcall.} annotation does prevent captures.
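For illustration, a small sketch of the difference (assuming a recent Nim): a plain anonymous proc silently captures surrounding locals, while one marked {.nimcall.} is rejected at compile time if it tries to:

```nim
proc makeCounter(): proc(): int =
  var count = 0
  # A plain anonymous proc becomes a closure and silently captures
  # `count`, allocating a closure environment on the heap:
  result = proc(): int =
    inc count
    count
  # Marking it {.nimcall.} forbids any capture; uncommenting the next
  # line fails to compile with an "illegal capture of 'count'" error:
  # result = proc(): int {.nimcall.} = (inc count; count)

let counter = makeCounter()
assert counter() == 1
assert counter() == 2   # the state lives on in the captured environment
```

So {.nimcall.} gives the "nothing captured" guarantee, just without C++'s finer-grained middle ground of choosing what to capture and how.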
> I know you're not serious, but the problem is that too many people are serious about these things. The idea that you need an OS just to get a "heap", but that the stack doesn't need one, is far too widespread to let these things slip through. So once again: there is nothing special about a heap; you practically always have one. In the worst case you can use a global variable to get a heap:
Too true. I don't mind allocations even in embedded as long as they're controlled. Some sections require static allocation, but those are generally pretty few. Plus, statically managing all the memory on a device with 500+ kB of RAM would be terrible.
Figuro still uses heap allocations. Figuro nodes are ref objects, after all, but they're cached. Generally I just like avoiding allocs/deallocs on the happy path, as even on modern computers calling into the system allocator quickly adds up, between mutexes and locks, cache-line churn, etc.
Modern machines are fast enough that GUIs with absurd allocation rates (ahem, React) will still run fast, but they'll just eat up more CPU and battery life.
Version 0.9.0 of Sigils is now out. The multi-threading unit tests now pass with both threadsanitizer and valgrind, running 10k+ iterations!
While connecting two different event systems is pretty easy, managing memory lifetimes can be a real pain in the general case. I had to redesign the threading implementation 3-4 times before I arrived at a good system that has proper locking and balances performance and overhead for normal and threaded agents.
The core multi-threading design is now stable and handles many of the gnarly aspects of synchronizing lifetimes and sending objects between threads. The design in Sigils is to use agent proxies and connections to automatically manage the lifetimes of "remote" objects based on a local proxy agent.
When you send a Sigil agent to a remote thread to do some work, it will live as long as the resulting local proxy is alive. This has some limitations, but it provides a simple way to ensure threaded work continues until you're done with it. It's also possible for objects to manually manage their own lifetimes, but that's not exposed to the user yet.
There are likely bugs in various areas, but I'm pretty happy with the design and that it's passing threadsanitizer and valgrind. Also, there's currently no way to destroy the worker threads. But who needs that?! ;)
Next, I look forward to making more Figuro improvements! It's time to begin building the core widgets and making useful apps again.
Another utility in this release is isolateRuntime, provided in sigils/isolateutils. It builds on std/isolation but handles the important case of unique refs.
Unique refs can be isolated and sent to another thread if they really are unique, meaning their ref count is 1. All of a ref object's child ref objects must be unique as well. This can only be checked at runtime, however, which is what isolateRuntime provides: it recursively checks the given argument.
It helped me find a few cases where the compiler or my code had created extra references (copies).
suite "isolate utils":
  test "isolateRuntime":
    type
      TestRef = ref object
        id: int

    var
      d = TestRef(id: 1)
      isoD = isolateRuntime(move d)
    check isoD.extract().id == 1

    expect(IsolationError):
      echo "expect error..."
      var
        e = TestRef(id: 2)
        e2 = e
        isoE = isolateRuntime(e)
      check isoE.extract().id == 2