Hello Araq and compiler devs, I am making some sort of emulator for GPU compute shaders and need to approximate the threads in a subgroup, which run in lockstep. I figured that by transforming them into closure iterators and inserting yields after control flow (for loops and ifs) I can make them reconverge. That way I don't need to mess with real CPU threads and complicated synchronization.
One thing I haven't figured out is how to keep the state of each fake thread. The closure environment will probably do, but I might need to keep extra state beyond the local variables. Another consideration is whether I could use a thread-local allocator. My question is whether it's possible to have more control over the closure's internals. Does the .liftlocals pragma assist with that? I can't find a complete example other than a couple of tests.
Sorry if the question sounds silly but I still have a lot of details to figure out. Thanks for your time!
liftLocals was my attempt to expose a lower-level construct so that people can implement their own closure mechanism, but it was never used successfully anywhere afaict. So you're better off with the Nim closures the compiler offers you.
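For illustration, here is a minimal sketch of the plain closure-iterator route, under assumptions that are not from this thread: `LaneState`, `makeLane` and `runSubgroup` are hypothetical names, and the extra per-lane state lives in a `ref object` passed to the iterator rather than in the closure environment itself. Each `yield` acts as a reconvergence point, and a round-robin loop resumes every lane once per pass.

```nim
type
  LaneState = ref object
    ## Hypothetical per-lane state kept outside the closure environment.
    id: int    # lane index within the subgroup
    acc: int   # example of extra state the scheduler can inspect
  Lane = iterator (): bool {.closure.}

proc makeLane(st: LaneState): Lane =
  ## Builds one fake thread; locals such as `i` live in the closure env.
  result = iterator (): bool {.closure.} =
    for i in 0 ..< 3:
      if (st.id and 1) == 0:   # divergent branch: even lanes only
        st.acc += i
      yield true               # yield inserted after the control flow

proc runSubgroup(lanes: seq[Lane]) =
  ## Round-robin scheduler: resumes each unfinished lane once per pass.
  var alive = true
  while alive:
    alive = false
    for lane in lanes:
      if not finished(lane):
        discard lane()
        if not finished(lane):
          alive = true

# Usage: a subgroup of four lanes.
var states: seq[LaneState]
var lanes: seq[Lane]
for id in 0 ..< 4:
  let st = LaneState(id: id)
  states.add st
  lanes.add makeLane(st)
runSubgroup(lanes)
```

Keeping the extra state in a separate object means the scheduler can read or reset it without touching the compiler-generated environment. The allocation of that environment is still up to the compiler.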
There is no way to influence where these closures are allocated and nobody ever proposed a design for it either.
You should also look into std/tasks; that module is easy to adapt for custom allocation.
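As a rough pointer to what std/tasks looks like in use, the sketch below packages a call with `toTask` and runs it later with `invoke`. The names `laneBody` and `total` are made up for the example, and it does not show any custom allocation; adapting the module for that would presumably mean working from its source.

```nim
import std/tasks

var total = 0   # plain int global, so the task body stays GC-safe

proc laneBody(laneId, scale: int) =
  ## Hypothetical unit of work for one lane.
  total += laneId * scale

# `toTask` copies the call and its arguments into a Task object.
let t = toTask laneBody(3, 10)
# The work only runs when the task is invoked.
t.invoke()
```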