There have been a few WebAssembly threads already, but it's a bit unclear to me what the current status of targeting WebAssembly is. For instance this thread mentions:
> Nim's garbage collector only works if the Nim runtime controls the root of the call stack, if I understand the experts correctly. That's normally not the case in the browser, but the fact that Nim's GC is thread-local might save us here...
How exactly can we set up a project so that "running at the root of the call stack" is accomplished? Does compiling via emscripten and loading the resulting JS glue code satisfy this criterion or not?
@bluenote:
> Does compiling via emscripten and loading the resulting JS glue code satisfy this criterion or not?
It seems to work and to satisfy this criterion, as in the code I posted here.
The biggest problem in coding for WebAssembly use is handling data that needs to cross over between WebAssembly and JavaScript, but my experiments with that code seem to indicate it's possible through the emscripten facilities even though it may be messy.
Yes, you have to consider GC, since JavaScript has its own and the WebAssembly code may also have its own if you use Nim's GC, but it doesn't seem too bad, at least until you try to use multi-threaded WebAssembly.
> It seems to work...
Thanks for the link. Then maybe my question is: Under which circumstances would the Nim runtime not run at the root of the call stack? I still don't clearly see the patterns where Nim's GC can / cannot work.
Does the EMSCRIPTEN_KEEPALIVE play a role in this? What about taking the no-emscripten path as suggested by @arnetheduck and @yglukhov here -- does the emscripten magic have an impact on the GC question?
> Under which circumstances would the Nim runtime not run at the root of the call stack?
Normally, when you compile your program with all the default options, the GC can potentially kick in on any allocation. When compiling to asm.js/wasm through emscripten or clang you have to take special care to not let the GC happen "whenever", but only to let it happen when there are no objects referred only from the stack, because such objects will be mistakenly collected. The stack bottom is pretty much guaranteed to be a place that satisfies this requirement. The most obvious stack bottoms are Nim functions which are called directly from JS. As an example, here's some nimx code which runs the GC once per frame.
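To make the "collect only at the stack bottom" idea concrete, here is a minimal sketch. It is not the nimx code referred to above; it assumes the refc GC (`--mm:refc` on Nim 2.x, the default on older compilers), and the names `onFrame` and `doWork` are made up for illustration. The GC stays disabled during normal execution and is only allowed to run at the end of the proc that JS calls directly.

```nim
# Sketch only. Assumes the refc GC (`--mm:refc` on Nim 2.x, the default on
# older compilers). `onFrame` and `doWork` are hypothetical names.

var frameCount = 0

proc doWork() =
  # The GC is disabled while this runs, so these allocations cannot trigger
  # a collection while `tmp` is referenced only from the stack.
  let tmp = newSeq[int](1000)
  doAssert tmp.len == 1000
  inc frameCount

proc onFrame() {.exportc.} =
  # Hypothetical entry point, meant to be called directly from JS once per frame.
  doWork()
  # Back at the stack bottom: no Nim locals are holding heap references here,
  # so this is a safe point to collect.
  GC_enable()
  GC_fullCollect()
  GC_disable()

# Keep the GC off by default so it never runs in the middle of a call chain.
GC_disable()

when isMainModule:
  # Native stand-in for the browser's frame callback.
  for _ in 1 .. 3:
    onFrame()
  echo frameCount
```

In a real emscripten build the loop at the bottom would presumably be replaced by registering `onFrame` as the main-loop callback (for example via `emscripten_set_main_loop`), so that every call starts from JS with an empty Nim stack.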
> Does the EMSCRIPTEN_KEEPALIVE play a role in this?
No. Totally irrelevant.
> What about taking the no-emscripten path...
The issue is the same with emscripten asm.js/wasm, and with pure clang wasm.
Hopefully that makes it clear :)
@bluenote: Just to augment what @yglukhov has said:
> Does the EMSCRIPTEN_KEEPALIVE play a role in this?
All this macro does is prevent the clang C/C++ compiler from dead-code-eliminating a native wasm proc that would otherwise have no references, because it is only called from the JavaScript side.
As to GC issues, through the "emscripten magic" it seems the Nim GC is able to treat the provided memory as its heap, so that GC appears normal to the Nim code; the problem is when one tries to FFI share memory between Nim and JavaScript. The emscripten documentation is actually quite good at describing the standard things that can be done to allow this, but of course, because C/C++ doesn't have a GC, those techniques must be extended a bit when a Nim GC is active.
Or at least that's as I understand it and it seems to work...
> the problem is when one tries to FFI share memory between Nim and JavaScript
This is somewhat true, but is orthogonal to the stack bottom question. The FFI problems are mostly solved in the jsbind package.
> When compiling to asm.js/wasm through emscripten or clang you have to take special care to not let the GC happen "whenever", but only to let it happen when there are no objects referred only from the stack
Isn't it the standard case that objects are only referred to from the stack? Like when we create a seq/string the data is on the heap, but it is referred to only from the stack, right?
proc inner() =
  for _ in 1 .. 1_000:
    # may trigger GC here
    let c = newSeq[int](1000)

proc subFunc() =
  let b = newSeq[int](1000)
  inner()
  # does `b` still live?

proc mainEntryPoint() =
  let a = newSeq[int](1000)
  subFunc()
  # does `a` still live?
Let's assume we only call mainEntryPoint from JS. Is this the case you mean -- we are triggering GC from an inner function that is not at the stack bottom, and the data of a and b is only referred to from the stack? If I understand your comment correctly, they might get collected, and to prevent them from being collected I would have to push some references to a and b to the heap?