import threadpool
{.experimental.}

proc f(id: int) {.thread.} =
  var str = "test(" & $id & ") "

parallel:
  spawn f(1)
  spawn f(2)
  f(3)
When I run this with nim c --threads:on --gc:boehm test.nim it crashes with the following message:
Exclusion ranges overlap
...
Exclusion ranges overlap
Exclusion ranges overlap
Traceback (most recent call last)
threadpool.nim(300) slave
test.nim(12) fWrapper
test.nim(7) f
mmdisp.nim(108) allocAtomic
mmdisp.nim(108) allocAtomic
...
mmdisp.nim(108) allocAtomic
SIGABRT: Abnormal termination.
Could it be that Boehm GC support in Nim is broken? Or am I missing something?
This looks more like a broken installation of the Boehm GC. Basically, the Boehm GC has a routine (GC_exclude_static_roots()) that allows you to specify that a certain area of memory should not be scanned for GC roots. Nim does NOT use this; however, the Boehm GC uses it internally. The error message that you are getting ("Exclusion ranges overlap") means that the Boehm GC is getting somewhat confused about the memory areas it is telling itself about.
What version of the Boehm GC are you using?
@Jehan: Thanks for the reply.
A broken installation could well be the problem. I'm using Ubuntu with the standard libgc.so.1.0.3. But I'm running it inside a VirtualBox image. Maybe libgc does not like that.
I have since tested it on a native Linux machine, and there it seems to work fine.
So, it must indeed be a problem with my installation or with the VirtualBox virtualisation.
Next problem ...
I know that sharing of data between threads can be problematic, but I was under the impression that using createShared in combination with the Boehm collector should work. (I'm not sure if that is a correct assumption).
So I made a small test program. It creates a very small linked list that is shared between the main thread and two other threads. Both threads grab the second node and replace it, so each thread typically holds a reference to a node created by the other thread. That is a problematic case for a thread-local GC, and indeed it is very easy to make this crash. I thought it might work with the Boehm collector, but that also gives an error.
import os, threadpool, locks
{.experimental.}

type
  Node = object
    name : string
    next : ptr Node

var nodeLck : TLock

proc changerLoop(node : ptr Node; thrName: string) {.thread.} =
  for t in 0 .. 300_000:
    acquire(nodeLck)
    let
      oldNode = node.next
      newNode = createShared(Node)
    newNode.name = thrName & "." & $t
    node.next = newNode
    if t mod 500 == 0:
      echo thrName, ": ", t, " - ", oldNode.name
    release(nodeLck)

var
  a = createShared(Node)
  b = createShared(Node)
a.name = "a"
a.next = b
b.name = "b"
initLock(nodeLck)

parallel:
  spawn changerLoop(a, "T1")
  sleep(123)
  spawn changerLoop(a, "T2")
Running this code using nim c -r --threads:on --gc:boehm par.nim gives the following error:
...
T1: 1000 - T1.999
T1: 1500 - T1.1499
T1: 2000 - T1.1999
Collecting from unknown thread
Traceback (most recent call last)
threadpool.nim(300) slave
par.nim(31) changerLoopWrapper
par.nim(17) changerLoop
mmdisp.nim(108) allocAtomic
SIGABRT: Abnormal termination.
Error: execution of an external program failed
It looks like the Boehm collector has a problem with deallocating across multiple threads?
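For comparison, here is a minimal sketch of the same swap-under-lock pattern with no GC-managed field in the shared node, so no collector is involved when a node crosses threads. It is not a fix for the error above. The int payload, the helper names (replaceNext, worker, run) and the use of plain createThread instead of parallel / spawn are assumptions made for illustration; older Nim versions need --threads:on to compile it.

```nim
import locks

type
  Node = object
    id: int          # plain value payload instead of a GC-managed string
    next: ptr Node

var
  nodeLck: Lock
  head: ptr Node     # set up by the main thread before the workers start

proc replaceNext(node: ptr Node; id: int): ptr Node =
  ## Swaps a freshly allocated shared node into node.next and
  ## returns the node that was there before (possibly nil).
  acquire(nodeLck)
  result = node.next
  let fresh = createShared(Node)   # zero-initialised shared memory
  fresh.id = id
  node.next = fresh
  release(nodeLck)

proc worker(base: int) {.thread.} =
  for t in 0 ..< 1000:
    let old = replaceNext(head, base + t)
    # The old node may have been allocated by the other thread;
    # it was unlinked under the lock, so only this thread frees it.
    if old != nil:
      deallocShared(old)

proc run(): int =
  ## Runs two workers and returns how many nodes remain after `head`.
  initLock(nodeLck)
  head = createShared(Node)
  var threads: array[2, Thread[int]]
  createThread(threads[0], worker, 0)
  createThread(threads[1], worker, 10_000)
  joinThreads(threads)
  var n = head.next
  while n != nil:
    inc result
    n = n.next
  deallocShared(head.next)
  deallocShared(head)
  deinitLock(nodeLck)

when isMainModule:
  echo "nodes left after head: ", run()
```

Each replacement unlinks exactly one node and hands it to exactly one thread, which is why the manual deallocShared call is safe here. The string field in the program above is what brings the garbage collector into the picture.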
@Jehan: I manually patched my local files based on your proposed fix. That seems to fix the problem of the first example. It also works fine in VirtualBox with that fix :-)
The Collecting from unknown thread problem from the second example is still there. :-(
proc boehmAllowRegisterThreads {.importc: "GC_allow_register_threads",
                                 dynlib: boehmLib.}

type
  GCStackBase = tuple
    mem_base : pointer # Base of memory stack
    reg_base : pointer # Base of separate register stack

# Both C functions return a C int status code, hence cint rather than int.
proc boehmGetStackBase(base: ptr GCStackBase): cint
  {.importc: "GC_get_stack_base", dynlib: boehmLib.}
proc boehmRegisterMyThread(base: ptr GCStackBase): cint
  {.importc: "GC_register_my_thread", dynlib: boehmLib.}

proc boehmRegisterThisThread*() =
  var base : GCStackBase
  discard boehmGetStackBase(addr base)
  discard boehmRegisterMyThread(addr base)
I added a call to boehmAllowRegisterThreads after the GCinit, and a call to boehmRegisterThisThread at the beginning of the thread code in my second example. That seems to do the trick :-) It works fine with that.
I think that the threads should be registered with the Boehm library this way. I'm not sure where it should be called in the Nim code though.
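To make the placement concrete, here is a rough sketch of the call order only. registerThisThread is a hypothetical placeholder standing in for boehmRegisterThisThread, which lives in the patched runtime and talks to libgc. Here it merely counts calls so that the snippet runs without libgc. The worker and run names are likewise assumptions, and older Nim versions need --threads:on.

```nim
# Placeholder for boehmRegisterThisThread from the post above.
# It does NOT talk to the Boehm GC; it only counts how often it is called.
var registeredThreads: int

proc registerThisThread() =
  discard atomicInc(registeredThreads)

proc worker(id: int) {.thread.} =
  # Registration is the first statement of the thread body, so it runs
  # before this thread performs any allocation.
  registerThisThread()
  var str = "test(" & $id & ") "   # allocation happens after registration
  discard str.len

proc run(): int =
  ## Starts two workers and returns how many of them registered themselves.
  var threads: array[2, Thread[int]]
  for i in 0 .. 1:
    createThread(threads[i], worker, i + 1)
  joinThreads(threads)
  result = registeredThreads

when isMainModule:
  echo "threads registered: ", run()
```

In a real fix the call would presumably sit in the runtime's thread start-up code rather than in user code, so that every thread, including threadpool workers, is registered automatically.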
@Jehan: Thanks for all of this work.
I hope the pull request gets in soon :-)
Until it's been reviewed and incorporated, you can download the patch directly and apply it. Just append .diff or .patch to the PR URL:
wget https://patch-diff.githubusercontent.com/raw/nim-lang/Nim/pull/3292.patch
git apply 3292.patch
I applied the patch and have been hammering on the Boehm collector with the ugly code below. It's almost the same code as the previous example, but this time using normal refs and the parallel / spawn construct.
Everything seems to be working OK.
Thanks again for this patch.
import os, threadpool, locks
{.experimental.}

# nim c -r --threads:on -d:release --gc:boehm test.nim

type
  Node = ref object
    name : string
    next : Node

var
  loopLck : Lock

proc loop(node: var Node; thrName: string) =
  for t in 0 .. 300_000:
    sleep(10)
    acquire(loopLck)
    let
      oldNode = node.next
      newNode = Node(name: thrName & "." & $t)
    node.next = newNode
    if t mod 100 == 0:
      echo thrName, ": ", t,
        " - old node: ", oldNode.name
    release(loopLck)
    var str = " "
    for _ in 0 .. 1_000: str = str & " "

proc main () =
  var
    b = Node(name: "b")
    a = Node(name: "a", next: b)
  initLock(loopLck)
  parallel:
    for ix in 1 .. 8:
      spawn loop(a, "Thr" & $ix)
      sleep(33)

when isMainModule:
  main()