Hello everybody, I am still working on threadButler and, based on suggestions from mratsim and leorize, am starting to look at it with various tools to find memory leaks etc.
I am currently trying to wrap my head around address sanitizers, which should help with figuring out some memory issues (other options being valgrind and heaptrack).
I am... unsure how to interpret some of the messages, so I would like to ask for some interpretation help here, as this seems to go far deeper than my level of insight.
I had an example ready and ran it with:
nim r --cc:clang --mm:arc -d:release --debugger:native -d:useMalloc --passc:-fsanitize=address --passl:-fsanitize=address examples/ex_stdinput.nim
The biggest memory leak:
Indirect leak of 57352 byte(s) in 1 object(s) allocated from:
#0 0x55c1436e1de1 in __interceptor_calloc (/home/philipp/.cache/nim/ex_stdinput_r/ex_stdinput_DD41B6791D32A0B925C6208AC76777EEE30185AD+0x116de1) (BuildId: e7b10045252e22912e59f7086d94388eca9eb9eb)
#1 0x55c143733a35 in newSeqPayload /home/philipp/.choosenim/toolchains/nim-2.0.2/lib/system/seqs_v2.nim:44:767
#2 0x55c14373d8f4 in newSeq__pureZasyncdispatch_u1664 /home/philipp/.choosenim/toolchains/nim-2.0.2/lib/system/seqs_v2.nim:140:2
#3 0x55c14373d8f4 in newSeq__pureZasyncdispatch_u1660 /home/philipp/.choosenim/toolchains/nim-2.0.2/lib/system.nim:631:2
#4 0x55c143754462 in newSelector__pureZasyncdispatch_u1639 /home/philipp/.choosenim/toolchains/nim-2.0.2/lib/pure/ioselects/ioselectors_epoll.nim:101:196
#5 0x55c143759bf1 in newDispatcher__pureZasyncdispatch_u1634 /home/philipp/.choosenim/toolchains/nim-2.0.2/lib/pure/asyncdispatch.nim:1209:23
#6 0x55c143758ee8 in getGlobalDispatcher__pureZasyncdispatch_u2349 /home/philipp/.choosenim/toolchains/nim-2.0.2/lib/pure/asyncdispatch.nim:1237:39
#7 0x55c143777f97 in serverProc__ex95stdinput_u2576 /home/philipp/dev/threadbutler/src/threadButler.nim:72:2
Now I am mildly confused by this output. It refers to line 72 of my own threadButler.nim doing... something that causes a leak.
The code it refers to is this (line 72 can be found at the bottom of the codeblock):
proc runServerLoop[Msg](data: Server[Msg]) {.gcsafe.} =
  mixin routeMessage
  while IS_RUNNING:
    var msg: Option[Msg] = data.hub.readMsg(Msg)
    try:
      while msg.isSome():
        {.gcsafe.}:
          routeMessage(msg.get(), data.hub)
        msg = data.hub.readMsg(Msg)
    except KillError:
      break
    except CatchableError as e:
      error "Message caused exception", msg = msg.get()[], error = e.repr
    if hasPendingOperations():
      poll(0)
    else:
      sleep(data.sleepMs)

proc serverProc*[Msg](data: Server[Msg]) {.gcsafe.} =
  mixin runServerLoop
  data.startUp.execEvents()
  runServerLoop[Msg](data) ## This is line 72
  data.shutDown.execEvents()
My guess is that the Nim compiler saw the poll call in runServerLoop and implicitly called getGlobalDispatcher, which implicitly spawned said global dispatcher, which somehow causes a memory leak.
Based on that my questions:
I guess in lieu of a better question:
Does anybody actually use address sanitizers or valgrind or heaptrack and eradicate potential memory leaks to the point that nothing shows up while they run?
The repeated attempts I've made at using these tools always lead me to areas where I never understood what the supposed solution is. Like a memory leak reported by heaptrack from readLine. That one just makes me assume that heaptrack is faulty because I trust the output is getting accurately collected.
I'm developing similar vibes for address sanitizers that they contain a decent chunk of "false positives".
Does anybody actually use address sanitizers or valgrind or heaptrack and eradicate potential memory leaks to the point that nothing shows up while they run?
Yes, I use sanitizers when developing:
However:
In your case you call asyncdispatch's getGlobalDispatcher, and it's possible that global refs are not freed at the end of the program. The C codegen should be checked here, in NimMain, PreMainInner and the exit procs.
In your case the leak points to https://github.com/nim-lang/Nim/blob/v2.0.2/lib/pure/ioselects/ioselectors_epoll.nim#L101
proc newSelector*[T](): Selector[T] =
  proc initialNumFD(): int {.inline.} =
    when defined(nuttx):
      result = NEPOLL_MAX
    else:
      result = 1024
  # Retrieve the maximum fd count (for current OS) via getrlimit()
  var maxFD = maxDescriptors()
  doAssert(maxFD > 0)
  # Start with a reasonable size, checkFd() will grow this on demand
  let numFD = initialNumFD()
  var epollFD = epoll_create1(O_CLOEXEC)
  if epollFD < 0:
    raiseOSError(osLastError())
  when hasThreadSupport:
    result = cast[Selector[T]](allocShared0(sizeof(SelectorImpl[T])))
    result.epollFD = epollFD
    result.maxFD = maxFD
    result.numFD = numFD
    result.fds = allocSharedArray[SelectorKey[T]](numFD)
  else:
    result = Selector[T]()
    result.epollFD = epollFD
    result.maxFD = maxFD
    result.numFD = numFD
    result.fds = newSeq[SelectorKey[T]](numFD)
This is indeed instantiated from asyncdispatch: https://github.com/nim-lang/Nim/blob/v2.0.2/lib/pure/asyncdispatch.nim#L1207-L1209
proc newDispatcher*(): owned(PDispatcher) =
  new result
  result.selector = newSelector[AsyncData]()
And related to the global dispatcher: https://github.com/nim-lang/Nim/blob/v2.0.2/lib/pure/asyncdispatch.nim#L1228-L1239
proc setGlobalDispatcher*(disp: owned PDispatcher) =
  if not gDisp.isNil:
    assert gDisp.callbacks.len == 0
  gDisp = disp
  initCallSoonProc()

proc getGlobalDispatcher*(): PDispatcher =
  if gDisp.isNil:
    setGlobalDispatcher(newDispatcher())
    when defined(nuttx):
      addFinalyzer()
  result = gDisp
And if we look further how the global dispatcher is defined:
https://github.com/nim-lang/Nim/blob/v2.0.2/lib/pure/asyncdispatch.nim#L356
var gDisp{.threadvar.}: owned PDispatcher ## Global dispatcher
It's a thread-local variable.
Or rather, gut feeling says that this is address sanitizer not seeing the destruction of the global dispatcher and thus falsely claiming a memory leak.
It is a memory leak: once the thread that owns gDisp is joined, that memory can no longer be reclaimed.
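To make the thread-local point concrete, here is a minimal sketch of how a `{.threadvar.}` behaves, assuming Nim 2.x with threads enabled (the default there). The names `counter`, `workerStart` and `worker` are made up for illustration and have nothing to do with asyncdispatch itself. The idea is only that every thread gets its own copy of the variable, which is why each thread that touches the dispatcher ends up with its own `gDisp`.

```nim
# Hypothetical names, for illustration only.
var counter {.threadvar.}: int  # each thread gets its own zero-initialised copy
var workerStart: int            # ordinary global, shared between threads

proc worker() {.thread.} =
  workerStart = counter  # the worker's copy starts at 0, not at main's value
  counter = 42           # only changes the worker's copy

counter = 1                     # set the main thread's copy
var t: Thread[void]
createThread(t, worker)
joinThread(t)

doAssert workerStart == 0       # the worker never saw main's value
doAssert counter == 1           # main's copy was not touched by the worker
```

Whatever the worker stored in its copy is not reachable from the main thread after the join, which matches the description of the leak above.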
2. If it is a leak, what is supposed to be done here?
Raising a bug.
I'm not too sure what's the correct fix here.
Maybe having a magic destroyThreadLocalVariables that can be called before joinThread or exiting a program.
Thank you very much for the insight!
I have subsequently opened up an issue regarding this as suggested: https://github.com/nim-lang/Nim/issues/23165
As a current workaround, as I just found out, you can apparently just nil the dispatcher yourself before the threads join:
setGlobalDispatcher(nil)
That gets rid of all leaks related to the dispatcher that I had so far.
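For anyone wanting to try the workaround, a rough sketch of where the call could go, assuming Nim 2.x with threads enabled. Going by the signatures quoted earlier in the thread, the proc that accepts a dispatcher argument is `setGlobalDispatcher`, so that is what is used here. The `worker` proc is a hypothetical stand-in for a threadButler server thread, not actual threadButler code, and whether this silences every report may depend on the Nim version and memory management mode.

```nim
import std/asyncdispatch

# Hypothetical stand-in for a server thread; not threadButler's real code.
proc worker() {.thread.} =
  # Any use of the dispatcher lazily creates this thread's own gDisp.
  discard getGlobalDispatcher()

  # ... the thread's actual work (message loop, poll, etc.) would go here ...

  # Workaround: drop this thread's reference before the thread ends.
  # Per the quoted source, setGlobalDispatcher asserts that the old
  # dispatcher has no callbacks left, so call it once all work is done.
  setGlobalDispatcher(nil)

var t: Thread[void]
createThread(t, worker)
joinThread(t)
```

The call has to happen inside the thread that owns the dispatcher, since `gDisp` is thread-local and cannot be reached from the thread doing the join.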