nimforum mirror - Will --gc:arc address finalizers for closure iterator variables?

spip (original) [2020-01-25T21:59:07+01:00]

Well that's not how it works, when the closure iterator's refcount is zero and thus provably dead the attached resources will be freed.

Like I said, the problem is not only freeing memory but also releasing shared resources. How can I intercept such an event so that I can write code to release the resources?

It is a bit tricky to exhibit a small example of the problem, one showing that iterator destruction/finalization is not only related to memory.

## Demonstrate that resources leak when allocated in
## closure iterator and the iterator loop is aborted.
## In that example, the resource is a thread.

import os
import strutils

type
  TArgs = object
    max: int

var ch: Channel[int]

proc foo(args: TArgs) {. thread .} =
  ## Iterate up to max.
  var
    max = args.max
    i = 0
  
  while i < max:
    ch.send(i)
    echo "Sent ", i
    # Pause between messages so threads play nice
    sleep(100)
    inc(i)
  
  # Iterator complete
  ch.send(-1)
  echo "Thread completed!"


proc bar(m: int): iterator: int =
  ## Create a closure iterator
  iterator iter: int {.closure.} =
    var t: Thread[TArgs]
    
    # Calculate one result at a time
    ch.open(1)
    
    # Delegate to counting thread
    var args: TArgs = TArgs(max: m)
    createThread[TArgs](t, foo, args)
    
    while true:
      let val = recv(ch)
      echo "Received ", val
      sleep(100)
      if val == -1:
        # Iterator max reached: break loop
        break
      yield val
    
    joinThread(t)
    ch.close()
  
  result = iter


proc main =
  let max = parseInt(paramStr(1))
  echo "Counting up to ", max
  let iter = bar(max)
  for i in iter():
    echo "i=", i
    if i >= 10:
      echo "Aborting iteration in main after 10 items..."
      sleep(1_000)
      break
  echo "Out of iterator in main"


# Not using global variables or iterator
main()
# Force a full GC collection to be sure that the out-of-scope
# `iter` variable has been reclaimed by the GC.
GC_fullCollect()
echo "GC Stats: ", GC_getStatistics()

The iterator uses a thread to do the calculations. You call the program with the maximum number of calculations, but if that maximum is greater than 10, main aborts the loop and the program terminates.

I'm using nim '#head' from choosenim:


$ nim --version
Nim Compiler Version 1.1.1 [Linux: amd64]
Compiled at 2020-01-11
Copyright (c) 2006-2019 by Andreas Rumpf

active boot switches: -d:release

This is compiled with nim c --gc:arc --threads -d:useMalloc poc.nim

Running a loop of 5 calculations (the iterator completes) shows that there is no memory leak.


$ valgrind ./poc 5
==5479== Memcheck, a memory error detector
==5479== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==5479== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==5479== Command: ./poc 5
==5479==
Counting up to 5
Sent 0
Received 0
Sent 1
i=0
Received 1
Sent 2
i=1
Received 2
Sent 3
i=2
Received 3
Sent 4
i=3
Received 4
Thread completed!
i=4
Received -1
Out of iterator in main
GC Stats: [GC] total memory: 0
[GC] occupied memory: 0

==5479==
==5479== HEAP SUMMARY:
==5479==     in use at exit: 0 bytes in 0 blocks
==5479==   total heap usage: 51 allocs, 51 frees, 9,841 bytes allocated
==5479==
==5479== All heap blocks were freed -- no leaks are possible
==5479==
==5479== For counts of detected and suppressed errors, rerun with: -v
==5479== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If we try with a loop of 15 calculations (it will be aborted at 10!), valgrind complains that there is a possible leak: in fact, the thread is still alive when the program completes.


$ valgrind ./poc 15
==5490== Memcheck, a memory error detector
==5490== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==5490== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==5490== Command: ./poc 15
==5490==
Counting up to 15
Sent 0
Received 0
...
Received 10
Sent 11
i=10
Aborting iteration in main after 10 items...
Out of iterator in main
GC Stats: [GC] total memory: 0
[GC] occupied memory: 0

==5490==
==5490== HEAP SUMMARY:
==5490==     in use at exit: 337 bytes in 2 blocks
==5490==   total heap usage: 87 allocs, 85 frees, 11,052 bytes allocated
==5490==
==5490== LEAK SUMMARY:
==5490==    definitely lost: 0 bytes in 0 blocks
==5490==    indirectly lost: 0 bytes in 0 blocks
==5490==      possibly lost: 288 bytes in 1 blocks
==5490==    still reachable: 49 bytes in 1 blocks
==5490==         suppressed: 0 bytes in 0 blocks
==5490== Rerun with --leak-check=full to see details of leaked memory
==5490==
==5490== For counts of detected and suppressed errors, rerun with: -v
==5490== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Imagine that I have a pool of 10 threads and I create 10 iterators that are all aborted before completion. The program will have consumed the 10 threads of the pool and exhausted all the resources. If, instead of threads, the resources are managed by a database or the OS, this can be problematic.

If I were able to know when the iterator variable is destroyed, I could send a message to the delegated thread to end itself.
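Until such a hook exists, one possible workaround is to hand the consumer an explicit release handle together with the iterator and to call it from a finally branch on the consumer side, which also covers leaving the loop with break. The following is only a minimal sketch of that idea, not something the compiler does for you: Resource, Counter, counter, consume, and the released flag are hypothetical names, and a plain boolean stands in for the message that would tell the delegated thread to end itself.

```nim
type
  Resource = ref object
    released: bool                      # stands in for "worker told to stop"

  Counter = object
    iter: iterator (): int {.closure.}  # the closure iterator itself
    release: proc ()                    # explicit cleanup handle
    res: Resource                       # exposed only so the demo can inspect it

proc counter(max: int): Counter =
  ## Hypothetical factory: returns the iterator plus a release handle.
  let res = Resource(released: false)

  iterator iter(): int {.closure.} =
    var i = 0
    while i < max:
      yield i
      inc i
    res.released = true                 # normal completion releases the resource

  proc release() =
    res.released = true                 # idempotent: harmless after completion

  result = Counter(iter: iter, release: release, res: res)

proc consume(max, limit: int): Counter =
  ## Iterate, abort once `limit` is reached, and release in every case.
  result = counter(max)
  let it = result.iter
  try:
    for i in it():
      if i >= limit:
        break                           # early exit, no exception involved
  finally:
    result.release()                    # runs whether the loop completed or not

echo "aborted run released:   ", consume(15, 10).res.released
echo "completed run released: ", consume(5, 10).res.released
```

The drawback is the one already discussed: the consumer has to remember to call release, so the cleanup is not tied to the lifetime of the iterator variable.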

This type of problem can be related to transactions: you want a block of code to either succeed or fail, and if it fails, you want to be able to take some actions for a clean abort. The try: ... except: ... finally: ... construct works only when the cause of failure is an exception. In the case of an iterator, there is generally no exception, and it is not really a failure: the iterator simply goes out of scope, and we need a way to take actions when this happens.

Python has generalized that with context managers, but I don't know if they are used with generators.

The =destroy[T](x: ref T) model is quite nice and simple, but:

It does not apply to iterator types presently.

If it were working for closures, you would have to manage the state of the iterator (completed or aborted) yourself. You would also have to manage a reference to the resources yourself.

Again, if it were working, failure management (rollback, releasing resources...) would not be located near the place in the source code where the resources were leased. That locality is what Python's with and Nim's defer try to support.
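For comparison, here is a small sketch of how the two mechanisms behave for an ordinary local variable in a proc, where both already work. Lease, destroyed, deferred, withDestructor, and withDefer are hypothetical names introduced for illustration, and the hook uses the var T signature of the compiler version quoted above (newer compilers prefer a non-var parameter). It is meant to be compiled with --gc:arc as in the post. It does not show anything about locals captured by a closure iterator, which is the open question here.

```nim
type
  Lease = object
    id: int                  # 0 means "nothing leased"

var
  destroyed = 0              # leases released by the destructor
  deferred = 0               # leases released by `defer`

proc `=destroy`(x: var Lease) =
  # Called when a Lease goes out of scope; the cleanup lives with the type.
  if x.id != 0:
    inc destroyed

proc withDestructor(limit: int) =
  var lease = Lease(id: 1)   # acquisition
  discard lease.id
  for i in 0 ..< 100:
    if i >= limit:
      return                 # early exit: the destructor still runs

proc withDefer(limit: int) =
  defer: inc deferred        # cleanup written right next to the acquisition
  for i in 0 ..< 100:
    if i >= limit:
      return                 # early exit: the deferred statement still runs

withDestructor(10)
withDefer(10)
echo "destroyed=", destroyed, " deferred=", deferred
```

With defer the cleanup sits beside the acquisition, while with =destroy it sits beside the type definition, which is the locality difference described above.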

Mirror of forum.nim-lang.org
