I have this code:
import os

var a: seq[int64] = @[]

proc addItems() =
  for k in 0 .. 200_000_000:
    a.add(k)
    if k mod 1_000_000 == 0:
      echo k
  echo "a.high: " & $a.high

proc removeItems() =
  for k in countdown(a.high, 10):
    discard a.pop()
    if k mod 1_000_000 == 0:
      echo k

addItems()
removeItems()
echo "Array: " & $a
echo "a.high: " & $a.high
sleep(10000)
After running addItems(), memory usage goes up to about 3.4 GiB (and that's OK). What I find strange is that after running removeItems(), memory stays at the same size; it is not released.
Am I missing something? Is the GC not supposed to work in this case, or is it a memory leak?
I don't know much about Nim's garbage collector, but to me this behaviour does not look surprising.
First, it is generally recommended to put all the code inside a proc, often called main(). That can improve performance, but I think it makes no difference for the GC here.
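Something like this is what I mean (just a rough sketch; the loop size is only a placeholder):

proc main() =
  # the seq is local to main, so it becomes garbage once main returns
  var a: seq[int64] = @[]
  for k in 0 .. 1_000_000:
    a.add(int64(k))
  echo "a.high: " & $a.high

main()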
Second, calling pop() on a seq removes items from the seq, but it does not shrink the internally allocated buffer. If it tried to shrink the buffer, then for each pop it would have to allocate a smaller buffer, copy all remaining elements over, and release the old buffer. There may be seq-related procs that free unused buffer space -- maybe setLen() does it, but I don't think so. The general assumption is that once a seq buffer has been used, it may be needed again, so its capacity does not shrink.
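A rough way to observe that (assuming the default GC; getOccupiedMem() is only an approximation of the heap in use, and the exact numbers will differ on your machine):

var s = newSeq[int64](10_000_000)
echo "len: ", s.len, "  occupied: ", getOccupiedMem()
# pop everything except the last 10 elements
for i in 1 .. s.len - 10:
  discard s.pop()
# len drops to 10, but occupied memory barely changes,
# because the seq still owns its original buffer
echo "len: ", s.len, "  occupied: ", getOccupiedMem()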
Further, the GC may keep unused memory reserved and not give it back to the OS, since it may be needed again very soon. I don't know exactly when that happens.
Last point: the GC releases memory only when a new allocation is needed. You may try to force a collection by calling GC_fullCollect to force the release of all unused memory. But even that may not always release everything, because the stack scan is conservative and can give false positives, so in rare cases deallocation is delayed.
I hope my explanation is not too wrong -- brighter devs may explain it much better, but their time is probably better spent working on v1.0.
Note that Nim has several different garbage collectors, and that we may get destructors soon, which may allow us to avoid the GC entirely in many cases.
Thanks for the suggestions and the insightful explanation!
I've tried GC_fullCollect():
main()
var tmp = a
a = tmp
tmp = @[]
GC_fullCollect()
os.sleep(10000)
Memory usage comes down to 1.8 GiB. It's still high, but at least it looks like the GC has done something. I've also tried different GCs, but either there is no difference or the usage is even higher.
So, basically there is no way to delete a variable (as of now)?
So, basically there is no way to delete a variable (as of now)?
Of course there is.
First, you may define your seq inside a proc (or inside a block, which has its own local scope), so it can get collected when it goes out of scope. Generally it is a good idea to avoid global variables when possible. For (global) variables you may assign a new value like "mySeq = @[]", so the old buffer is collected when the next allocation occurs or when you call GC_fullCollect.
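A small sketch of both options (only an illustration; the names and loop sizes are made up):

# Option 1: keep the seq local to a block (or a proc), so it is
# unreachable -- and thus collectable -- once the block ends.
block:
  var localSeq: seq[int64] = @[]
  for k in 0 .. 1_000_000:
    localSeq.add(int64(k))

# Option 2: for a global, drop the old buffer by assigning an empty seq
# and optionally force a collection right away.
var globalSeq: seq[int64] = @[]
for k in 0 .. 1_000_000:
  globalSeq.add(int64(k))
globalSeq = @[]
GC_fullCollect()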
Generally I cannot imagine many real use cases where shrinking the internal buffer of a seq is really desired. Maybe setLen() can do it; you may look at the code. Seqs should work with destructors soon, so when they go out of scope they will be deallocated immediately.
If you have a real use case where the GC does not work well for you, then you may post a reference to your concrete example, and I assume you will get a reply from the devs.
I just did a short test, and I am a bit surprised by the default GC. I added this to your code:
echo a.len
echo getTotalMem()
echo getOccupiedMem()
echo getFreeMem()
a = @[]
GC_fullCollect()
echo getTotalMem()
echo getOccupiedMem()
echo getFreeMem()
sleep(10000)
GC_fullCollect()
echo getTotalMem()
echo getOccupiedMem()
echo getFreeMem()
and got this output:
10
3695615464
1743472784
1952124928
3695615464
1743454704
1952137216
3695615464
1743454920
1952137216
So it seems that the GC does not give memory back to the OS. I think that is intended, but it may not be what we want in all cases.
Global variables are never collected, as they never go out of scope.
Indeed, but we can set them to nil or, in the case of a string or seq, assign a fresh empty value.
My first guess was indeed that his observation was related to the global variable, so I tried a variant of his code where I made the seq local to proc addItems(). For that case, too, I got nearly identical values for getTotalMem() and friends. So it seems that the default GC gives no memory back to the OS, which seems to be intentional, as the same strategy is used for Java too.
PS: Do you know what happened to proc GC_getStatistics? I was not able to find it again -- the docs for gc.nim seem to be difficult to find currently, and in the system.nim docs there is a link to GC_fullCollect(), but it does not work.
Is it 1GB per allocation, or total memory allocated?
Does that mean that the GC works like a memory pool, and that subsequent new calls for ref types are "costless" from a memory point of view?
Will that change with destructors?
It means allocations >= 1 GB are requested from the OS directly and returned to the OS directly.
Does that mean that the GC works like a memory pool, and that subsequent new calls for ref types are "costless" from a memory point of view?
Well, the allocator asks the OS for a "chunk" of memory and subdivides it further. (Pretty much every allocator does this.)
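A rough way to watch that pool with the default GC -- getTotalMem() is what the allocator currently holds from the OS, getOccupiedMem() what is actually in use, and getFreeMem() what later allocations can reuse without asking the OS again (the numbers here are illustrative only):

# allocate a lot of small ref objects, then drop them all
var xs: seq[ref int64] = @[]
for i in 1 .. 1_000_000:
  var r: ref int64
  new(r)
  r[] = int64(i)
  xs.add(r)
xs = @[]
GC_fullCollect()
# total stays large while occupied drops: the allocator keeps its chunks
# and serves subsequent allocations from them instead of going to the OS
echo "total: ", getTotalMem(), " occupied: ", getOccupiedMem(), " free: ", getFreeMem()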
Will that change with destructors?
It will be easier to use a custom allocator for seqs and strings, but that is mostly orthogonal to destructors.