Hi.
I have a problem understanding how exactly the Nim GC works. I wrote a simple program; after a few minutes it reached 15 MB of RAM, and after another few minutes the program finally crashed. What am I doing wrong, and how can I fix it? In other GC'd languages (Java, Python, etc.) this would work perfectly fine, but in Nim it seems to allocate memory indefinitely without ever releasing it.
nim --version
Nim Compiler Version 0.11.2 (2015-05-04) [Windows: amd64]
Copyright (c) 2006-2015 by Andreas Rumpf
git hash: 3bef848a2cf60008f23e72571d7c20c0eb9fd728
active boot switches: -d:release
import sets
import os

type MyStruct = ref object
  field1: HashSet[string]
  field2: HashSet[string]

proc bigSet(): MyStruct =
  new(result)
  var s1 = initSet[string]()
  for i in 0..1000:
    s1.incl($i & "After becoming the .NET Compact Framework (.NETCF) dynamic memory management module owner I am continually learning a lot about the subject. Based on hallway discussion I figured out that a lot of developers are not very clear about the subject and would like to learn more. Most online material I encountered gets into too much of details and implementation for most casual readers. ")
  var s2 = initSet[string]()
  for i in 1000..2000:
    s2.incl($i & "123 After becoming the .NET Compact Framework (.NETCF) dynamic memory 123 management module owner I am continually learning a lot about the subject. Based on hallway discussion I figured out that a lot of developers are not very clear about the subject and would like to learn more. Most online material I encountered gets into too much of details and implementation for most casual readers. ")
  result.field1 = s1
  result.field2 = s2

for i in 0..1000000:
  let s = bigSet()
  if i div 100 == 0:
    echo s.field1
    echo s.field2
  GC_fullCollect()
  sleep(300)
quit 0
First of all, 15MB sounds reasonable in debug mode. I'm getting 16-20MB on OS X with 64-bit myself. You're generating two sets that require over 1MB of memory each; in order to print them, strings of about the same size need to be generated; and in debug mode, two copies of each may be kept alive, and the GC needs some breathing room on top of that.
That said, it should stabilize at this point (and should definitely not crash). For me, it does work fine, so something else must be going wrong. I am not clear what exactly causes the crash on your machine, though; does the program actually start using an indefinite amount of memory (i.e. can you observe it?), or does it crash for some other reason?
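For reference, building without the debug checks is just:

nim c -d:release yourprogram.nim

though, as discussed further down, release mode alone is no guarantee of lower memory usage.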
I will investigate the crash problem in the future.
But for me the biggest problem is the high memory consumption. I want to use (possibly) big sets on a machine with very low memory, and for me 5-10 MB is the maximum that is acceptable for my program. Consumption above 10 MB is definitely not acceptable in my situation. It would be perfectly OK if memory consumption were ~5 MB and stayed around that value. Are you suggesting that launching this in release mode would decrease memory consumption? Can you suggest other things I can do to decrease memory consumption? Maybe Nim is not the right tool for this kind of work, but personally I do not like writing directly in C.
Maybe Nim is not the right tool for this kind of work, but personally I do not like writing directly in C.
You can always write C-like code in Nim. I don't know why people think they can't. I really don't. There is a trivial 1-to-1 mapping to C if you're after that, except that you get better type checking from Nim, tons more checks in debug mode, and you don't have to mess with header files. The Nim code base has no hand-written C except nimbase.h, and even that we could easily do without.
I want to use (possibly) big sets on a machine with very low memory, and for me 5-10 MB is the maximum that is acceptable for my program.
I don't see any problems here. If others can do it, so can Nim. That said, maybe you should use, e.g., SQLite? Databases are quite good at abstracting over the RAM vs. disk distinction. But let me guess ... you don't have a hard disk on this machine either? ;-)
fapik: Are you suggesting that launching this in release mode would decrease memory consumption?
Not quite (because it can't be guaranteed). What I'm pointing out is that you create huge data structures (and their string representations) and then discard them by ignoring them. Having large data structures with unpredictable lifetimes is where your primary problem lies.
The standard sets implementation is also not very suitable for tight memory requirements, since it uses dynamic-length arrays; if you use a GC under these circumstances, you want individual blocks to be relatively small. A tree-based implementation of sets would be better (a quick experiment using red-black trees with a dummy implementation of the $ operator resulted in memory usage of around 3 MB for me).
That said -- if you're working that close to your memory limit, a garbage collector is a risky proposition to begin with. You probably want to use manual memory management to be able to obtain hard(er) constraints on your memory usage, as suggested by Araq.
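To make the lifetime point concrete, here is a minimal sketch of the allocate-once, reuse-every-iteration pattern (fillSets and reset are hypothetical names here, not stdlib procs; the improved code posted later in this thread implements exactly such a reset):

import sets

var s1 = initSet[string]()
var s2 = initSet[string]()
for i in 0..1000000:
  fillSets(s1, s2)   # hypothetical proc that incl()s the elements into s1/s2
  echo s1
  echo s2
  s1.reset()         # hypothetical in-place clear: the same backing storage
  s2.reset()         # is reused, so no large structures are ever discarded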
@kirbyfan64sos I'm developing on Windows, but I will cross-compile to target PowerPC Linux with 256 MB RAM.
@Araq I was thinking about SQLite but I finally abandoned that idea, because I will be doing heavy disk I/O operations already, and I want the simplest possible solution without unnecessary dependencies.
You can always write C-like code in Nim.
Can you recommend some good tutorials focused on how to write C-like code in Nim? That would be very helpful.
@Jehan Red-black trees look promising. I will try to implement a set using red-black trees. Thank you for the hint.
fapik: I will try to implement a set using red-black trees.
I wrote a very basic implementation of red-black trees here, which may serve you as a starting point. (Note: there's no init function; a nil reference to a red-black tree represents the empty set.)
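For a flavor of what that looks like, here is a minimal sketch of a tree-based set where nil is the empty set (hypothetical names; the red-black rebalancing that keeps lookups O(log n) is omitted, so this is only a plain binary search tree):

type
  Color = enum red, black
  TreeSet*[A] = ref object
    color: Color
    left, right: TreeSet[A]
    value: A

proc contains*[A](t: TreeSet[A], key: A): bool =
  var n = t
  while n != nil:            # nil represents the empty (sub)set
    if key < n.value: n = n.left
    elif n.value < key: n = n.right
    else: return true

proc incl*[A](t: var TreeSet[A], key: A) =
  # plain BST insert; a real red-black tree would rebalance here
  if t == nil: t = TreeSet[A](color: red, value: key)
  elif key < t.value: incl(t.left, key)
  elif t.value < key: incl(t.right, key)

Because every node is a small, fixed-size allocation, the GC never has to manage one huge block, which is what keeps peak memory usage low.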
@araq said:
"You can always write C-like code in Nim. I don't know why people think they can't. I really don't."
I think the fact that Nim has a GC trips people up and scares them off. In one of the slides of your recent O'Reilly talk you said "string, ref, seq and closure are GC'ed, nothing else". My guess is that if people were more aware of that fact they would be less inclined to think this way. Maybe this could be stressed even more (and earlier, as soon as the GC is introduced) in the tutorials and in the manual?
Of course you would then need to avoid using those types at all, and that may mean not using a big chunk of the standard library...
That is why I really wish there were a way to tell Nim that you want to allocate those types (or at least strings and seqs, and perhaps also refs) on the stack, and/or at least deallocate them as soon as you get out of their scope. My hope is that this would let you avoid the GC pauses altogether, which for certain applications (e.g. real-time, embedded software) make Nim unusable. Basically I'd like to be able to make Nim behave more like C++ or Rust. I know that you can use alloc and dealloc, but it would be much nicer and less error-prone not to have to do the explicit deallocation. What I'm thinking of is being able to do something like:
proc myproc() =
  local let myseq = @[1, 2, 3]
  echo myseq

myproc()
And then be sure that when calling myproc(), myseq would be automatically and immediately deallocated as soon as we exit myproc().
Maybe this does not make sense at all though, in which case this reply would just be another proof that people get confused as soon as a GC is involved :-P
Maybe this does not make sense at all though, in which case this reply would just be another proof that people get confused as soon as a GC is involved
Well, it certainly is another proof that people won't read what I write here anyway.
My hope is that this would let you avoid the GC pauses altogether, which for certain applications (e.g. real-time, embedded software) make Nim unusable.
Nim is not unusable for these domains at all, and I'd like to read a single good reason for these claims. Ignorance is not a reason. And again: use glib's tree library, create a big tree, and then freeing it will produce a "pause". No GC required for pauses.
Of course you would then need to avoid using those types at all, and that may mean not using a big chunk of the standard library...
Oh yeah, that's of course a good reason to use C instead which has no real stdlib to begin with...
Nim's GC is per thread: You can easily have threads which are hard realtime and don't use the GC and happily use the GC in the other threads... And the type and effect system ensures you don't mix these different per thread heaps. This is really a terrific design and I am beginning to think Nim is still the most misunderstood language out there.
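A minimal sketch of that setup (compile with --threads:on; the realtime body is hypothetical, the point is that it uses no string/ref/seq/closure, so its thread-local GC heap stays empty and is never collected):

import os

proc realtimeLoop() {.thread.} =
  var buf = alloc0(1024)   # manual, untraced memory: deterministic alloc/free
  for tick in 0..999:
    # ... hard-realtime work on buf, no GC'd types ...
    sleep(1)
  dealloc(buf)

var worker: Thread[void]
createThread(worker, realtimeLoop)
# meanwhile this thread can use its own GC'd heap freely
var log: seq[string] = @[]
for i in 0..9: log.add("tick " & $i)
joinThread(worker)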
Araq: This is really a terrific design and I am beginning to think Nim is still the most misunderstood language out there.
As a friend of mine often says, no good deed goes unpunished.
Perhaps your appearance at OSCON will help garner interest and clarity.
Maybe this does not make sense at all though, in which case this reply would just be another proof that people get confused as soon as a GC is involved
Well, it certainly is another proof that people won't read what I write here anyway.
Would you mind being a bit more specific about what you are referring to? Is it something that you have written to me specifically which I didn't read or didn't understand? I assure you that I'd like to understand whatever I missed.
My hope is that this would let you avoid the GC pauses altogether, which for certain applications (e.g. real-time, embedded software) make Nim unusable.
Nim is not unusable for these domains at all, and I'd like to read a single good reason for these claims. Ignorance is not a reason. And again: use glib's tree library, create a big tree, and then freeing it will produce a "pause". No GC required for pauses.
A while ago I discussed GC pauses with you on this forum. My understanding from that thread is that getting guaranteed GC pauses in the sub-millisecond range (e.g. 100 µs or less) is not possible. When the system "tick" is 1 ms, as in our case (by which I mean that every 1 ms we must do some deterministic, non-delayable amount of work), having GC pauses on the order of 1 ms is not an option.
As you say, regular memory allocation and deallocation operations can introduce pauses as well. However, the moment those pauses happen is deterministic (when the variable goes out of scope or when you manually free it). This determinism makes all the difference in the embedded, real-time use case.
That being said, we do avoid that sort of thing by doing most of our allocations (including setting the capacity of std::vector and friends) at startup time.
Of course you would then need to avoid using those types at all, and that may mean not using a big chunk of the standard library...
Oh yeah, that's of course a good reason to use C instead which has no real stdlib to begin with...
I never said that is the case. We use C++ for most of our code. The C++ STL is not perfect nor complete, but for the kind of things we do it is (barely) sufficient. In any case, I don't particularly like C++! In particular, the C++03 version of the language that we are forced to use (because our board vendor only supports GCC 4.3) is terrible. I am not looking for reasons to use C++. On the contrary, I would love to use Nim instead! I just feel I can't, for the reasons I stated above (long, non-deterministic GC pauses).
Nim's GC is per thread: You can easily have threads which are hard realtime and don't use the GC and happily use the GC in the other threads... And the type and effect system ensures you don't mix these different per thread heaps. This is really a terrific design and I am beginning to think Nim is still the most misunderstood language out there.
Now I'm the one who feels misunderstood. I really, really like Nim. I admire its design. I find the language terrific. I've gone through its manual, I've listened to some of your talks, I've read many threads on this forum, and I've made a few small programs with it. I believe I can replace a lot of my Python work with Nim.
Yet I feel that in order to use it for what I'd like to use it for the most (embedded, sub-1 ms real-time development), I would need to avoid an important part of what makes Nim practical (some of its nice stdlib, the useful and expressive seq type, strings...). Maybe I am wrong, but so far I don't see how.
Could we use Nim as it is right now? Yes. Would a GC-less Nim be better than what we use now? Probably, but the case would be much stronger if we could use the whole language.
The way I see it, currently Nim gives you 2 options:
1. Use the GC and live with its (non-deterministic) pauses.
2. Disable the GC and avoid the GC'd types (string, ref, seq, closures), which means giving up a big chunk of the standard library.
In both cases you can also do manual (alloc/dealloc) unsafe memory handling (let's call this option #3).
What I would like is a 4th option, which is to tell Nim to deallocate certain variables when they go out of scope (à la Rust or C++). I don't think this is currently possible, but maybe I am wrong. I think if Nim could do this, then it would be usable exactly everywhere C and C++ are used today.
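For completeness, a minimal sketch of option #3 as it works today: manual allocation with try/finally, which at least makes the release point deterministic, even if it is not automatic:

proc doTick() =
  let buf = alloc0(4096)   # untraced, manually managed memory
  try:
    # ... do the tick's work in buf ...
    discard
  finally:
    dealloc(buf)           # released exactly here, no GC pause involved

The hypothetical 4th option would make the finally/dealloc implicit, the way a C++ destructor or a Rust drop does.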
A while ago I discussed GC pauses with you on this forum.
Ah sorry, wasn't aware that it was you. ;-)
Could we use Nim as it is right now? Yes. Would a GC-less Nim be better than what we use now? Probably
Well, the pause time depends on:
1. the CPU you run it on, and
2. your workload (how much garbage you produce and how much live data the GC has to traverse).
So ... in my opinion that's the real problem: if I tell you "got it down to 0.1 ms pause time!", that's rather meaningless, because I use different CPUs and workloads. You have to try it in practice and then report back "doesn't work because of ...". I am aware that it might be unfeasible to do this experiment, but if you don't, I cannot see how you can claim "cannot use it for hard realtime" either.
And finally, the local feature for strings and sequences can be done with escape analysis; it doesn't require yet another pragma. (Though a pragma could be useful as an annotation, so that the compiler can report "in my opinion it does escape".)
After these changes my test program is rock solid in terms of memory usage, and I do not see any weird memory behaviour.
Below is the improved code. In the future I will try to prepare a pull request for the Nim main line, to support the reset method and generating strings using an external buffer.
import
  os, hashes, math

{.pragma: myShallow.}
when not defined(nimhygiene):
  {.pragma: dirty.}

type
  KeyValuePair[A] = tuple[hcode: THash, key: A]
  KeyValuePairSeq[A] = seq[KeyValuePair[A]]
  HashSet* {.myShallow.}[A] = object
    data: KeyValuePairSeq[A]
    counter: int

{.deprecated: [TSet: HashSet].}

# THIS IS A NEW METHOD ON HashSet
proc reset*[A](s: var HashSet[A]) =
  s.counter = 0
  for i in 0..high(s.data):
    s.data[i].hcode = 0
    s.data[i].key = nil

proc isEmpty(hcode: THash): bool {.inline.} =
  result = hcode == 0

proc isFilled(hcode: THash): bool {.inline.} =
  result = hcode != 0

proc isValid*[A](s: HashSet[A]): bool =
  result = not s.data.isNil

proc len*[A](s: HashSet[A]): int =
  result = s.counter

iterator items*[A](s: HashSet[A]): A =
  assert s.isValid, "The set needs to be initialized."
  for h in 0..high(s.data):
    if isFilled(s.data[h].hcode): yield s.data[h].key

const
  growthFactor = 2

proc mustRehash(length, counter: int): bool {.inline.} =
  assert(length > counter)
  result = (length * 2 < counter * 3) or (length - counter < 4)

proc nextTry(h, maxHash: THash): THash {.inline.} =
  result = (h + 1) and maxHash

template rawGetKnownHCImpl() {.dirty.} =
  var h: THash = hc and high(s.data)  # start with real hash value
  while isFilled(s.data[h].hcode):
    if s.data[h].hcode == hc and s.data[h].key == key:  # compare hc THEN key
      return h
    h = nextTry(h, high(s.data))
  result = -1 - h  # < 0 => MISSING; insert idx = -1 - result

template rawGetImpl() {.dirty.} =
  hc = hash(key)
  if hc == 0:  # This almost never taken branch should be very predictable.
    hc = 314159265  # Value doesn't matter; any non-zero favorite is fine.
  rawGetKnownHCImpl()

template rawInsertImpl() {.dirty.} =
  data[h].key = key
  data[h].hcode = hc

proc rawGetKnownHC[A](s: HashSet[A], key: A, hc: THash): int {.inline.} =
  rawGetKnownHCImpl()

proc rawGet[A](s: HashSet[A], key: A, hc: var THash): int {.inline.} =
  rawGetImpl()

proc contains*[A](s: HashSet[A], key: A): bool =
  assert s.isValid, "The set needs to be initialized."
  var hc: THash
  var index = rawGet(s, key, hc)
  result = index >= 0

proc rawInsert[A](s: var HashSet[A], data: var KeyValuePairSeq[A], key: A,
                  hc: THash, h: THash) =
  rawInsertImpl()

var enlargeCount = 0

proc enlarge[A](s: var HashSet[A]) =
  inc(enlargeCount)
  var n: KeyValuePairSeq[A]
  newSeq(n, len(s.data) * growthFactor)
  swap(s.data, n)  # n is now the old seq
  for i in countup(0, high(n)):
    if isFilled(n[i].hcode):
      var j = -1 - rawGetKnownHC(s, n[i].key, n[i].hcode)
      rawInsert(s, s.data, n[i].key, n[i].hcode, j)

template inclImpl() {.dirty.} =
  var hc: THash
  var index = rawGet(s, key, hc)
  if index < 0:
    if mustRehash(len(s.data), s.counter):
      enlarge(s)
      index = rawGetKnownHC(s, key, hc)
    rawInsert(s, s.data, key, hc, -1 - index)
    inc(s.counter)

proc incl*[A](s: var HashSet[A], key: A) =
  assert s.isValid, "The set needs to be initialized."
  inclImpl()

proc incl*[A](s: var HashSet[A], other: HashSet[A]) =
  assert s.isValid, "The set `s` needs to be initialized."
  assert other.isValid, "The set `other` needs to be initialized."
  for item in other: incl(s, item)

template doWhile(a: expr, b: stmt): stmt =
  while true:
    b
    if not a: break

proc excl*[A](s: var HashSet[A], key: A) =
  assert s.isValid, "The set needs to be initialized."
  var hc: THash
  var i = rawGet(s, key, hc)
  var msk = high(s.data)
  if i >= 0:
    s.data[i].hcode = 0
    dec(s.counter)
    while true:  # KnuthV3 Algo6.4R adapted for i=i+1 instead of i=i-1
      var j = i  # The correctness of this depends on (h+1) in nextTry,
      var r = j  # though may be adaptable to other simple sequences.
      s.data[i].hcode = 0  # mark current EMPTY
      doWhile ((i >= r and r > j) or (r > j and j > i) or (j > i and i >= r)):
        i = (i + 1) and msk  # increment mod table size
        if isEmpty(s.data[i].hcode):  # end of collision cluster; so all done
          return
        r = s.data[i].hcode and msk  # "home" location of key@i
      shallowCopy(s.data[j], s.data[i])  # data[j] will be marked EMPTY next loop

proc excl*[A](s: var HashSet[A], other: HashSet[A]) =
  assert s.isValid, "The set `s` needs to be initialized."
  assert other.isValid, "The set `other` needs to be initialized."
  for item in other: excl(s, item)

proc init*[A](s: var HashSet[A], initialSize=64) =
  inc(enlargeCount)
  assert isPowerOfTwo(initialSize)
  s.counter = 0
  newSeq(s.data, initialSize)

proc initSet*[A](initialSize=64): HashSet[A] =
  result.init(initialSize)

template dollarImpl(): stmt {.dirty.} =
  result = "{"
  for key in items(s):
    if result.len > 1: result.add(", ")
    result.add($key)
  result.add("}")

proc `$`*[A](s: HashSet[A]): string =
  assert s.isValid, "The set needs to be initialized."
  dollarImpl()

# THIS IS A NEW METHOD ON HashSet
proc appendToBuff[A](buff: var string, buffFillLen: var int, toAppend: A) =
  var toAppendStr = $toAppend
  # make room in the buffer
  var newLen = buffFillLen + toAppendStr.len
  if buff.len < newLen:
    # round 'newLen' up to the next power of two
    var alignedNewLen = nextPowerOfTwo(newLen)
    buff.setLen(alignedNewLen)
  for i in 0..high(toAppendStr):
    buff[buffFillLen + i] = toAppendStr[i]
  buffFillLen = newLen

# THIS IS A NEW METHOD ON HashSet
proc buildString*[A](aSet: HashSet[A], buff: var string) =
  var buffFillLen = 0
  appendToBuff(buff, buffFillLen, "{")
  for toInsert in items(aSet):
    if buffFillLen > 1: appendToBuff(buff, buffFillLen, ", ")
    appendToBuff(buff, buffFillLen, toInsert)
  appendToBuff(buff, buffFillLen, "}\0")

proc bigSet(s1: var HashSet[string], s2: var HashSet[string]) =
  for i in 0..1000:
    s1.incl($i & "After becoming the .NET Compact Framework (.NETCF) dynamic memory management module owner I am continually learning a lot about the subject. Based on hallway discussion I figured out that a lot of developers are not very clear about the subject and would like to learn more. Most online material I encountered gets into too much of details and implementation for most casual readers. ")
  for i in 1000..2000:
    s2.incl($i & "123 After becoming the .NET Compact Framework (.NETCF) dynamic memory 123 management module owner I am continually learning a lot about the subject. Based on hallway discussion I figured out that a lot of developers are not very clear about the subject and would like to learn more. Most online material I encountered gets into too much of details and implementation for most casual readers. ")

var s1: HashSet[string] = initSet[string](2048)
var s2: HashSet[string] = initSet[string](2048)
var strBuff = newString(2048 * 2)

for i in 0..1000000:
  bigSet(s1, s2)
  s1.buildString(strBuff)
  echo strBuff
  s2.buildString(strBuff)
  echo strBuff
  s1.reset()
  s2.reset()
And some final notes. The ability to manually free objects would be useful, but I have no idea how to do it right :( I think that after adding this kind of ability we lose the safety of the language. Also, I'm not sure what semantics would be suitable. Manually freeing objects would be useful in the "worker" pattern, where some method gets data from a queue, works on it, and produces a result; afterwards the data is not needed anymore and can be safely freed. Below is sample pseudocode about manual memory freeing; I included some thoughts about the possible difficulties in the comments.
# PSEUDOCODE
var someObject = produceSomeTracedObject()
var someOtherReference = someObject
forceFreeObject(someObject) # This call could set the 'someObject' reference to nil? It would also return the memory to the OS, regardless of whether other objects still hold references to it.
echo someOtherReference # ?? segfault? I think raising a Nim exception with a stack trace would be hard to implement in this case.
# PSEUDOCODE
var someObject = produceSomeTracedObject()
var someOtherReference = someObject
forceFreeObject(someObject) # ?? What happens if the GC runs just before this line and has already freed this object?
forceFreeObject would be useful when the GC is disabled, but then the standard library would have to be improved to support working without the GC - that would be a huge amount of work.
Another, safer option:
# PSEUDOCODE
var someObject = produceSomeTracedObject()
var someOtherReference = someObject
GC_tryFreeObject(someObject) # This call could set the 'someObject' reference to nil? It may also return the memory to the OS if the GC detects that there are no other live references to this object. I don't know if it is feasible to implement this with good performance (checking whether there are live references would be expensive).
echo someOtherReference # A segfault is not possible in this case, because this is a live reference to the memory location.
Manually freeing memory is not a "must have" for Nim, but it could be a nice feature that would make controlling memory consumption in Nim easier. If this feature were implemented it would be nice, and it would probably help people with a "hard real-time" background. I also understand that adding a new feature is a huge amount of work and requires careful design.
Edit1: I slightly improved the code - now memory consumption has dropped to ~4.5 MB.
I will try to prepare a pull request for the Nim main line, to support the reset method
It was about time!