I have this little program where I create a seq with many entries and then assign a second big seq to the same variable. What I want is for the program to no longer use the memory of that first sequence. But when I check the memory usage with htop, it shows that the program still holds on to all of it.
import strutils

type HashTableEntry = object
  key: int
  value: float

type HashTable = object
  table: seq[HashTableEntry]

func newHashTable*(sizeInBytes: int): HashTable =
  let numEntries = sizeInBytes div sizeof(HashTableEntry)
  result.table = newSeq[HashTableEntry](numEntries)

func getIndex(ht: HashTable, key: int): int =
  key mod ht.table.len.int

var hashTable = newHashTable(140_000_000)

var key = readLine(stdin).parseInt
var index = hashTable.getIndex(key)
hashTable.table[index] = HashTableEntry(value: 1.0, key: key)

hashTable = newHashTable(128_000_000)

key = readLine(stdin).parseInt
index = hashTable.getIndex(key)
hashTable.table[index] = HashTableEntry(value: 1.0, key: key)

discard readLine(stdin)
When I run this and enter any number (for example 123), it shows a memory usage of ~260MB even though only ~128MB are accessible (I am compiling with the default GC). How can I make sure that this program uses at most ~140MB at any time?
I can't test the minimal example I posted here, but regarding the seemingly non-released memory, the real code behaves identically on Windows. When I use /usr/bin/time I get "Maximum resident set size (kbytes): 263504".
The program is a chess engine that isn't allowed to use excessively more RAM than a given limit, so I would like to eliminate any situation where my program uses more RAM than I want it to.
large_thing_1 = large_thing_2 logically requires that, for a brief moment, both exist at the same time.
Modify the hash table in place instead of creating a new one and sinking it over; see the sketch below.
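A minimal sketch of that in-place approach, applied to the example above (the clear proc is hypothetical, not part of the original code, and assumes the table only needs to be emptied, not resized):

proc clear(ht: var HashTable) =
  # Overwrite every slot with a default-initialized entry, reusing the
  # existing allocation instead of building a second large seq.
  for entry in ht.table.mitems:
    entry = HashTableEntry()

# Usage: instead of `hashTable = newHashTable(128_000_000)`,
# keep the first table and call `hashTable.clear()`.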
I used nim c -d:danger --passC:"-flto" --passL:"-flto -static" --cc:clang --threads:on main.nim. If I add --gc:arc -d:useMalloc it is about 4% slower (which is not terrible, but I want to make sure there isn't another solution that doesn't involve a performance penalty).
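For reference, the variant being compared is just the same command with the two extra switches added:

nim c -d:danger --gc:arc -d:useMalloc --passC:"-flto" --passL:"-flto -static" --cc:clang --threads:on main.nim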
I didn't know about PGO, I'll try that.
(The real application is this: https://gitlab.com/tsoj/Nalwald)
I tried out PGO. It gives about a 2% improvement when not using --gc:arc -d:useMalloc; when using it, it becomes even slower, by almost 20%.
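For readers who haven't done this before, a clang-based PGO build of a Nim program roughly follows this two-pass pattern (the flags are standard clang PGO options and the file names are illustrative, not taken from this thread):

# 1. Build an instrumented binary and run it on a representative workload.
nim c -d:danger --cc:clang --passC:"-fprofile-instr-generate" --passL:"-fprofile-instr-generate" main.nim
./main < typical_input
# 2. Merge the collected profile and rebuild using it.
llvm-profdata merge -output=main.profdata default.profraw
nim c -d:danger --cc:clang --passC:"-fprofile-instr-use=main.profdata" --passL:"-fprofile-instr-use=main.profdata" main.nim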
@juancarlospaco That's nice, now my compiling commands are slightly easier to read :)
You may profit from profiling the code in the various modes, but differences on the order of 5-10% (or more) can easily be code layout noise, which is easily perturbed as you add code/logic. Maybe Nalwald is toward the very end of its dev cycle...
Too bad the PGO didn't just work. I've seen it make things run slower on occasion, too. The job the gcc/backend compiler is trying to do is really quite hard, and it is also sometimes hard to steer. You might get quite different answers on AMD vs. Intel vs. Intel from 3 generations back, etc.
My advice (just my opinion) would be: if you get the memory conservation properties that you want, don't worry about a 10-20% perf delta, at least not until the very, very final stage of everything and after testing on true deployment targets. I mean, maybe you have and are at that stage, but it seemed like good advice to mention if you have not/are not. :-)
If you are using --gc:arc, your top-level code should be wrapped in a main proc for better optimizations.
That applies much more so for the older GCs than it does for --gc:arc.
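For reference, the wrapping suggested above applied to the example would look roughly like this (the main name and structure are just the usual convention, not code from this thread):

proc main() =
  # Top-level logic moved into a proc so locals live on the stack and the
  # compiler/GC can reason about their lifetimes.
  var hashTable = newHashTable(140_000_000)
  var key = readLine(stdin).parseInt
  var index = hashTable.getIndex(key)
  hashTable.table[index] = HashTableEntry(value: 1.0, key: key)
  # ... rest of the driver code ...

when isMainModule:
  main()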