nimforum mirror - Comprehensive knowledge on Nim memory management (MM)

mardiyah [2022-02-26T09:22:33+01:00]

Does anyone really understand how Nim MM works? What I am asking is this:

Take the pre-allocation of a sequence of string arrays, e.g. newSeqOfCap[array[3, string]](50). How can I definitively calculate the memory allocated at the moment the statement is executed, given that a string has no fixed size, and how does that relate to the number 50?

Please help make this crystal clear. Thanks in advance.

ElegantBeef [2022-02-26T09:23:52+01:00]

Generally speaking, Nim's collections allocate enough for 64 elements by default, so you can assume that much memory is allocated unless a capacity is specified manually.

PMunch [2022-02-26T11:20:08+01:00]

We can do this with a bit of simple math. I'm assuming here that you're on a 64-bit machine so that pointers and integers are 8 bytes long. A sequence holds three things in a small stack-allocated object, the "capacity" which is the size of the buffer, "length" which is the current amount of elements in the buffer and a pointer to the "buffer" itself. This amounts to 24 bytes.

The size of the buffer depends on the capacity of the sequence and the size of your elements. Since you use newSeqOfCap, only room for that number of elements will be allocated, so you allocate 50 elements. Each element is the same size, and arrays in Nim are static, so each element will be 3x the size of a string. Strings in Nim are essentially just a specialised sequence, so they are also 24 bytes big. Each of your arrays will therefore take up 3*24=72 bytes, and you have 50 of them, so the buffer of the sequence will be allocated on the heap as 360 bytes. Since all your strings are empty by default they will just hold NULL pointers.

So all in all you will use 24 bytes of stack memory, and 360 bytes of heap memory for that allocation.
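For reference, a minimal sketch of the call being discussed. It only shows what is observable at the Nim level: the capacity is reserved, the length stays at zero, and the element size is fixed at compile time. The variable name `rows` is made up for illustration, and how many bytes the reservation costs depends on the memory manager in use.

```nim
# Reserve room for 50 elements; each element is a fixed-size array of 3 strings.
var rows = newSeqOfCap[array[3, string]](50)
doAssert rows.len == 0          # capacity is reserved, but no elements exist yet

# Adding an element fills one of the reserved slots.
rows.add ["a", "b", "c"]
doAssert rows.len == 1
doAssert rows[0][2] == "c"

# The element size is known at compile time: three string handles side by side.
doAssert sizeof(array[3, string]) == 3 * sizeof(string)
echo rows.len
```

The text the strings hold lives in separate heap blocks, so it does not change the size of an element.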

Stefan_Salewski [2022-02-26T11:49:28+01:00]

A sequence holds three things in a small stack-allocated object, the "capacity" which is the size of the buffer, "length" which is the current amount of elements in the buffer and a pointer to the "buffer" itself. This amounts to 24 bytes.

Sorry, I do not have a valid email address for you, and contacting you through a GitHub issue is something you might not really like.

But this is now the second time I have read this comment from you, and it is obviously wrong:

proc main =
  var s: seq[int]
  echo s.sizeof

main()

This prints 8 with the refc GC and 16 with --gc:arc on 64-bit Linux. Admittedly, I was only sure that your 24 bytes were wrong; I did not really remember the 8/16 difference. I think I have to proofread that in my book.

PMunch [2022-02-26T13:16:01+01:00]

Ah, it seems like I was slightly mistaken. I could've sworn that's how they worked, but I seem to have mixed up refc and ARC. You're right: on the default refc GC it almost follows the memory layout I mentioned, but only a pointer is stored on the stack and the data is inlined in the heap-allocated struct it points to. This means that it's 8 bytes of stack memory for the pointer and 1216 bytes of heap memory. I had a calculation error in my previous version, as I only multiplied by 5 and not by 50. And since strings here are only a pointer, the size of the array changes to 24 bytes (one pointer per string, 8*3=24). We have allocated room for 50 of those (50*24=1200), and the sequence consists of capacity, length, and inlined data, so we have to add the size of two integers (8*2 + 1200 = 1216). So all in all 8 bytes of stack memory and 1216 bytes of heap memory (as long as all the strings are empty).
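The refc arithmetic above can be written out as a small sketch. The layout (length and capacity in front of the inlined data, one pointer per string) is taken from the post rather than measured, the constant names are made up for illustration, and allocator or GC bookkeeping is ignored.

```nim
# Layout assumptions come from the post above; they are not measured here.
const
  ptrSize = sizeof(pointer)       # 8 on a 64-bit target
  intSize = sizeof(int)           # 8 on a 64-bit target
  capacity = 50                   # the argument passed to newSeqOfCap
  elemSize = 3 * ptrSize          # array[3, string]: one pointer per string
  header = 2 * intSize            # length and capacity stored before the data
  refcHeapBytes = header + capacity * elemSize
  refcStackBytes = ptrSize        # the seq variable itself is a single pointer

echo "stack: ", refcStackBytes, " bytes, heap: ", refcHeapBytes, " bytes"
```

On a 64-bit target this evaluates to 8 and 16 + 50*24 = 1216, matching the figures in the post.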

On ARC it stores an object with the length and the pointer on the stack (16 bytes), but the pointer now only points to an object with the capacity and the data inlined. This means our math gets a little different. The size of an empty string is now 16 bytes, so the array is 3*16=48 bytes per element, so with 50 elements it's 48*50=2400 bytes long. Then the capacity of that sequence is added for a total of 2408 bytes. So it's 16 bytes of stack memory and 2408 bytes of heap memory if my math checks out.
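The same kind of sketch for the ARC layout described above. Again the layout is the one stated in the post (length plus pointer on the stack, capacity in front of the heap data), the names are illustrative only, and allocator overhead is left out.

```nim
# Layout assumptions come from the post above; they are not measured here.
const
  wordSize = sizeof(int)                  # 8 on a 64-bit target
  handleSize = wordSize + sizeof(pointer) # length + pointer to the payload
  arcCapacity = 50                        # the argument passed to newSeqOfCap
  arcElemSize = 3 * handleSize            # array[3, string] of empty strings
  arcHeapBytes = wordSize + arcCapacity * arcElemSize  # capacity field + data
  arcStackBytes = handleSize              # the seq variable itself

echo "stack: ", arcStackBytes, " bytes, heap: ", arcHeapBytes, " bytes"
```

On a 64-bit target this gives 16 and 8 + 50*48 = 2408, the numbers quoted in the post.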

zevv [2022-02-26T14:08:20+01:00]

Take a peek at https://zevv.nl/nim-memory/ for more details

haoliang [2022-02-26T18:26:32+01:00]

Thanks for the awesome document you have made. I'm really longing for the missing chapters:

A more elaborate discussion on garbage collection, and the available GC flavours in Nim.

Using Nim without a garbage collector / embedded systems with tight memory.

The new Nim runtime!

Memory usage in closures/iterators/async — locals do not always go on the stack.

FFI: Discussion and examples of passing data between C and Nim.

But it seems none of them can be answered briefly; such a long way to go.

mratsim [2022-03-01T14:14:59+01:00]

A more elaborate discussion on garbage collection, and the available GC flavours in Nim.

Nim's default memory management scheme is deferred reference counting. Only ref types, seq and string are managed by refcounting. Plain objects are stack allocated; ptr objects are manually managed like in C.

Deferred as in you pay the refcounting price only if the object escapes its allocating scope. (You do pay for the heap allocation all the time though.)
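A minimal sketch of the three kinds of storage mentioned above: a plain object, a ref and a ptr. The type and proc names are made up for illustration; `create` and `dealloc` are the manual allocation routines from the system module.

```nim
type
  Point = object
    x, y: int
  PointRef = ref Point

proc demo(): int =
  let onStack = Point(x: 1, y: 2)     # plain object: lives in this proc's stack frame
  let managed = PointRef(x: 3, y: 4)  # ref: heap allocated, reclaimed by the memory manager
  let manual = create(Point)          # ptr: heap allocated and zeroed, freed by hand
  manual.x = 5
  manual.y = 6
  result = onStack.x + managed.y + manual.y   # 1 + 4 + 6
  dealloc(manual)                     # nothing else will free this block

echo demo()
```

Only `managed` involves the memory manager; `manual` leaks if the `dealloc` call is forgotten, just as with malloc/free in C.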

There is a difference between the allocator and the memory reclamation scheme. Nim's allocator is TLSF. http://www.gii.upv.es/tlsf/main/docs

TLSF is "Two-Level Segregated Fit", an allocator designed for real-time systems that guarantees O(1) allocation.

Note that in the deferred refcounting scheme, allocation is thread-local, which requires a deep copy when passing even ref objects across threads.

There is an additional mark-and-sweep phase for types that may have cyclic references, unless they are marked {.acyclic.}.
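As an illustration of the {.acyclic.} marker, here is a small sketch of a singly linked list whose nodes can never form a cycle. The type and proc names are hypothetical. The pragma is a promise by the programmer, not something the compiler verifies.

```nim
type
  Node = ref NodeObj
  NodeObj {.acyclic.} = object
    value: int
    next: Node        # only ever points forward, so no cycle can form

proc total(head: Node): int =
  # Walk the list and add up the values.
  var cur = head
  while cur != nil:
    result += cur.value
    cur = cur.next

let list = Node(value: 1, next: Node(value: 2, next: Node(value: 3)))
echo total(list)
```

Marking a type acyclic and then building a cycle with it anyway would defeat the purpose: the cycle collector is told to skip such objects, so the cycle would most likely never be reclaimed.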

Which brings us to the other memory management schemes.

Pure refcounting without cycle detection via --gc:rc (?)

Pure mark-and-sweep via --gc:markandsweep, which is a stop-the-world GC.

Boehm GC via --gc:boehm, which is NOT thread-local. It was written by Hans Boehm and so was the previous recommendation for multithreaded programs.

--gc:none and --gc:destructors, which remove the GC altogether. You do have seq/strings via destructors in the latter. You cannot use ref types unless you're fine with leaks.

Using Nim without a garbage collector / embedded systems with tight memory.

Use --gc:arc, --gc:destructors or --gc:none and only use plain objects and ptr objects so that no GC context is initialized at all.

The new Nim runtime!

It's gc:arc and gc:orc (arc + cycle detection for types that need it; orc's cycle detection is not mark-and-sweep based but coloring based).
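A small sketch for checking at compile time which of these schemes a program was built with. It assumes the compiler defines the symbols `gcOrc`, `gcArc` and `gcDestructors` for the corresponding switches; treat those symbol names as an assumption and check them against your compiler version. The proc name is made up.

```nim
proc memoryManager(): string =
  # `when` is resolved at compile time, so only one branch ends up in the binary.
  when defined(gcOrc):
    "orc"
  elif defined(gcArc):
    "arc"
  elif defined(gcDestructors):
    "destructors"
  else:
    "refc or another collector"

echo memoryManager()
```

Compiling the same file once with --gc:arc and once without should print different results if the assumption about the symbols holds.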

Memory usage in closures/iterators/async — locals do not always go on the stack.

They never go on the stack; for that you would need something called "heap-allocation elision", which doesn't exist in a stable manner in C++ (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0981r0.html).

Note: there is a difference between closure iterators, which are resumable functions/coroutines and currently always heap allocated, and inline iterators.
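To make the distinction concrete, here is a sketch with the same counting loop written both ways. The names are made up for illustration. The closure version follows the factory pattern from the Nim manual: its local `x` has to survive between calls, so it is kept in a heap-allocated environment instead of a stack frame.

```nim
# Inline iterator: the body is expanded into the calling loop at compile time.
iterator inlineCount(a, b: int): int =
  var x = a
  while x <= b:
    yield x
    inc x

# Closure iterator: a resumable object whose state (a, b, x) is kept on the heap.
proc closureCount(a, b: int): iterator (): int =
  result = iterator (): int =
    var x = a
    while x <= b:
      yield x
      inc x

var inlined, resumed: seq[int]
for v in inlineCount(1, 4):
  inlined.add v

let next = closureCount(1, 4)
for v in next():
  resumed.add v

echo inlined
echo resumed
```

Both loops produce the same values; the difference is only in where the iteration state is stored.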

FFI: Discussion and examples of passing data between C and Nim.

That's very easy: use c2nim on a C header to pass data from Nim to C. The other way around is straightforward too; it might need an extra NimMain call in some cases, but indeed an article going into the details is missing.
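A minimal sketch of both directions, assuming the default C backend. The `strlen` declaration is written by hand here and stands in for the kind of declaration c2nim would generate from a header; `nimAdd` is a made-up name for a proc exported to C.

```nim
# Nim calling C: bind the C function by name and tell the compiler which header declares it.
proc strlen(s: cstring): csize_t {.importc, header: "<string.h>".}

# C calling Nim: export the proc under a fixed C name with the C calling convention.
proc nimAdd(a, b: cint): cint {.exportc: "nimAdd", cdecl.} =
  a + b

echo strlen("hello")   # the string literal is passed as a cstring
echo nimAdd(2, 3)      # the exported proc can still be called from Nim
```

When the Nim code is built as a library and driven from a C program, the C side may first have to call NimMain to initialise the Nim runtime, as mentioned above.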

Araq [2022-03-01T17:40:37+01:00]

Actually, there is no --gc:rc switch; that would be --gc:arc. And --gc:destructors is an ill-specified, obsolete prototype of --gc:arc.

Mirror of forum.nim-lang.org

8957 :: Comprehensive knowledge on Nim memory management (MM)