proc wait(message: string) =
  write(stdout, message)
  discard readLine(stdin)

proc allocate(n: int) =
  wait("Press enter to allocate memory...")
  var x = newSeq[uint8](n)
  wait("Press enter to free memory...")
  # x goes out of scope here, so the seq should become garbage

allocate(50_000_000)
allocate(50_000_000)
wait("Press enter to exit...")
Shouldn't sequence memory be garbage collected?
I'm open to suggestions...
I just tried that. The third allocation dropped memory usage down to 164 KB (under 64-bit Linux) just as if --deadCodeElim:on had been used.
Then a fourth allocation raised memory usage to 50 MB.
Maybe the GC works only after a certain number of cycles?
First, I don't see a difference between --deadCodeElim:on and --deadCodeElim:off on my system.
Second, the GC kicks in the first time after more than 4 MB of memory has been allocated, or once the zero-count table (ZCT) has 500 entries. What you're experiencing is probably old references being kept alive because the stack is scanned conservatively.
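One way to tell the two causes apart (a sketch of my own, using GC_fullCollect() and GC_getStatistics() from the system module) is to force a full collection right after allocate() returns: if the seq is freed, the GC simply had not been triggered yet; if it survives, a conservatively scanned stack slot is probably keeping it alive.

proc wait(message: string) =
  write(stdout, message)
  discard readLine(stdin)

proc allocate(n: int) =
  wait("Press enter to allocate memory...")
  var x = newSeq[uint8](n)
  wait("Press enter to free memory...")

allocate(50_000_000)
GC_fullCollect()           # force a collection instead of waiting for the trigger
echo GC_getStatistics()    # inspect the heap state from the inside
wait("Press enter to exit...")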
Indeed there is no difference between --deadCodeElim:on and --deadCodeElim:off when not using -d:release. In that case memory usage alternates between 50 MB and 100 MB with each allocation/deallocation.
When using -d:release, the behavior I mentioned shows up only with --deadCodeElim:off. Otherwise the sequence is apparently never allocated at all, since it is never really used.
I checked on another PC today (again 64-bit Linux + GCC 4.9.2) and found the exact same behavior.
Then I tried on a third PC running 32-bit Windows XP + GCC 4.8.1 with -d:release. In this case the first allocation took memory usage to 50 MB, and from the second allocation onward it stayed at a constant 100 MB.
So maybe these findings relate to the version of GCC being used.
I also checked how malloc() and calloc() of GCC 4.9.2 perform for multiple allocations and deallocations of memory:
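A minimal version of such a test (a sketch; the c_malloc/c_calloc/c_free imports are hand-written wrappers, and memory usage would be watched from outside, e.g. with top, at each pause) might look like this:

proc c_malloc(size: csize_t): pointer {.importc: "malloc", header: "<stdlib.h>".}
proc c_calloc(nmemb, size: csize_t): pointer {.importc: "calloc", header: "<stdlib.h>".}
proc c_free(p: pointer) {.importc: "free", header: "<stdlib.h>".}

proc wait(message: string) =
  write(stdout, message)
  discard readLine(stdin)

for _ in 1 .. 3:
  wait("Press enter to malloc 50 MB...")
  let p = c_malloc(csize_t(50_000_000))   # or c_calloc(1, csize_t(50_000_000))
  wait("Press enter to free it...")
  c_free(p)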
How are you checking memory usage? GC_getStatistics() or getOccupiedMem()? And when do you check it?
The behavior of malloc() and calloc() should be irrelevant, since Nim uses mmap() and munmap() on POSIX systems to obtain memory from the system (and VirtualAlloc()/VirtualFree() on Windows).
Note that because of conservative stack scanning, you can get all kinds of different behavior as to when exactly a chunk of memory is freed, so it's entirely possible for all or none of the big seqs to be freed. The only thing I question is your claim that with --deadCodeElim:on the memory never gets allocated (the allocation call simply does not get optimized away, so that's not possible). What you're more likely seeing is that a small allocation elsewhere (possibly as part of the readLine call) triggers a GC and frees the allocated memory. Note that for large allocations, the Nim GC can return the memory to the OS via munmap() or VirtualFree() when they're freed.
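That hypothesis can be probed with a small experiment (a sketch, not from the thread): perform some deliberate small GC'd allocations after allocate() returns and check whether the 50 MB block disappears.

proc wait(message: string) =
  write(stdout, message)
  discard readLine(stdin)

proc allocate(n: int) =
  wait("Press enter to allocate memory...")
  var x = newSeq[uint8](n)
  wait("Press enter to free memory...")

allocate(50_000_000)
for _ in 1 .. 1000:
  discard newString(64)   # small GC'd allocations; any one of them may trigger a collection
wait("Check whether the 50 MB block has been returned to the OS...")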
...Dead code elimination only removes unreachable functions...
Not necessarily. If Nim is passing a switch to GCC/VCC, then dead code detection is not just unreachable functions, but also code that is determined as irrelevant, and the var x = newSeq[uint8](n) statement could certainly meet that criteria, since it doesn't contribute to a function return or output.
GravityWell: Not necessarily. If Nim is passing a switch to GCC/VCC, then dead code detection is not just unreachable functions, but also code that is determined as irrelevant, and the var x = newSeq[uint8](n) statement could certainly meet that criteria, since it doesn't contribute to a function return or output.
Let me be precise. First, when I said that "dead code elimination only removes unreachable functions", I was describing the effect of the compiler switch --deadCodeElim:on (technically, it also removes unneeded variables, but that's not relevant here).
Second, gcc/vcc/clang cannot optimize this away, either. Memory allocation is not a pure function; in fact, most of what it does is cause side effects to the allocator's internal structures (and also a system call to mmap()). That the result isn't returned to the original caller is absolutely irrelevant. No C compiler is allowed to optimize this away without correctly inferring that these side effects can be omitted in their entirety because the next GC cycle will remove it (and even that is questionable, because you're also removing a system call that changes the memory map of the process). This is pretty much impossible for even the best state-of-the-art compilers; they would have to conjecture and prove not only that these side effects are eventually reverted, but also that intermittent allocations (which change the GC's data structures in non-trivial ways) essentially do not affect that property. They would then have to transform the program in such a way as to satisfy (e.g.) §5.1.2.3 of the C99 standard, which is essentially impossible (given that there are any intermittent allocations that do not 100% commute with the original allocation).
Ah, mmap()! I see now... Virtual memory is allocated, but no memory usage should be visible in the system until the relevant pages are actually accessed.
Of course both getOccupiedMem() and GC_getStatistics() show this memory as being allocated.
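The lazy-commit effect itself is easy to observe (a sketch of mine, assuming Linux and a recent Nim whose posix.mmap takes a csize_t length; watch the process's RSS from another terminal, e.g. with grep VmRSS /proc/<pid>/status):

import posix

proc wait(message: string) =
  write(stdout, message)
  discard readLine(stdin)

const size = 50_000_000
let p = cast[ptr UncheckedArray[uint8]](mmap(nil, csize_t(size),
  PROT_READ or PROT_WRITE, MAP_PRIVATE or MAP_ANONYMOUS, -1, 0))
wait("50 MB mapped; RSS should still be small...")
for i in 0 ..< size:
  p[i] = 1                 # the first write to each page makes the kernel commit it
wait("Pages touched; RSS should now be about 50 MB...")
discard munmap(p, csize_t(size))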
Then, when --deadCodeElim:off is used, some quirk causes system memory to be visibly allocated at times.
Thank you very much, Jehan, for clarifying things for me.