nimforum mirror - GC parameters

mora (orginal) [2015-09-18T08:59:04+02:00] view original

Hi,

Is there a way to influence GC's behavior? I can use --gc: to change the algorithm. Is it possible to play with the parameters?

Motivation: I have two almost identical files with different running time. (Please ignore any +/-10% differences here, these are one time runs, not averages.) https://github.com/petermora/nimMapBenchmarks/blob/master/test5.nim

nim c test5.nim; time ./test5
...
real    0m3.052s
user    0m1.380s
sys     0m1.677s  -> roughly 1.5 sec

https://github.com/petermora/nimMapBenchmarks/blob/master/test7.nim

nim c test7.nim; time ./test7
...
real    0m1.615s
user    0m1.107s
sys     0m0.507s  -> rougly 0.5 sec

Am I assuming correctly that the sys part is measuring the memory management part (since in my examples there are no file IO, just an echo)?

Just for comparison (please don't get me wrong, I'm completely happy with GC, just try to understand it), the same Rust program gives: https://github.com/petermora/nimMapBenchmarks/blob/master/test.rs

real    0m1.724s
user    0m1.717s
sys     0m0.007s

Thank you, Peter

Jehan (orginal) [2015-09-18T09:51:34+02:00] view original

The sys part would actually be time spent in the kernel and has nothing to do with the GC (other than anything that incidentally happens as the result of mmap()s and page faults). On my system (OS X) I cannot see any measurable difference between the sys part for both versions.

Edit: Actually, I stand (partly) corrected. There is (I think) some overhead returning the pages in system.freeOsChunks(). I'll have to investigate that more; at first I thought it was an end-of-process thing, but now I'm wondering if it may occur mid-GC.

Edit 2: This looks like a micro-benchmark artifact because you use basically no memory, but allocate a lot of throw-away seqs.It may be a useful idea still to have an option to not return pages to the OS; Nim does this to minimize its virtual memory footprint, but often that is an unnecessary optimization that can backfire.

mora (orginal) [2015-09-18T10:33:41+02:00] view original

@Jehan: Thank you for taking a look. I agree, this is a micro-benchmark. We probably want to free the memory immediately on a router, and want to be a little bit lazy with the page returning on a PC. I guess that there is no perfect setup for all cases. That's why I'm asking whether there is a way to control/suggest the behavior.

Thanks, Peter

Jehan (orginal) [2015-09-18T10:46:01+02:00] view original

It would actually be pretty easy to introduce an option (if you don't mind a temporary hack, you can experiment with making weirdUnmap true in lib/system/alloc.nim, which disables unmapping by the allocator). It'll depend on Araq whether he'd be fine with a PR; if so, I could whip up something pretty fast.

mora (orginal) [2015-09-18T11:05:20+02:00] view original

Thanks, I'll check weirdUnmap at home tonight.

mora (orginal) [2015-09-18T23:49:40+02:00] view original

So the good news is: with weirdUnmap = true most of the versions runs in 1.0 sec or 1.2 sec, beating Rust's 1.7 sec.

I could also achieve the exact same speed up if I left weirdUnmap as it was, and increased ChunkOsReturn constant from 1 MB to 8 MB. If I understand the role of this constant correctly, then an unused chank is returned to the OS if it is bigger than this size. My benchmark is very special because of the small allocations. However, a webserver could have similar characteristics (serving small files, converting data to json, etc), and webserver benchmarks are popular these days.

Do you think that somehow increasing this ChankOsReturn parameter would be reasonable? Could we have a smart (and obviously quick) logic which realize the recent allocation pattern and increases it automatically if needed?

Update: in this special benchmark with ChunkOsReturn = 8 MB there is no page return to the OS, if ChunkOsReturn = 1 MB then there are exactly 6000 page returns.

Thanks, Peter

Jehan (orginal) [2015-09-19T00:26:25+02:00] view original

For what it's worth, I put a PR together here to control this via a compile-time option.

Mirror of forum.nim-lang.org

1646 :: GC parameters