You can compile with --gc:arc and watch its memory consumption. You can also override malloc to give you allocation counts.
Or compile with -d:nimAllocStats and use:
dumpAllocstats:
  main()
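For anyone who wants to try it, here is a minimal sketch of the dumpAllocstats usage above. buildList and main are hypothetical names made up for illustration, and the compile line in the comment is an assumption about how you build. The exact output format is not shown, since the API may still change.

```nim
# Compile with (assumed command line):
#   nim c --gc:arc -d:nimAllocStats example.nim
# buildList and main are hypothetical, used only to produce some allocations.

proc buildList(n: int): seq[ref int] =
  ## Allocates n small ref objects so the counters have something to report.
  result = newSeq[ref int](n)
  for i in 0 ..< n:
    new(result[i])
    result[i][] = i

proc main() =
  let items = buildList(1000)
  echo "built ", items.len, " items"

# Echoes the change in allocation statistics across the wrapped body.
dumpAllocstats:
  main()
```

If the reported allocation and deallocation counts differ by more than you expect after main() returns, that is a hint to look closer with valgrind.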
Or use valgrind and -d:useMalloc. So many options already...
Does -d:nimAllocStats have any overhead? Why isn't it always on? I used it successfully, just wondering...
Overhead should be minimal but the API is not stable. :-)