I've translated a Project Euler problem #21 solution (C and Rust versions lifted off the net) to Nim and benchmarked the resulting binaries.
Time to solution for C/Nim/Rust on my Cortex-A5 CPU:
1.5s/1.5s/2.5s
Two questions about compiling the generated C immediately spring to mind.
Does the Nim compiler use $CFLAGS in release mode, if present, and is there a switch for performing a PGO build automatically?
C: http://pastebin.com/X3ZTp1c0
Rust: http://pastebin.com/u2TixCiV
C compilers don't expand shell variables, so the short answer to both of my questions seems to be no and no; use --passC instead.
It would be good for usability if the Nim compiler offered automated profiled builds and passed the expanded CFLAGS via --passC automatically in release mode.
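As a rough sketch of the --passC route, assuming a POSIX-style shell (bash, dash) that word-splits unquoted variables: the helper below, whose name passc_args is made up for illustration and is not part of Nim, prints one --passC:<flag> argument for each word in $CFLAGS.

```shell
# Hypothetical helper (not part of Nim): print one --passC:<flag> per word in $CFLAGS.
passc_args() {
  # The expansion is left unquoted on purpose so the shell splits it on whitespace.
  # ${CFLAGS:-} keeps this safe when CFLAGS is unset.
  for flag in ${CFLAGS:-}; do
    printf '%s\n' "--passC:$flag"
  done
}
```

It could then be used as `nim c -d:release $(passc_args) prog.nim`, where prog.nim is a placeholder. Flags that contain spaces would not survive the word splitting, so this only suits simple flag lists.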
petevine: It would be good for usability if the Nim compiler offered automated profiled builds and passed the expanded CFLAGS via --passC automatically in release mode.
You can do this via a <project>.nims config file, e.g.:
import strutils, sequtils
let cflags = getEnv("NIMCFLAGS").split(" ").filter(
  proc(x: string): bool = len(x) != 0)
for cflag in cflags:
  switch("passC", cflag)
No problem. Anyway, it was Rust that was slower, but that's a throwback to an ARM-specific issue that was fixed in LLVM a long time ago here:
http://llvm.org/viewvc/llvm-project?view=revision&revision=259657
<project>.nims config file
Thanks @Jehan, I've just finished reading up on NimScript.
I eventually settled on plain aliases:
alias nim-gen='nim c -r -d:release --passC:-mcpu=cortex-a5 --passC:-mfpu=neon --passC:-ftree-vectorize --passC:-fprofile-generate --passL:-lgcov'
alias nim-use='nim c -d:release --passC:-mcpu=cortex-a5 --passC:-mfpu=neon --passC:-ftree-vectorize --passC:-fprofile-use'
so that building a profiled binary is as simple as, e.g.:
$ nim-gen matmul.nim 1500 && nim-use matmul.nim
I experimented a bit with benchmarking this code (see link for details).
Some observations: