Hello,
I have been looking at the documentation for the Nim compiler, but one thing I could not find documented anywhere is whether Nim supports compiling natively for a specific CPU micro-architecture.
Nim supports cross-compiling for generic architectures (for example ARM, x86_64, etc.), and it also supports cross-platform compiling (Windows, Linux, etc.).
However, I could not find anything about compiling for the native CPU. What I mean by this is that not every CPU has the same instructions; most notably, Intel and AMD CPUs each have their own unique instructions, which allow for CPU-specific optimisations.
If you wanted to do this using GCC (the GNU C compiler), you could set the CFLAGS environment variable to:
CFLAGS="-march=native -O3"
(auto-detect the CPU in use and enable the corresponding CPU flags it supports, and also use level 3 optimisation, which is the highest level)
Does Nim have an implementation of this feature? It would allow further optimisation of Nim code. It is well known that specific CPU flags can actually reduce performance in some situations, while in others they can vastly increase it, so it is hit or miss. One notable thing I have found is that GCC's vectorisation of loops can sometimes severely bottleneck the code: it tries to optimise so aggressively that in some cases it actually makes the code slower.
Does Nim have any support for these features? If not, is it planned to be added?
Thank you, Polarian
Since Nim uses C as its intermediate representation so that it can utilize existing C compilers, you can just do the same as you would with GCC:
nim c -d:release --passC:"-march=native"
(-d:release already implies -O3 being passed to the C compiler; for even more optimization there is -d:danger, which additionally disables all runtime safety checks, so it is only suitable in some cases)
Sometimes -O3 may not be the best option though. -O2 is the most commonly used flag because it provides a balance between compile times and efficiency.
Also, -O3 can in some cases be slower; as I said with vectorisation, sometimes trying to optimise the code actually de-optimises it. That means not every codebase should be compiled with -O3 (well, it can be, but it is not recommended).
For instance, the Linux kernel is only advised to be compiled with -O2, because -O3 permits the optimiser to remove code it deems inefficient or redundant, which can break things. You wouldn't want something like the AMDGPU module to be handicapped by the compiler, would you?
Nim doesn't use makefiles
You can read the environment variable with os.existsEnv("CFLAGS") and os.getEnv("CFLAGS") (it should work in a .nimble file as well), or influence the compilation with --passC:"${CFLAGS}".
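For instance, a minimal config.nims along these lines (just a sketch, assuming all you want is to forward CFLAGS whenever it is set) would do it:

    # config.nims -- forward the CFLAGS environment variable, if present,
    # to the C compiler that Nim invokes
    if existsEnv("CFLAGS"):
      switch("passC", getEnv("CFLAGS"))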
Regarding CPU-specific features, Nim offers plenty of possibilities:
You can also set which files compile with which flags in a nim.cfg file:
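For example (a sketch; -march=native is just a placeholder for whatever flags you need):

    # nim.cfg -- picked up automatically for the project in this directory
    --passC:"-march=native"
    --passL:"-march=native"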
or a config.nims file:
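The NimScript equivalent would be something like:

    # config.nims -- same effect as the nim.cfg above
    switch("passC", "-march=native")
    switch("passL", "-march=native")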
This can also parametrize pure C or pure C++ files you compile alongside Nim.
And to go further you can combine this with compile-time parametrization and CPU autodetection using gorge if the binary is used on the same machine:
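A sketch of what that can look like, assuming a Linux build host and purely illustrative flag names:

    # config.nims -- probe the host CPU at build time and pass matching
    # flags to the C compiler (the /proc/cpuinfo probe is Linux-only)
    import std/strutils

    let cpuFlags = gorge("grep -m1 flags /proc/cpuinfo")
    if "avx2" in cpuFlags:
      switch("passC", "-mavx2")
    else:
      switch("passC", "-march=native")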
or easily control per OS flags:
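For example (again only a sketch; the flags themselves are illustrative):

    # config.nims -- different backend flags per operating system
    when defined(windows):
      switch("passL", "-static")
    elif defined(linux):
      switch("passC", "-march=native")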
or create per-backend, for example GPU, switches:
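For instance, a user-defined switch such as -d:cuda (the name, paths and library here are made up for illustration) can toggle GPU-specific settings, so that nim c -d:cuda app.nim builds with the GPU toolchain flags and a plain nim c app.nim does not:

    # config.nims -- react to a custom define like `nim c -d:cuda app.nim`
    when defined(cuda):
      switch("cincludes", "/usr/local/cuda/include")
      switch("clibdir", "/usr/local/cuda/lib64")
      switch("passL", "-lcudart")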
And lastly, you can replace Make or CMake to compile C projects or C++ projects via Nim:
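As a sketch (mylib.c and myAdd are hypothetical), the compile pragma lets Nim's own build system compile a C file and link it into the final binary, with the same passC/passL flags you configured above and no Makefile involved:

    # wrapper.nim -- Nim compiles mylib.c itself and links it in
    {.compile: "mylib.c".}

    proc myAdd(a, b: cint): cint {.importc, cdecl.}

    echo myAdd(2, 3)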
In summary, I would be very surprised if what you need to do isn't possible with Nim, even on a per-file basis, even if you mix Nim, C and C++ code.