Hello,
I have been looking at the documentation for the Nim compiler, but one thing I could not find documented anywhere is whether Nim supports compiling natively for a specific CPU micro-architecture.
Nim supports cross-compiling for generic architectures (for example ARM, x86_64, etc.), and it also supports cross-platform compiling (Windows, Linux, etc.).
However, I could not find anything about compiling for the native CPU. What I mean by this is that not every CPU has the same instructions; most notably, Intel and AMD CPUs each have their own unique instructions, which allow for CPU-specific optimisations.
If you wanted to do this using GCC (the GNU C compiler), you could set the CFLAGS environment variable to:
CFLAGS="-march=native -O3"
(auto-detect the CPU in use and enable the corresponding CPU flags it supports, and also use level 3 optimisation, which is the highest level)
Does Nim have an implementation of this feature? It would allow further optimisation of Nim code. It is well known that specific CPU flags can actually reduce performance in some situations, while in others they can vastly increase it, so it is hit or miss. One notable thing I have found is that GCC's vectorisation of loops can sometimes severely bottleneck the code: it tries to optimise so aggressively that in some cases it actually makes the code slower.
Does Nim have any support for these features? If not, is it planned to be added?
Thank you, Polarian
Since Nim uses C as its intermediate representation so that it can utilize existing C compilers, you can just do the same as you would with GCC:
nim c -d:release --passC:"-march=native"
(-d:release already implies -O3 being passed to the C compiler; for even more optimization there is -d:danger, which additionally disables all runtime safety checks, so it is only suitable in some cases)
Sometimes -O3 may not be the best option though. -O2 is the most commonly used flag because it provides a balance between compile times and efficiency.
Also, -O3 can in some cases be slower; as I said with vectorisation, sometimes trying to optimise the code actually de-optimises it. That means not every codebase should be compiled with -O3 (well, it can be, but it is not recommended).
For instance, the Linux kernel is only advised to be compiled with -O2, because -O3 permits the optimiser to remove code it deems inefficient or redundant, which can break things. You wouldn't want something like the AMDGPU module to be handicapped by the compiler, would you?
Nim doesn't use makefiles
You can read the environment variable with os.existsEnv("CFLAGS") and os.getEnv("CFLAGS") (it should work in a .nimble file as well), or influence the compilation with --passC:"${CFLAGS}".
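For instance, a minimal config.nims along these lines (just a sketch, assuming all you want is to forward CFLAGS whenever it is set) would do it:

    # config.nims -- forward the CFLAGS environment variable, if present,
    # to the C compiler that Nim invokes
    if existsEnv("CFLAGS"):
      switch("passC", getEnv("CFLAGS"))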
Regarding CPU-specific features, Nim offers plenty of possibilities:
You can also set which files compile with which flags in a nim.cfg file:
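For example (a sketch; -march=native is just a placeholder for whatever flags you need):

    # nim.cfg -- picked up automatically for the project in this directory
    --passC:"-march=native"
    --passL:"-march=native"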
or a config.nims file:
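The NimScript equivalent would be something like:

    # config.nims -- same effect as the nim.cfg above
    switch("passC", "-march=native")
    switch("passL", "-march=native")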
This can also parametrize pure C or pure C++ files you compile alongside Nim.
And to go further you can combine this with compile-time parametrization and CPU autodetection using gorge if the binary is used on the same machine:
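A sketch of what that can look like, assuming a Linux build host and purely illustrative flag names:

    # config.nims -- probe the host CPU at build time and pass matching
    # flags to the C compiler (the /proc/cpuinfo probe is Linux-only)
    import std/strutils

    let cpuFlags = gorge("grep -m1 flags /proc/cpuinfo")
    if "avx2" in cpuFlags:
      switch("passC", "-mavx2")
    else:
      switch("passC", "-march=native")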
or easily control per OS flags:
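For example (again only a sketch; the flags themselves are illustrative):

    # config.nims -- different backend flags per operating system
    when defined(windows):
      switch("passL", "-static")
    elif defined(linux):
      switch("passC", "-march=native")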
or create per-backend, for example GPU, switches:
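For instance, a user-defined switch such as -d:cuda (the name, paths and library here are made up for illustration) can toggle GPU-specific settings, so that nim c -d:cuda app.nim builds with the GPU toolchain flags and a plain nim c app.nim does not:

    # config.nims -- react to a custom define like `nim c -d:cuda app.nim`
    when defined(cuda):
      switch("cincludes", "/usr/local/cuda/include")
      switch("clibdir", "/usr/local/cuda/lib64")
      switch("passL", "-lcudart")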
And lastly, you can replace Make or CMake to compile C projects or C++ projects via Nim:
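As a sketch (mylib.c and myAdd are hypothetical), the compile pragma lets Nim's own build system compile a C file and link it into the final binary, with the same passC/passL flags you configured above and no Makefile involved:

    # wrapper.nim -- Nim compiles mylib.c itself and links it in
    {.compile: "mylib.c".}

    proc myAdd(a, b: cint): cint {.importc, cdecl.}

    echo myAdd(2, 3)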
In summary, I would be very surprised if what you need to do isn't possible with Nim, even on a per-file basis, even if you mix Nim, C and C++ code.