nimforum mirror - Howto simulate C macro in Nim?

Stefan_Salewski [2018-04-27T13:35:54+02:00]

Compiling this with -O3 on www.godbolt.org gives:


#define eqabs(a, b) a*a == b*b

int t(int x) {
    return eqabs(x, 9) or eqabs(x, 5);
}

t(int):
  imul edi, edi
  cmp edi, 81
  sete al
  cmp edi, 25
  sete dl
  or eax, edx
  movzx eax, al
  ret

That seems to be the fastest possible code, with only one multiply. (I am still not sure whether it is really faster than abs().)
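One caveat about the macro as written: its arguments are pasted in textually without parentheses, so it only behaves as intended for simple arguments such as a variable or a literal. The following is a minimal sketch of that pitfall, not anything from the original post. The names EQABS_RAW, EQABS, eqabs_fn, raw_plus_one, safe_plus_one and check_macros are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the macro in the post: arguments are pasted in textually. */
#define EQABS_RAW(a, b) a*a == b*b

/* Hypothetical hardened variant: each argument and the whole result are
   parenthesized, so operator precedence cannot leak in from the call site. */
#define EQABS(a, b) (((a) * (a)) == ((b) * (b)))

/* Hypothetical inline-function alternative with ordinary argument evaluation. */
static inline bool eqabs_fn(int a, int b) { return a * a == b * b; }

/* Expands to x + 1*x + 1 == 9*9, i.e. 2*x + 1 == 81. */
bool raw_plus_one(int x) { return EQABS_RAW(x + 1, 9); }

/* Expands to (((x + 1) * (x + 1)) == ((9) * (9))). */
bool safe_plus_one(int x) { return EQABS(x + 1, 9); }

/* Small usage example. */
void check_macros(void) {
    assert(safe_plus_one(8));   /* (8 + 1) squared is 81 */
    assert(!raw_plus_one(8));   /* 2*8 + 1 is 17, not 81 */
    assert(raw_plus_one(40));   /* 2*40 + 1 is 81: a false positive */
    assert(eqabs_fn(-9, 9));
}
```

With a plain variable and a literal, as in t() above, both macro variants expand to the same expression, so the generated code should not differ.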

For Nim, I think that inline procs do not help, so I would need a template. But how can I ensure that the integer literal parameter is evaluated at compile time and that "common subexpression elimination" kicks in, so that I also get the best possible code? Would untyped template parameters suffice?

I have not looked at Nim's assembly output yet, since finding the template's instructions in the assembly listing takes some work.

def [2018-04-27T14:26:26+02:00]

template eqabs(a, b): bool = a * a == b * b

proc t(x: int32): bool = eqabs(x, 5) or eqabs(x, 9)

echo t(10)

Compiled with nim -d:release c x. The assembly from gcc -S -fverbose-asm shows:


# /home/d067158/nimcache/x.c:50: 	T1_ = ((NI32)(x * x) == ((NI32) 25));
        imull	%edi, %edi	# x, _1
        movl	$1, %eax	#, <retval>
# /home/d067158/nimcache/x.c:51: 	if (T1_) goto LA2_;
        cmpl	$25, %edi	#, _1
        je	.L3	#,
# /home/d067158/nimcache/x.c:52: 	T1_ = ((NI32)(x * x) == ((NI32) 81));
        cmpl	$81, %edi	#, _1
        sete	%al	#, <retval>
.L3:
# /home/d067158/nimcache/x.c:56: }
        ret

Which looks fine. So if you want to depend on compiler optimizations, check the assembly output. If you don't want that, do the optimization manually.

But why multiply the number (which can also overflow) instead of doing this?

let s = x shr 31
(x xor s) - s

Reference for tricks like this: http://graphics.stanford.edu/~seander/bithacks.html#IntegerAbs
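To make the two points above concrete, here is a minimal C sketch, under stated assumptions, of the branchless absolute value and of the overflow problem with comparing squares. The function names are hypothetical. The sketch assumes that >> on a negative int32_t is an arithmetic shift (implementation-defined in C, but what gcc and clang do), and it does the wrapping multiplication in uint32_t to avoid undefined behaviour from signed overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Branchless absolute value: s is 0 for x >= 0 and -1 (all ones) for x < 0,
   so (x ^ s) - s leaves x alone or negates it.
   Assumption: >> on a negative value is an arithmetic shift.
   INT32_MIN has no positive counterpart and must not be passed in. */
int32_t abs_branchless(int32_t x) {
    int32_t s = x >> 31;
    return (x ^ s) - s;
}

/* Compare squares with 32-bit wraparound, done in unsigned arithmetic. */
int eqabs_wrapping(int32_t a, int32_t b) {
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    return ua * ua == ub * ub;
}

/* Small usage example. */
void check_abs(void) {
    assert(abs_branchless(-7) == 7);
    assert(abs_branchless(7) == 7);
    /* INT32_MIN + 5 is 2^31 + 5 as an unsigned value; its square is
       congruent to 25 modulo 2^32, so the wrapped comparison says "equal"
       although the absolute value is far from 5. */
    assert(eqabs_wrapping(INT32_MIN + 5, 5));
    assert(abs_branchless(INT32_MIN + 5) != 5);
}
```

So whenever the multiplication wraps at the operand width, the squared comparison can report a match for values whose absolute value differs, which the abs-based comparison does not do.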

Stefan_Salewski [2018-04-27T20:27:48+02:00]

I assume your template uses untyped parameters, as in

template eqabs(a, b: untyped): bool = a * a == b * b

(Indeed I should have used a 64-bit int for the C code.)

It is great that we get the same optimized assembly as in C (for a template, but not for an inline proc).

But indeed comparing the squares seems to give no real advantage -- Nim's and C's abs() are already fully optimized.


#include <stdint.h>
#include <stdlib.h>
#define eqabs(a, b) a*a == b*b
#define eq(a, b) llabs(a) == b

int8_t t1(int64_t x) {
    return eqabs(x, 9) or eqabs(x, 5);
}

int8_t t2(int64_t x) {
    return eq(x, 9) or eq(x, 5);
}

int64_t a1(int64_t x) {
    return (x < 0 ? -x : x);
}

int64_t a2(int64_t x) {
    return llabs(x);
}


t1(long):
  imul rdi, rdi
  cmp rdi, 81
  sete al
  cmp rdi, 25
  sete dl
  or eax, edx
  ret
t2(long):
  mov rax, rdi
  sar rax, 63
  xor rdi, rax
  sub rdi, rax
  sub rdi, 5
  test rdi, -5
  sete al
  ret
a1(long):
  mov rdx, rdi
  mov rax, rdi
  sar rdx, 63
  xor rax, rdx
  sub rax, rdx
  ret
a2(long):
  mov rdx, rdi
  mov rax, rdi
  sar rdx, 63
  xor rax, rdx
  sub rax, rdx
  ret

Function t1 has one instruction fewer than t2, but that does not mean it is faster.
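The sub/test pair in t2 is worth spelling out: after computing the absolute value, the compiler subtracts 5 and tests the result against -5, which is the bit pattern with every bit set except bit 2. The test yields zero only when the difference is 0 or 4, that is, when the absolute value is 5 or 9. Below is a minimal sketch of that reading of the listing. The function names are hypothetical, and INT64_MIN is excluded because llabs is undefined for it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Straightforward form: two comparisons on the absolute value. */
int is_5_or_9_plain(int64_t x) {
    long long a = llabs(x);
    return a == 5 || a == 9;
}

/* Form suggested by the t2 listing: one subtraction and one mask test.
   d is 0 for |x| == 5 and 4 for |x| == 9. Clearing bit 2 maps both to 0.
   Any other value leaves some other bit set (for |x| < 5 the unsigned
   subtraction wraps around and sets the high bits). */
int is_5_or_9_masked(int64_t x) {
    uint64_t d = (uint64_t)llabs(x) - 5u;
    return (d & ~(uint64_t)4) == 0;
}

/* Small usage example: both forms agree on a range of inputs. */
void check_mask(void) {
    for (int64_t x = -100; x <= 100; ++x)
        assert(is_5_or_9_plain(x) == is_5_or_9_masked(x));
}
```

This only explains why t2 gets by with a single comparison. It says nothing about which of t1 and t2 runs faster on a given CPU.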

mashingan [2018-04-28T10:59:43+02:00]

There's the asm statement for defining how a function behaves at the assembly level.

It is useful for people who work with very constrained resources.

Only with the asm statement can you control exactly what the resulting assembly looks like; without it, the code is ultimately translated to C and optimized by the C compiler.

There's also the emit pragma, but as mentioned there, use it with caution.

Mirror of forum.nim-lang.org
