type
  Euro = distinct int
  Color = 0..1

var
  i: int
  f: float
  e: Euro
  c: Color

# generally nonsense
f = cast[float](i)
i = cast[int](f)

# expensive conversions
f = i.float
i = f.int

# cheap conversions, we should get them for free
e = i.Euro
i = e.int

# and these? May be cheap if the sizes are equal
c = i.Color
i = c.int

# so we may like to have
i = safeCast[int](c)
# That would complain if a safe cast does not work, so we would have to use a conversion.
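Something like this generic template might express the idea (just a sketch, taking "same size" as the only criterion for safety; safeCast itself is of course hypothetical):

type Color = 0..1

# Hypothetical safeCast: allow the bit reinterpretation only when
# source and target occupy the same number of bytes; otherwise fail
# at compile time, forcing an explicit (possibly expensive) conversion.
template safeCast[T](x: typed): T =
  when sizeof(x) != sizeof(T):
    {.error: "safeCast: sizes differ, use a conversion".}
  cast[T](x)

var c: Color = 1
# Color needs 1 byte, int needs 8, so this would not compile:
# let i = safeCast[int](c)
let i = int(c) # explicit conversion instead
echo i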
people confuse 'cast' and type conversions all the time already.
The reason is that the difference is rarely explained well. I cannot remember a really good explanation in the Nim docs -- maybe I forgot it, or it was not there 18 months ago when I read them. And I have some books about C and other languages that do not explain the difference well either. But we know the difference.
Nim does not allow mathematical operations between float and int without a conversion -- it was my impression that performance was one reason?
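For example:

var
  i = 3
  f = 0.5

# echo i + f     # does not compile: type mismatch (int vs. float)
echo i.float + f # the conversion must be written out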
I generally avoid casts and conversions, but, for example, when we have a distinct int and do arithmetic, we may have to write a.int + b.int. I think that is free of additional cost.
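A small sketch: the {.borrow.} pragma lifts the base type's operator onto the distinct type, so both variants below should compile to a plain integer addition.

type Euro = distinct int

# borrow `+` from int; this costs nothing at runtime
proc `+`(a, b: Euro): Euro {.borrow.}

let x = 2.Euro
let y = 3.Euro
let s1 = (x.int + y.int).Euro # explicit conversions
let s2 = x + y                # borrowed operator
echo s1.int, " ", s2.int      # 5 5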
But I think my idea is not too far off. For this piece of code
proc main =
  var
    x: float
    j: int
  x = 0
  j = 0
  for i in 0..10000000:
    j += (i.float + 3.1).int
    # j += i + 3
  echo j

main()
it is three times faster when I replace the float expression in the loop with the commented-out int expression. Of course that is not a good comparison; the float add may be slower as well.
From http://stackoverflow.com/questions/12920700/floating-point-conversions-and-performance we get that such conversions are not very costly.
Stefan_Salewski: Of course that is not a good comparison, the float add may be slower as well.
What did you benchmark this with? With -d:release, they are both instantaneous for me (because the compiler optimizes most of the code completely away).
The following two pieces of code take identical time for me with -d:release:
proc main =
  for i in 1..1000000000:
    var x {.volatile.} = i

main()
proc main =
  for i in 1..1000000000:
    var x {.volatile.} = i.float

main()
Granted, this probably also has to do with the FP unit not having anything else to do on a superscalar processor. And in the following example, the version with conversions from float is actually faster (again, probably an artifact of superscalar execution being able to do more in parallel):
proc main =
  var x {.volatile.}: int
  for i in 1..1000000000:
    x += i
  echo x

main()
proc main =
  var x {.volatile.}: float
  for i in 1..1000000000:
    x += i.float
  echo x

main()
(And, yes, the float version gives a different result: the exact sum, n(n+1)/2 ≈ 5.0e17, lies far beyond 2^53 ≈ 9.0e15, the largest range in which a 64-bit float can represent every integer exactly, so the accumulation loses precision.)
Nim 0.13 with -d:release and gcc 5.3.
But my box is an older AMD64; maybe float addition is much slower there, I do not know. I tested multiple times.
Yes, it is difficult to test; I tried a loop that is not fully removed by gcc.
stefan@AMD64X2 ~/nimtoychess $ time ./x
50000035000003

real    0m0.003s
user    0m0.000s
sys     0m0.002s

stefan@AMD64X2 ~/nimtoychess $ time ./x
100000070000006

real    0m0.071s
user    0m0.069s
sys     0m0.002s
the version with conversions from float is actually faster
That is really interesting. I know that float is comparably fast to int on modern hardware.
I remember you told someone to avoid using float in code where int can be used. I generally do that.
[EDIT]
http://forum.nim-lang.org/t/533/2
You wrote: "A second is that using floating point operations unnecessarily and extensively hurts hyperthreading opportunities."
[EDIT2]
Indeed, you are right: for int the loop is completely removed by gcc, so the number of loop iterations does not matter :-(
Besides the hyperthreading resource issues Jehan mentioned in your link, there is also a possible context switch boost if your process/thread literally never uses FP. The OS can avoid saving/restoring most of the register state in that case, and this can speed up each and every context switch (back in the day an improvement on the order of 3X, obviously very CPU-dependent). In these days of SSE/AVX-optimized string operations, it has become more rare for even purely non-numerical programs to never touch those registers, though. And, of course, only context switch-heavy workloads benefit. This is also a kind of "subtle background cost" that only a careful benchmark might reveal.
It's just yet another possible origin for the advice of "use ints not floats if you can". Another much older reason goes back to "not all CPUs even have FPUs". :-)
Advice and guidelines are just that, though - almost never a substitute for benchmarks/timing, and mileage can vary for so very many reasons - compilers, optimization flags, CPUs, etc.
# That would complain if a safe cast does not work, so we would have to use a conversion.
I don't see the point of using safeCast rather than a conversion. If the conversion is cheap then great. If it's not, you had to pay the cost anyway. Why write it with safeCast, get a compile-time error (which might be conditional on how the types are defined), and then rewrite it with a conversion, rather than just write it with a conversion in the first place?
If the goal is to find out about expensive code that might be rewritten to be cheaper, a language feature like safeCast isn't the way to do that. Rather, develop static analysis tools that warn of expensive operations or, better, do profiling. Rewriting something just for the sake of performance should only be done when an actual significant performance issue has been identified. As they say, "premature optimization is the root of all evil."
If the goal is to find out about expensive code that might be rewritten to be cheaper,
Yes, that was the core of my idea. Sometimes I do not know the cost of a conversion. OK, the cost may be very small for all of the conversions Nim allows, as Araq said. In the microcontroller area costs matter more: for example, one should avoid mixed arithmetic operations on signed and unsigned data types, or on operands of different byte sizes. When the compiler does conversions silently, or you use conversions explicitly assuming zero cost, you may lose performance.
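A small illustration (the values are made up; on an 8-bit microcontroller the widening below costs real instructions, while on a desktop CPU it is essentially free):

var
  small: uint8 = 200
  wide: int16 = -100

# echo small + wide      # rejected: mixed signedness and size
echo int16(small) + wide # the widening is explicit, so its
                         # (platform-dependent) cost stays visible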
Stefan_Salewski: I can remember you told someone to avoid usage of float for code where int can be used.
This was about (1) unnecessary use of float operations (2) in a library.
An application can make more assumptions than a library can. If an application is using integer-only code with the goal of exploiting hyperthreading opportunities, then a library that unnecessarily uses floating point operations can hurt that; if an application author knows that their program is only single-threaded, then they want to use their CPU's resources as efficiently as possible. Applications can have that knowledge; libraries generally don't.
This, again, can change in libraries that perform huge chunks of computations by themselves (such as libraries for scientific computing). As always, when it comes to optimization, things are rarely black and white.
Note also that even if you're intentionally offloading integer computations to an FPU, you're creating technical debt. The code may become more involved than necessary, it may not be possible to use vectorization techniques if you mix and match int and float computations, your code's behavior with respect to overflow changes, and so forth. This is generally why you do such optimizations only if you know that you need them and if you are aware of the trade-offs. (Knuth's "premature optimization is the root of all evil" quote is really about the minimization of technical debt once you get down to it.)
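For instance, the overflow point is easy to see in Nim (assuming the default runtime checks; -d:release turned them off in this Nim version):

var a = high(int)
# a += 1        # raises an overflow error when checks are enabled

var f = 1.0e308
f *= 10.0
echo f          # float arithmetic does not raise here; it silently yields inf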
Knuth's "premature optimization is the root of all evil" quote is really about the minimization of technical debt once you get down to it.
That's a good observation as far as the "evil" part goes, but it's also about misplaced effort -- YAGNI.