As others, and recently Araq, have mentioned, the int datatype is not always the fastest on 64-bit boxes; a factor of 2 in speed is indeed possible.
Would something like int_fast32_t, as in C++, be a bad idea?
From my understanding, that is not the idea of the C++ people:
int_fast8_t, int_fast16_t, int_fast32_t, int_fast64_t: fastest signed integer types with widths of at least 8, 16, 32 and 64 bits respectively
If I used int32 in Nim, it might be fastest on my machine. But there may be (future) architectures where 64-bit operands are faster.
So when I need at least 32 bits, care about speed, and do not care much about memory usage, I would select int_fast32_t in C++.
Where these assumptions don't hold, there's not likely to be such a thing as a "fastest" integer at all. You'll have a type that may be faster for some operations, and not so fast on others (to an extent, this can even be true on Intel architectures [1]). And when you get down to that level of detail, you're probably best off writing special cases for these architectures that take more into account than just integer size. The idea of portable low-level optimizations strikes me as a contradiction in terms.
For example, PPC processors lack instructions to work on partial words, so you will always do full integer operations in the ALU and may have to do extra conversion work if you operate with ints of multiple sizes (the PPC instruction set does allow you to load/store shorter words or bytes from/to memory directly). So, mixing int sizes can create overhead when working on registers, but it will also reduce memory bandwidth usage when working with integers stored in memory. I.e. there's no guarantee that one type of int is faster than the other in any given scenario.
So when I need at least 32 bits, care about speed, and do not care much about memory usage, I would select int_fast32_t in C++.
Except that memory bandwidth is one of the reasons for 32-bit integers to be faster.
[1] Google "partial register stalls", for example.
I think there is (maybe) a problem with understanding the proposal:
"int_fast32" could use "int64" on systems where "int64" is faster than "int32". Likewise "int_fast8" could be "int32" or even "int64" if that is faster.
Why it is faster is a bit of a blur to me, but there seems to be some reasoning behind it (see the OP's links).
That would mean the compiler uses different ints depending on the system's architecture. Fine.
But as Jehan said: this probably means a lot of other things to consider. So there may be places where you use "fast" ints, and then again normal ints, but only if you are on this or that system. This could go in an external module, but it is probably overkill for the standard library.
Thanks for your re-explanation...
I think the idea of fast types is not that bad -- there are architectures where an 8-byte float operation is faster than a 4-byte one, because the CPU can only do 8-byte ops and has to convert the operand from the 4-byte type to the 8-byte type and back. Was that on the Amiga? I cannot really remember. And there are microcontrollers where one-byte types are slower than larger ones, because internally the registers operate on larger data.
But I see the problem: an eight-byte operation may be faster in the CPU, but 8-byte types occupy more cache, so it is not really clear which type is fastest.