I have a question that is probably best addressed to Araq.
I have spent a couple of days investigating why my Nim code is somewhat slower than a similar C implementation.
I traced it down to the fact that Nim invokes gcc with the "-fno-strict-aliasing" flag. In my case this reduces the amount of vectorization and optimization that gcc performs. When I recompiled the generated C code without the flag, my Nim code was on par with C.
The question is: is it safe to remove this flag, and why is it there in the first place? If it is actually required, I am willing to help get rid of it, if that is feasible.
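To illustrate the kind of loop where the flag can matter, here is a minimal sketch (not the poster's actual code; `scale_all` is a hypothetical name). With strict aliasing enabled, gcc may assume that a store through a `float *` cannot modify an `int` object, so the loop bound `*n` can be loaded once, which tends to make the loop easier to vectorize. With "-fno-strict-aliasing" it has to allow for `out` overlapping `n`.

```c
#include <stddef.h>

/* Hypothetical example: multiply the first *n elements of out by factor.
   Under strict aliasing the compiler may treat *n as loop-invariant,
   because a float store cannot legally change an int object. */
void scale_all(float *out, const int *n, float factor)
{
    for (int i = 0; i < *n; i++) {
        out[i] *= factor;
    }
}
```

Whether a given gcc version actually vectorizes this differently depends on the optimization level and target, so compare the generated assembly rather than relying on the sketch.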
(target_type *)(void *) instead of (target_type *)
It is not exactly what the standard says, but it works on all the C compilers I have worked with. Would you accept such a pull request, with the -fno-strict-aliasing removal included?
I read a bit about the strict aliasing rule, and it seems that the only code that would be broken by removing "-fno-strict-aliasing" is code that declares two pointers of different types to the same block of memory and accesses the memory through both. I'd say that most programs, especially Nim programs, do not do this.
I see two potential problem areas:
The first potential problem area would be the garbage collector and memory manager. If the garbage collector and memory manager do this, then they must be converted to casting through unions, as Araq suggests.
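As a minimal sketch of what "casting through unions" looks like in C (the name `float_bits` is hypothetical, and this is not code from the Nim runtime): the value is written to one union member and read back through another, which gcc documents as supported even with strict aliasing enabled.

```c
#include <stdint.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer by
   writing one union member and reading another.
   Assumes float is 32 bits wide. */
static uint32_t float_bits(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}

/* The pattern that strict aliasing forbids, shown only for contrast:
       return *(uint32_t *)&f;                                        */
```

On an IEEE-754 platform, `float_bits(1.0f)` yields `0x3F800000`.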
The second potential problem area would be how the Nim compiler generates code for refs and inheritance. If a ref is only used as one type, then there is no problem. If a ref is used as both a parent type and a child type, then there is a potential problem, depending on how the Nim compiler generates the cast for the ref.
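One layout that stays within the C rules is sketched below, assuming the child struct embeds the parent as its first member. The names `Base`, `Derived`, and the helper functions are hypothetical, and this is not a claim about what the Nim compiler actually emits.

```c
#include <stddef.h>

/* Parent type. */
typedef struct Base {
    int kind;
} Base;

/* Child type: the parent must be the FIRST member, so that a pointer
   to Derived and a pointer to its Base part share the same address. */
typedef struct Derived {
    Base   base;
    double payload;
} Derived;

static int kind_of(const Base *b)
{
    return b->kind;
}

/* Upcast without any cast: take the address of the embedded member. */
static int derived_kind(const Derived *d)
{
    return kind_of(&d->base);
}

/* Downcast: only valid if b really points at the Base inside a Derived. */
static double payload_of(const Base *b)
{
    return ((const Derived *)b)->payload;
}
```

The problematic case is the other one: two unrelated struct types that merely happen to start with the same fields, accessed through pointers to each other.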
After these two potential problem areas are evaluated and fixed, "-fno-strict-aliasing" can be removed, as "safe" Nim code would not be able to violate the strict aliasing rule. Nim code that uses "cast" and ptrs in certain ways would still be able to violate the rule, but I would argue that it would be the programmer's responsibility to pass "-fno-strict-aliasing" to the compiler in this case.
Also, in my opinion, violating the strict aliasing rule is a code smell, and so I think we should avoid accommodating bad code with the "-fno-strict-aliasing" flag.
All the major OS kernels (Linux, FreeBSD, and OpenBSD) have strict aliasing disabled, and for good reasons. So do a plethora of other C applications and libraries.
One, it is insanely easy to accidentally violate the strict aliasing rule when writing normal C code by hand, and there is no reliable way to guard against such violations statically. It's an easy source of Heisenbugs with basically no performance payoff outside of rare circumstances (where aliasing does have a performance impact, the pointers usually point to entities of the same type).
This also goes for manually written C code that's being included in Nim directly, such as static inline functions from header files for external functions.
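The same-type case mentioned in the parenthesis above can be sketched as follows (`add_scalar` is a hypothetical name). Both pointers refer to `float`, so the compiler has to assume that a store to `dst[i]` might change `*k`, and it has to do so whether or not strict aliasing is enabled.

```c
/* Hypothetical example: add the scalar *k to each of n elements.
   dst and k have the same element type, so they may legally alias;
   strict aliasing gives the compiler no extra information here. */
void add_scalar(float *dst, const float *k, int n)
{
    for (int i = 0; i < n; i++) {
        dst[i] += *k;
    }
}
```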
Two, even for generated code (such as code emitted by the Nim compiler), it is difficult to avoid undefined behavior in low-level code unless the code generator was designed from the ground up for this. There are a number of reasons for that:
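proc main =
  var a: int64 = 1
  # Two pointers of unrelated types to the same 8 bytes.
  var p: ptr float = cast[ptr float](addr a)
  var q: ptr int32 = cast[ptr int32](addr a)
  p[] = 1.0      # write through the float view
  q[] = 2'i32    # write through the int32 view
  echo a         # read back through the original int64
main()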
This example would require a union of int64, int32, and float, which may be impossible to construct if the code is spread over several functions.
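For the single-function case, such a union could look roughly like the sketch below. The names `Cell` and `pun_demo` are hypothetical, Nim's `float` is taken to be a C `double`, and this is not what the Nim code generator emits. Note that after the store to `i32`, the bytes of the union outside that member are formally unspecified, so only the part that overlaps `i32` can be relied on.

```c
#include <stdint.h>

/* One object with three views, mirroring the Nim example. */
typedef union {
    int64_t i64;
    int32_t i32;
    double  f64;
} Cell;

static int64_t pun_demo(void)
{
    Cell a;
    a.i64 = 1;     /* var a: int64 = 1 */
    a.f64 = 1.0;   /* p[] = 1.0        */
    a.i32 = 2;     /* q[] = 2'i32      */
    return a.i64;  /* echo a           */
}
```

The sketch only works because all three accesses go through the same union object; once `p` and `q` are passed to other functions as plain pointers, that connection is lost, which is the difficulty described above.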
I've even changed -fno-strict-aliasing to "-fstrict-aliasing -Wstrict-aliasing" in build.sh (in csources), and the Nim compiler compiled without any issues!
"Absence of evidence does not mean evidence of absence", as they say. The compiler may just not have triggered the relevant optimizations in this case, or you may not have executed the relevant code. You also have no guarantee that future versions of gcc/clang won't introduce breakage.
I did a little searching and compiling, and I'd have to say Jehan is right about "absence of evidence". See this Stack Overflow thread for an example that looks very simple to people yet is (still in 2017) too complex for the -Wstrict-aliasing heuristics, at least gcc's. Strict aliasing sure seems to be a real morass.
@cdome - perhaps a better solution would be some kind of emit and/or macro machinery that turns on -fstrict-aliasing just for the procs where you need to recover performance. gcc has had a #pragma / __attribute__ mechanism for this for quite a few years now (since 2011, I think). See here.
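On the C side, the per-function approach could be sketched as below. The macro name `HOT_STRICT_ALIASING` and the function `axpy` are hypothetical; the attribute itself is gcc's `optimize` function attribute, which is guarded here because clang does not implement it. gcc's documentation cautions that this attribute is intended for debugging rather than production use, so treat this as an experiment, not a recommendation.

```c
#include <stddef.h>

/* Enable -fstrict-aliasing for selected functions only (gcc). */
#if defined(__GNUC__) && !defined(__clang__)
#  define HOT_STRICT_ALIASING __attribute__((optimize("strict-aliasing")))
#else
#  define HOT_STRICT_ALIASING
#endif

/* Hypothetical hot loop: y[i] += a * x[i] for the first *n elements. */
HOT_STRICT_ALIASING
void axpy(float *y, const float *x, const int *n, float a)
{
    for (int i = 0; i < *n; i++) {
        y[i] += a * x[i];
    }
}
```

The rest of the translation unit would still be compiled with "-fno-strict-aliasing"; how to attach such an annotation to a Nim proc (for example via a pragma that injects it into the generated C) is left open here.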