You may have already seen this article.
I did not really read it, but his comparison of C and Nim Fibonacci performance is surprising.
My first thought was: Can that fool have compiled the Nim code in debug mode?
No, nobody can be that stupid, and the compiler tells us when we compile in debug mode. But obviously that guy did not ask himself what reason could exist for Nim to be a factor of 5 slower than C. That question is so obvious. So maybe he is indeed a fool?
And indeed, he did it:
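For context, the benchmark in question is the naive doubly recursive Fibonacci. A minimal Nim sketch (my reconstruction, not the article's exact code):

    # naive doubly recursive Fibonacci -- exponentially many calls
    proc fib(n: int): int =
      if n < 2: n
      else: fib(n - 1) + fib(n - 2)

    echo fib(40)

And the compile modes that matter here:

    nim c fib.nim             # default debug build: runtime checks on, C-level optimization off
    nim c -d:release fib.nim  # optimizations on
    nim c -d:danger fib.nim   # optimizations on, all runtime checks removed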
Independent of whether the article is good or bad, we don't need to call the author stupid or a fool.
Also, we should be happy people outside the main community take note of Nim and write articles about it. If they get something wrong, we should educate and not insult them.
Especially given that as far as I can see, we as a community barely write any articles at all!
"we don't need to call the author stupid or a fool."
Sometimes I try to make a friendly face.
But when I see a fool I call him a fool, and when I see garbage I call it garbage.
The author was told about the debug-mode issue four days ago. And did he care? Not at all. Writing about something when you have no basic understanding of it is not a good idea. For everyone reading his text it is just a waste of time. There is absolutely no value in the content. "Looks like Python". "Uses let and var keywords". Have you seen his macro example? Useless.
What is the result for the reader? Well, there is a tiny language called Nim, which has a syntax similar to Python, is a bit faster than Python, but is still a factor of 5 slower than plain C.
Well, his conclusion that Nim has a tiny community is correct, but that is not news.
Indeed, the article is also very bad promotion for Nim.
FWIW, I copy-pasted his code and ran it with PGO, and the Nim was actually 1.5x faster than the C. I believe this is due to a quirky/sensitive gcc optimization that the very specific layout of the Nim-generated C code allows but the most natural hand-written C does not.
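For reference, such a PGO build with the gcc backend goes roughly like this (a sketch, not his exact commands; fib.nim is a stand-in name, and -fprofile-generate/-fprofile-use are gcc's PGO flags):

    # 1. build an instrumented binary and run it to record a profile
    nim c -d:danger --passC:-fprofile-generate --passL:-fprofile-generate fib.nim
    ./fib
    # 2. rebuild using the recorded profile (-f forces recompilation of the C code)
    nim c -f -d:danger --passC:-fprofile-use --passL:-fprofile-use fib.nim
    ./fib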
Recursive Fibonacci was only ever useful as a function-call overhead benchmark. This optimizer aspect makes it not even useful for that anymore, since the number of recursive calls can change. But this knowledge has not propagated; maybe it needs more specific elaboration. I'd bet the guys on the gcc team "just know", though.
The topic has come up many times - someone from a non-compiled world comes to Nim and doesn't know about the default-debug/activate-optimizations style of compiled languages. Or, in this case, seemingly knows about it for C, but not for Nim. It's hard to say what to do about that.
See also https://forum.nim-lang.org/t/4253 / https://github.com/drujensen/fib where Nim compiled via the C++ backend was 8 times faster than plain C, Nim via the C backend, or C++.
Single-threaded Fibonacci is a benchmark of tail-call optimizations and tail-call folding.
Multithreaded Fibonacci is a benchmark for runtime overhead.
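To illustrate the folding part: in fib(n - 1) + fib(n - 2), a recursive call followed only by an addition can be rewritten by the compiler as a loop with an accumulator, roughly halving the number of calls. A sketch of the transformed shape in Nim (my illustration of the idea, not actual compiler output):

    # the second recursive call folded into a loop with an accumulator
    proc fibFolded(n: int): int =
      var n = n
      result = 0
      while n >= 2:
        result += fibFolded(n - 1)  # only the non-foldable call still recurses
        n -= 2                      # the folded call becomes this loop step
      result += n                   # base cases: fib(0) = 0, fib(1) = 1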
For the curious, it matters quite a bit that the benchmark does floating-point arithmetic, because the "big" optimization in question relates to FP vectorization. There are more details in the @xyz32, @aedt, @cblake and @oyster comments in https://forum.nim-lang.org/t/1779 (which is otherwise mostly about D vs Nim). At least 3 other people reproduced that large 5+x performance delta from the optimization/vectorization/fewer-function-calls trick. So it's probably not too hypersensitive to gcc optimization flags, but I do think PGO builds cause this particular optimization to be missed.
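A minimal sketch of the floating-point variant in question (my reconstruction of the shape discussed in that thread, not the exact code):

    # FP recursive Fibonacci; the gcc optimization discussed here reportedly
    # vectorizes several branches of the recursion at once for this shape
    proc fib(x: float): float =
      if x < 2.0: x
      else: fib(x - 1.0) + fib(x - 2.0)

    echo fib(40.0)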
And... UPDATE - Years later, the vectorization/call-elimination optimization still works on gcc-10.1 and still requires the indirect call structure to work (for both my reduced C and, implicitly, the Nim program at the top of the thread). clang-10, AFAIK, still cannot do the optimization. (At least clang -Ofast on that reduced C program remains 10x slower than gcc.)
In my test just now the Nim version took 2x the time of the C specialized to activate the optimization (best guess: float32 in the C vs float64 in the Nim changes the vectorization stride), but both are still 5..10x better than other compilers just because they do so much less work (because the total work of recursive Fibonacci is exponentially sensitive to the number of calls). Why, maybe someone with AVX-512 can get gcc to do some 16x or 32x-way thing that does 36x or 200x less work! If you don't like reading x64 assembly, you can add gcc's -pg profiling and use gprof to confirm how many fewer function calls happen.
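Concretely, something like this works (a sketch; gcc backend assumed, fib.nim is a stand-in name; -pg and gprof are gcc's standard call-graph profiling tools):

    nim c -d:danger --passC:-pg --passL:-pg fib.nim
    ./fib                 # writes gmon.out into the current directory
    gprof ./fib gmon.out  # call counts show how many fib calls actually happened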
It is probably hard to near-impossible to propagate just how fraught with apples-to-oranges peril the Fibonacci benchmark has become (especially to people like that article's author, comparing debug builds to optimized builds).
OK, now it should not be too complicated to add:
Warning: Compiling in debug mode; the compiled result is slow. Compile with -d:danger for performance.
It may even be more useful than the current Warning: Observable stores. (?)
:P
That's true. Would it be too complicated to make it red, and green when it is a release/danger build? It's just about the discoverability of the feature.
I am joking about Observable stores; I wonder how that warning will evolve.
:)
If someone isn't aware that there are compiler options that influence runtime performance, they may not understand the implications of "debug build" vs. "release build" either. ("I don't want to release this, so the 'debug build' should be fine." :-) )
That said, the message that a debug build is generated may help if someone just forgot to use the -d:release or -d:danger option.