https://kostya.github.io/LangArena/
Nim is only about 80% slower, which is quite good.
I agree that this kind of benchmark is not really meaningful for practical cases, and mainly reflects the different levels of optimization.
Having said that, this benchmark is slightly different from traditional ones. According to its GitHub README, the implementations in the different languages are translated by AI tools from an original version in Crystal. See https://github.com/kostya/LangArena?tab=readme-ov-file#origin--approach
In some sense, this benchmark is fairer than the traditional ones, where different people write the different implementations. Of course, there are still many factors that can affect the implementations in different languages.
We can see Crystal has a higher score, maybe because it is the original, human-written version. Zig got scores quite different from the other systems programming languages, perhaps because it is newer, so there are fewer resources for the AI to exploit. The reason Nim is slower may be similar.
However, languages like C# and Java are also slower than C/C++/Rust. Given that these are popular languages, this might mean GC does indeed have some impact on performance?
I have the impression that recent versions of Nim's GC are pretty close to C++/Rust, so the difference there is interesting.
translated from an original version in Crystal, by AI tools
That is ... terrible. Though maybe it works out if the AI is allowed to use a profiler...
Build commands:
nim c -d:release --out:target/bin_benchmarks_release src/benchmarks.nim
g++ -Ideps/ -Ideps/base64/include -Wl,-rpath,/opt/homebrew/opt/llvm/lib/c++ -L/opt/homebrew/opt/llvm/lib/c++ -L/opt/homebrew/lib -I/opt/homebrew/include/ -O3 -std=c++20 -Ideps/simdjson target/simdjson.o target/libbase64.o main.cpp -o ./target/bin_cpp_run -lgmp -lre2 -lpthread
gcc -Ideps/base64/include -Ideps/yyjson/src -I/usr/include/ -O2 -Ideps/cJSON -L/opt/homebrew/lib -I/opt/homebrew/include/ main.c -o target/bin_c_run -lgmp target/cJSON.o target/libbase64.o -lm -lpcre2-8 -lpthread target/yyjson.o
No -d:danger for nim, gcc is invoked with -O2 and g++ with -O3. What are we measuring again?
Also, methods are used instead of procs.
This just proves that some patterns are workable but not the best.
Well, my point here is not the quality of the AI-translated code, which I believe is problematic.
The author of this benchmark claims that all the code, except for the Crystal version, was translated by AI with minimal tweaking.
We can see C/C++/Rust all get similar scores for runtime performance, with variation < 10%, while D/C#/Java are all roughly 50% slower. These results are quite consistent with my impressions of these languages, so I would say this benchmark does give some information.
Another observation concerns compilation speed and expressiveness, where Nim gets pretty good scores. For incremental compilation time, Go is the best, and Nim is roughly the same as Rust. However, if we look at both cold and incremental compilation, then Go/Nim/Dart/TypeScript are among the best.
For expressiveness, Nim's score is slightly lower than Scala's, and both are much better than the others. This is also consistent with our impression of Nim, right?
Now back to runtime performance: I would assume that Nim, if properly written, should have performance close to C/C++/Rust, and better than C#/D/Java. So the problem is that the AI does not know Nim well, but why? This could simply be because there are fewer Nim resources available for AI training. However, it is also possible that writing optimized Nim code is not simple, so the AI doesn't do a good job here.
Let me give an example. Julia, as a language for scientific computing, can reach performance near C++. However, it is common for newcomers to get performance even worse than Python. This is because Julia is a very flexible language, and you need to write it in a particular way to get the best performance. That is why the Julia community has a page in their documentation called "Performance Tips."
I am not sure what the real cause is, but I think the results of this benchmark are worth thinking about.
Good to know that.
And I just noticed that the author of this benchmark does not seem to specify which AI tool they used. I have opened an issue on their repo to ask about this.
I got a reply from the author of the benchmark: https://github.com/kostya/LangArena/issues/7 He or she said the AI tool is a secret, but assured me that the code quality is good.
At this moment, I cannot say more about this issue.
Well, given the relatively small amount of Nim code around, and assuming the amount of good code influences the AI's output, Nim doesn't do badly at all. It's spiffy, but most of all very "writable" (is that a word?)
To measure is to know, but one needs to know what to measure, and how.
To be clear, my main conclusion, which is also my first impression, is that Nim does well in this AI-translated benchmark. As I said at the beginning, being ~80% slower than C/C++ is pretty good, and its compilation speed and expressiveness rank highly.
The later discussion is about the subtle differences from some other languages, but as pointed out by others in the forum, the quality of the Nim code is not that good. Unfortunately, the author of this benchmark has an unusual attitude toward this issue, so it may be hard to improve the code quality.
If anyone is interested, they can make a better version, which would be far better than arguing with each other. It is clear that AI-generated code is not good for benchmarking, and also that this benchmark does not cover Nim's great C++ backend, which would be much faster.
I think if someone wants, they can make a better version with real benchmarks. It would also be really good if we had benchmarks of Nim's C++ backend (I don't know why people don't take it seriously).
"We can see C/C++/Rust all get similar score for the runtime performance, with variation < 10%"
The benchmark itself shows a different story (but it can't be trusted anyway). The AI, in the analysis section, after making the grandiose claim that "The 'safety tax' is a myth," shows in the next paragraph that unsafe C is ~20% faster than unsafe Rust, and that's not even counting the difference in multithread scaling (for 16 threads: 9x for Rust, 12x for C). The way LLMs love Rust is ridiculous. But the difference seems too big to even trust this part of the benchmark (the rest is even worse).