https://kostya.github.io/LangArena/
Nim is only about 80% slower, which is quite good.
I agree that this kind of benchmark is not really meaningful for practical cases, and mainly reflects the different levels of optimization.
Having said that, this benchmark is slightly different from traditional ones. According to its GitHub README, the implementations in the different languages are translated by AI tools from an original version in Crystal. See https://github.com/kostya/LangArena?tab=readme-ov-file#origin--approach
In some sense, this benchmark is fairer than the traditional ones, where different people write the different implementations. Of course, there are still many factors that can affect the implementations in different languages.
We can see Crystal has a higher score, maybe because it is the original, human-written version. Zig got scores quite different from the other systems programming languages, perhaps because it is newer, so there are fewer resources for the AI to exploit. The reason Nim is slower may be similar.
However, languages like C# and Java are also slower than C/C++/Rust. Given that these are popular languages, this might mean GC does indeed have some impact on performance?
I have the impression that recent versions of Nim's GC are pretty close to C++/Rust, so the difference there is interesting.
translated from an original version in Crystal, by AI tools
That is ... terrible. Though maybe it works out if the AI is allowed to use a profiler...
Build commands:
nim c -d:release --out:target/bin_benchmarks_release src/benchmarks.nim
g++ -Ideps/ -Ideps/base64/include -Wl,-rpath,/opt/homebrew/opt/llvm/lib/c++ -L/opt/homebrew/opt/llvm/lib/c++ -L/opt/homebrew/lib -I/opt/homebrew/include/ -O3 -std=c++20 -Ideps/simdjson target/simdjson.o target/libbase64.o main.cpp -o ./target/bin_cpp_run -lgmp -lre2 -lpthread
gcc -Ideps/base64/include -Ideps/yyjson/src -I/usr/include/ -O2 -Ideps/cJSON -L/opt/homebrew/lib -I/opt/homebrew/include/ main.c -o target/bin_c_run -lgmp target/cJSON.o target/libbase64.o -lm -lpcre2-8 -lpthread target/yyjson.o
No -d:danger for nim, gcc is invoked with -O2 and g++ with -O3. What are we measuring again?
Also, methods are used instead of procs.
This just proves that some patterns are workable but not the best.
Well, my point here is not the quality of the AI-translated code, which I believe is problematic.
The author of this benchmark claims that all the code, except for the Crystal version, was translated by AI with minimal tweaking.
We can see C/C++/Rust all get similar scores for runtime performance, with variation < 10%, while D/C#/Java are all roughly 50% slower. These results are quite consistent with my impressions of these languages, so I would say this benchmark does give some information.
Another observation concerns compilation speed and expressiveness, where Nim gets pretty good scores. For incremental compilation time, Go is the best, and Nim is roughly the same as Rust. However, if we look at both cold and incremental compilation, then Go/Nim/Dart/TypeScript are among the best.
For expressiveness, Nim's score is slightly lower than Scala's, and both are much better than the others. This is also consistent with our impression of Nim, right?
Now back to runtime performance: I would assume that Nim, if properly written, should have performance close to C/C++/Rust, and better than C#/D/Java. So the problem is that the AI does not know Nim well, but why? This could simply be because there are fewer Nim resources available for AI training. However, it is also possible that writing optimized Nim code is not simple, so the AI doesn't do a good job here.
Let me give an example. Julia, as a language for scientific computing, can reach performance near C++. However, it is common for newcomers to get performance even worse than Python. This is because Julia is a very flexible language, and you need to write it in a particular way to get the best performance. That is why the Julia community has a page in their documentation called "Performance Tips."
I am not sure what the real cause is, but I think the results of this benchmark are worth thinking about.
Good to know that.
And I just noticed that the author of this benchmark does not seem to specify which AI tool they used. I have opened an issue on their repo to ask about this.
I got a reply from the author of the benchmark: https://github.com/kostya/LangArena/issues/7 He or she said the AI tool is a secret, but assured me that the code quality is good.
At this moment, I cannot say more about this issue.
Well, given the relatively small amount of Nim code around, and assuming the amount of good code influences the AI's output, Nim doesn't do badly at all. It's spiffy, but most of all very "writable" (is that a word?)
To measure is to know, but one needs to know what to measure, and how.
To be clear, my main conclusion, which is also my first impression, is that Nim does well in this AI-translated benchmark. As I said at the beginning, being ~80% slower than C/C++ is pretty good, and its compilation speed and expressiveness rank highly.
The later discussion is about the subtle differences from some other languages, but as pointed out by others in the forum, the quality of the Nim code is not that good. Unfortunately, the author of this benchmark has an unusual attitude toward this issue, so it may be hard to improve the code quality.
If anyone is interested, they can make a better version, which would be far better than arguing with each other. It is clear that AI-generated code is not good for benchmarking, and also that this benchmark does not cover Nim's great C++ backend, which would be much faster.
I think if someone wants, they can make a better version with real benchmarks. It would also be really good if we had benchmarks of Nim's C++ backend (I don't know why people don't take it seriously).
"We can see C/C++/Rust all get similar score for the runtime performance, with variation < 10%"
The benchmark itself shows a different story (but it can't be trusted anyway). The AI, in the analysis section, after making the grandiose claim that "The 'safety tax' is a myth," shows in the next paragraph that unsafe C is ~20% faster than unsafe Rust, and that's not even counting the difference in multithread scaling (for 16 threads: 9x for Rust, 12x for C). The way LLMs love Rust is ridiculous. But the difference seems too big to even trust this part of the benchmark (the rest is even worse).