I wrote some code to automatically PGO the Nim compiler quite a long time ago, but never got anything useful out of it until now :)
If you don't know, PGO (profile-guided optimization) lets you profile a program on real data and then feed that profile to the compiler (the C compiler in our case) so it can decide how to optimize the code based on real program usage.
So, here's a small post made with nimib that shows the difference in compiler timings (with --compileOnly, so there is no C compilation step) between compiler builds made with GCC/Clang using release, danger, release + lto, danger + lto, and danger + lto + pgo (didn't bother to make release + lto + pgo, sorry).
Read it here; the source is in my nim-snippets repo. That folder also has all the other files used to actually compile the compiler binaries and benchmark them.
If you don't want to read the full post, the TL;DR is (YMMV):
Funnily enough, the graphs are mostly consistent, with speed-up factors being constant regardless of the project (even if it wasn't in the profiling data).
I think a PGO'd compiler might be useful if you have a project big enough that even the Nim stage takes a long time, or if you want your development iteration cycle to be as fast as possible.
If you want to PGO your own program (which might be much more useful than PGO'ing the compiler), see the compile_pgo.nim file in the repo or read https://forum.nim-lang.org/t/6295 (simpler but more manual process).
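For reference, the manual two-pass process with GCC looks roughly like this. It's only a sketch in Python (not the compile_pgo.nim script from the repo) that builds the three command lines; `app.nim`, `app_instr`, `app` and the `pgo_cache` directory are made-up placeholder names. The Nim switches (`--cc`, `--passC`, `--passL`, `--nimcache`, `--forceBuild`, `-o`) and the GCC flags (`-fprofile-generate`, `-fprofile-use`) are real, but I'm assuming GCC writes its `.gcda` profile files next to the object files in the nimcache directory, which is why both passes pin the same `--nimcache`.

```python
# Sketch only: command lines for a two-pass GCC PGO build of a Nim program.
# "app.nim", "app_instr", "app" and "pgo_cache" are hypothetical placeholders.

def pgo_commands(source="app.nim", cache="pgo_cache"):
    # Shared options: same C compiler and same nimcache for both passes,
    # and --forceBuild so the C files are recompiled in the second pass.
    common = ["nim", "c", "-d:release", "--cc:gcc", "--forceBuild",
              f"--nimcache:{cache}"]

    # Pass 1: instrumented build. The flag also goes to the linker so the
    # profiling runtime gets linked in.
    instrumented = common + ["--passC:-fprofile-generate",
                             "--passL:-fprofile-generate",
                             "-o:app_instr", source]

    # Training run: execute the instrumented binary on a representative
    # workload (add your own arguments here) so it writes the profile data.
    training_run = ["./app_instr"]

    # Pass 2: rebuild, letting GCC use the collected profile.
    optimized = common + ["--passC:-fprofile-use", "-o:app", source]

    return [instrumented, training_run, optimized]


if __name__ == "__main__":
    for cmd in pgo_commands():
        print(" ".join(cmd))
```

Clang uses different flags (`-fprofile-instr-generate`, then `llvm-profdata merge`, then `-fprofile-instr-use=<file>`), so the sketch above would need adjusting for it.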
And regarding LTO - just always use it for release builds unless it makes your program slower or creates weird bugs - it's free performance for your users.
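Since the advice is "use LTO unless it makes things slower", it's worth building both variants and comparing them. A minimal sketch that just produces the two command lines, assuming a GCC- or Clang-style backend where `-flto` has to be passed to both the compile and the link step; `app.nim` and the output names are placeholders:

```python
# Sketch only: release build commands with and without LTO, for comparison.
# "app.nim", "app_plain" and "app_lto" are hypothetical placeholders.

def lto_comparison_commands(source="app.nim"):
    base = ["nim", "c", "-d:release"]
    plain = base + ["-o:app_plain", source]
    # -flto must reach both the C compiler and the linker.
    lto = base + ["--passC:-flto", "--passL:-flto", "-o:app_lto", source]
    return plain, lto


if __name__ == "__main__":
    for cmd in lto_comparison_commands():
        print(" ".join(cmd))
```

Benchmark both binaries on your own workload before deciding which one to ship.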
Interesting - I wonder if the results hold up on slightly larger projects as well!
Compiling Nimbus (https://github.com/status-im/nimbus-eth2/) is now creeping up into the minutes range, with most of the time being spent in the garbage collector (see https://forum.nim-lang.org/t/8267#53240). This is likely to be different in some ways for ORC (those traces were done with 1.2, and the compiler has gotten significantly slower since then - the 1.2 -> 1.6 upgrade added quite a bit of compile time, not sure what devel does), but some of the effects are due to the compiler allocating and deallocating lots and lots of small objects.
Update: I tried PGO'ing the Status fork of the Nim compiler, which I think is based on 1.6.10 with a few custom commits, so it uses refc. Judging by the results, a compiler built with refc (which is the case for Nim 1.6.x) seems to benefit much less from PGO - the Nim compilation part for the compiler itself takes 6 seconds with the default compiler, and 5 seconds with the Clang PGO'd compiler built with danger.
Compare that with devel, where the normal compiler built with gcc and release takes the same 6 seconds, but the PGO'd compiler takes 3.5 seconds!
So, the conclusion is simple: ARC/ORC are just much, much better suited to optimizations like PGO, I guess I should've known that :)
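To put numbers on that, here are the speed-up factors for the timings above, computed in a few lines of Python. The inputs are the rounded figures from this post (6 s vs 5 s for the refc-based fork, 6 s vs 3.5 s for devel), so treat the results as approximate:

```python
# Speed-up arithmetic for the rounded timings quoted in this post.

def speedup(baseline_s, optimized_s):
    """How many times faster the optimized build is."""
    return baseline_s / optimized_s

def time_saved_percent(baseline_s, optimized_s):
    """Share of the baseline wall time that was removed, in percent."""
    return (baseline_s - optimized_s) / baseline_s * 100


if __name__ == "__main__":
    for name, base, pgo in [("refc (1.6.x fork)", 6.0, 5.0),
                            ("devel (ARC/ORC)", 6.0, 3.5)]:
        print(f"{name}: {speedup(base, pgo):.2f}x faster, "
              f"{time_saved_percent(base, pgo):.1f}% less time")
    # refc (1.6.x fork): 1.20x faster, 16.7% less time
    # devel (ARC/ORC): 1.71x faster, 41.7% less time
```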
Regarding Nimbus - the PGO'd version of the Status compiler fork actually takes longer than the default one ¯\_(ツ)_/¯.
Hello,
all these links are dead and I cannot access the linked forum post https://forum.nim-lang.org/t/6295 !
How can I compile with PGO support?
Not sure what happened to Yardanico and his GitHub account (hoping all is well).
Somebody saved his repos, so you can find PGO files here: https://github.com/nimbackup/nim-snippets/tree/master/pgo
Has the 6295 forum post been moderated? The error message is not explicit about it.
By the way, I googled Yardanico's full name in both Latin and Cyrillic and got absolutely no results. Wish him all the best.