The latest nlvm, somewhat unusually, comes with a new feature: a REPL built on the already present JIT support.
With the REPL / JIT, you can now run Nim code compiled to native CPU instructions without creating binaries - incrementally or from files.
The feature ties together infrastructure that has been around for some time: LLVM for generating machine code; LTO, which in effect performs on-the-fly machine code generation at link time; and ORC, which replaces the linker with on-the-fly linking that extends all the way to machine code generation.
In interactive REPL mode, nlvm feeds LLVM IR to ORC incrementally and asks for a machine code address in return, which it then calls. It was already doing similar processing to generate the "main" function, so all it needed to do was re-route the IR to ORC instead of the linker, and do so as code becomes available rather than at the end - pretty neat. The initial JIT feature thus becomes a special case of incremental compilation where there is only one increment ;)
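In Nim-flavoured pseudocode, that loop might look roughly like this (a sketch only - parseAndSem, genModule, orc.addModule and orc.lookup are illustrative names, not nlvm's actual API):

```nim
# Hypothetical REPL loop: each input line becomes IR, which ORC turns
# into machine code that we call directly.
while true:
  let line = readLine(stdin)          # read one snippet of Nim code
  let ast = parseAndSem(line)         # upstream Nim: parse + semantic check
  let ir = genModule(ast)             # nlvm: checked AST -> LLVM IR
  orc.addModule(ir)                   # hand the IR to ORC instead of the linker
  let fnAddr = orc.lookup("snippet")  # ask ORC for a machine code address
  cast[proc() {.cdecl.}](fnAddr)()    # call the freshly JITed code
```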
The REPL input prompt is kind of basic, just like the nim secret command that it is based on.
$ nlvm r
>>> 1+1
2: int literal(2)
>>> proc fact(v: int): int =
... if v<=1: 1 else: fact(v-1)*v
...
>>> fact(5)
120: int
>>> import math
>>> TAU
6.283185307179586: float
>>> sin(TAU) * fact(10)
stdin(6, 10) Error: type mismatch: got <float64, int>
but expected one of:
proc `*`(x, y: float): float
first type mismatch at position: 2
required type for y: float
but expression 'fact(10)' is of type: int
proc `*`(x, y: float32): float32
first type mismatch at position: 2
required type for y: float32
but expression 'fact(10)' is of type: int
11 other mismatching symbols have been suppressed; compile with --showAllMismatches:on to see them
expression: sin(6.283185307179586) * fact(10)
The point about special cases in compilation is interesting: compiling a full program to a static binary can be seen as a special case of a more incremental approach. In an ideal world, nlvm's pipeline would be incremental and integrated with each step of compilation. Right now it is unnecessarily inefficient in that the Nim VM is used to execute compile-time code while the JIT is used for incremental runtime execution, when LLVM/ORC could easily be used for both, significantly speeding up compilation by reducing the number of transformations, redundant components and steps.
Here's how it would work:
As Nim (the semantic part of the upstream compiler) compiles bits and pieces of Nim code, it passes them to nlvm for LLVM IR generation, piece by piece. When Nim needs to run something at compile time (for example, to compute a constant), the IR is compiled to machine code on the fly by ORC and executed as native instructions (replacing the built-in Nim VM), and the result is handed back to Nim, which continues its semantic analysis. Both the IR and the result of the computation are cached (in memory or on disk), and compilation moves on to the next snippet, repeating the process (generate IR, turn it into machine code, use the machine code for compile-time execution, cache the outcome, etc).
In each such step, we have full access to LLVM - optimisers, inliners, PGO stats collectors etc - so we can collect data during compilation on which parts make sense to cache and at what level of detail. For the disk cache, for example, we can measure how long something takes to compile and/or how often it is used during compilation, and write out only the useful parts. The "compile" is done when we reach a stable state, i.e. when there are no more snippets that need to be turned into machine code.
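The compile-time execution step described above could be sketched as follows - again purely illustrative, with every identifier (evalConst, genIR, orcCompile, cache) being a hypothetical name rather than a real nlvm API:

```nim
# Sketch: run a compile-time computation natively via ORC instead of
# interpreting it in the Nim VM, caching both IR and result.
proc evalConst(snippet: PNode): Value =
  let key = render(snippet)    # stable key for the cache lookup
  if key in cache:             # earlier result in memory or on disk?
    return cache[key]
  let ir = genIR(snippet)      # Nim AST -> LLVM IR (with optimisers,
                               # inliners, PGO collectors available)
  let fn = orcCompile(ir)      # LLVM IR -> machine code via ORC
  result = fn()                # execute with native instructions
  cache[key] = result          # cache the outcome; the "compile" is done
                               # when no snippets remain to be compiled
```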
This iterative approach would replace the traditional, "linear" pipeline, where Nim does its semantic analysis, then passes the result to nlvm, which generates IR, then optimizes, then generates machine code, then links (nlvm already does LTO by default, which inverts the last two steps, but that's a detail ..).
Anyway, one step at a time - here's the REPL code: https://github.com/arnetheduck/nlvm/pull/60
Just a friendly reminder for everyone who would try this on *nix systems:
You can get command history and left/right arrow navigation with rlwrap:
$ rlwrap nlvm r
Cool stuff! I'm currently playing around with it and compiled it myself. I wrote this on Discord as well, but it might be better placed in this forum: it seems that the nim.cfg file under nlvm/Nim/config/nim.cfg may cause problems if the user previously had Nim 2.0 installed with packages.
nimblepath="$home/.nimble/pkgs2/" is the problematic line. Compilation works without issues while it is commented out; with it enabled, compilation fails, complaining that the serialization-0.2.0-<hash> library has a bad package name.
What initially surprised me is that the "REPL mode" allows reassigning variables:
~/dev/repl/nlvm/nlvm % ./nlvm r
>>> let x = 3
>>> let x = "lala"
>>> let x = "lele"
>>> x
lele: string
As someone who has only ever used REPLs and never thought too deeply about them and their conventions: is this a general REPL behaviour thingy that the LLVM stuff enables?
Why not invest more in it? Is this crazy?
No, it's not. And LLVM can also give us a better debugging experience.
Are NIR and nlvm complementary?
In my opinion, yes.