nimforum mirror - Mojo Language: Similarities/Differences with Nim, Potential Lessons for Adoption

turmoil [2023-05-03T01:42:01+02:00]

I wanted to highlight the Mojo language, as I really think there is some impressive engineering work being done here. It could be a bit early to say, as I personally haven't tried the language yet, and the compiler is not open sourced (but is planned to be).

Here is a link to a HN discussion: https://news.ycombinator.com/item?id=35790367

There are obvious similarities between Nim and Mojo, both in philosophy (one language to rule them all) and in syntax.

My opinion is that we (the community & the compiler devs) should learn from the advancements being made here. I will focus on the "UX" of the language rather than its syntax and semantics (I have not yet gone over those in detail). There are probably things to learn there too, but I think the following features are critical lessons/inspiration for Nim.

Before I highlight these points: I'm not an expert on Nim, I've mostly written toy applications to try it out. So please correct me if I am misrepresenting the language or the direction of its development.

I have marked each of the following points as (=) on par with Mojo's feature, (-) missing from Nim, or (+) currently better supported in Nim, with potential caveats:

(-) REPL and Jupyter integration
- Researchers in Machine Learning/AI like to move fast. Their workflow is to use a REPL to iterate on ideas quickly.
- This development workflow can be applied in other areas, such as for games or interactive applications.
- When I first came to Nim, I was surprised that this feature was not implemented from the get-go. This is one of the main selling points of Lisp: the tooling surrounding REPL-based development. Lisp is one of the languages Nim is explicitly inspired by.
- Hot-code reloading is a feature that tried to solve this. With nlvm's recent PR regarding ORCv2, we could achieve this now with some development cycles allocated there. I wish the LLVM backend was officially supported / merged into the official compiler.
- Zig's approach (yet to be finished) is to write a linker which patches your binary at run-time, with tight coupling between the linker and the compiler backend.
- Another way to achieve this is to compile to some high-level IR, e.g. WASM, and embed a WASM runtime into your application, e.g. using https://github.com/bytecodealliance/wasm-micro-runtime

(=) easy Python interop
- Nimpy pretty much fits this bill
- Lesson: we should take this library more seriously by officially supporting it and making type conversion efficient.

(+) easy C and C++ interop
- Mojo does not support this right now, so I'm giving the + to Nim for now, as you can do this manually. It seems that Mojo will overtake Nim here, possibly when they open-source.
- In Nim this is a manual process. c2nim's approach for C code is good, assuming the underlying C API is not a moving target, but sometimes you just want to use a library for which no good bindings exist.

(-) GPU support via MLIR and LLVM.
- These pieces of tech seem to be a very critical component of Mojo
- Having LLVM and/or MLIR output generation should be prioritized for the future of compute (to enable custom hardware)
- As a side note, PyTorch 2.0 recently announced its torch.compile feature, which generates CUDA / C++ code on the fly and JIT compiles these kernels for Machine Learning

My personal opinion is that Nim is a good language, but I really would like to see more advancements in these areas.

Nim at the moment has one big selling point: deployment of your application (small binaries) and mixing output targets. Theoretically one could compile their application to:

JavaScript

x86, ARM (via C/C++/Obj-C backends or LLVM)

WASM via emscripten (why isn't this an official backend?)

All in one binary. Though to be honest, it's not clear to me how to create a mixed-target output with a build script in Nim. Is it possible to compile your application to both JavaScript and WASM and have those pieces interact? I assume so, but easy-to-use tooling around this, with examples from the community, would be a big selling point.

With that direction and with (3), if we took the macOS/iOS ecosystem more seriously, we could integrate a @cImport / #include into the compiler that could include C, C++ and Objective-C, with C++ being less of a priority IMO. Obj-C to C (or C++) translation can theoretically be done, which is something I have been exploring with libclang, but I just have not had the time to work on it.

I see that Nim hits a lot of the features I want in a language, I just wish the "UX" of the language was a bit better. Hopefully Nim can learn from Mojo and obtain more adoption.

I would love to know your thoughts, thanks for taking the time to read this. I realize some of this was a bit of a ramble, so apologies for that.

giaco [2023-05-03T02:38:13+02:00]

just my 2 cents:

for bad looking but perfectly working C auto wrapping you can try futhark

Nim is a compiled language and REPL is not really something that fits well here. It's still possible but you won't turn a square into a circle anyway. You can already compile Nim in jupyter cells and export the function as a native python module for the running kernel, which is pretty sweet.

Nim is quite near to incremental compilation (next step after 2.0 afaik), and that should take us near to Hot Reloading too I suppose (can't wait!)

nimpy is already "type conversion efficient", what do you mean?

The PyTorch community (which is "just" a lib) is larger than the whole Nim community, and moves a lot of money. You can't expect Nim to stay up to speed with it. On the other hand, compiling models into efficient kernels is something that fits quite well into Nim's natural capabilities, but you need tons of brain power and testing to make this real. exprgrad and Arraymancer are waiting for PRs from skilled people capable of doing so.

turmoil [2023-05-03T03:06:18+02:00]

> for bad looking but perfectly working C auto wrapping you can try futhark

My point is, this feature is not officially supported by the compiler team or the language itself. Futhark is not as easy to use as Zig's @cImport or a C++ #include.

> Nim is a compiled language and REPL is not really something that fits well here. It's still possible but you won't turn a square into a circle anyway. You can already compile Nim in jupyter cells and export the function as a native python module for the running kernel, which is pretty sweet.

Mojo is a compiled language. Julia is a compiled language. These languages have REPLs. This feature is not exclusive to interpreted or VM-based languages. Julia (and I assume Mojo probably) implemented this with LLVM's ORC, hence my reference to nlvm, as they recently announced ORCv2 support, which should enable a REPL implementation.

> nimpy is already "type conversion efficient", what do you mean?

Ok, apologies here, I am not knowledgeable about whether it is efficient or not. I think my point is to ensure there is no copying of data, e.g. given a np.ndarray, whether we need to copy it to have the data available in "nim-land". The same applies to torch.tensor etc., e.g. using the dlpack interface to perform efficient moves of data with these libraries.
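To make the "no copying" point concrete, here is a minimal Python-only sketch of sharing versus copying a buffer. It uses only the standard library: array and memoryview stand in for np.ndarray and for a consumer on the other side of the boundary. It is not nimpy's or DLPack's actual API, just the buffer-sharing idea those interfaces build on.

```python
from array import array

# A contiguous buffer of C doubles, standing in for an ndarray's storage.
data = array("d", [1.0, 2.0, 3.0, 4.0])

# Zero-copy: a memoryview shares the underlying memory with `data`.
view = memoryview(data)

# Copy: a new array duplicates the bytes, so later writes are not shared.
copied = array("d", data)

view[0] = 100.0   # write through the shared view

print(data[0])    # 100.0 -> the original sees the write
print(copied[0])  # 1.0   -> the copy does not
```

The question for any Nim/Python bridge is which of these two behaviours you get when an array crosses the boundary.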

> Pytorch community (which is "just" a lib) is larger than whole Nim community, and moves a lot of money. You can't expect Nim to stay up to speed with it

I am not trying to suggest Nim should already be up to speed with PyTorch or Mojo. I am just trying to show motivating examples of what other people in the industry are doing w.r.t Machine Learning / AI

turmoil [2023-05-03T03:19:20+02:00]

One benefit you get with a REPL, or specifically a JIT, which Mojo appears to exploit in their keynote (https://www.modular.com/), is that they can dynamically "autotune" functions to perform faster on the hardware the code is being executed on (i.e. search for an optimal choice for the tile_factor constant, as they demonstrate at 38:00). To me this seems to be achieved or achievable with LLVM's ORCv2 API.
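As a rough illustration of what "autotuning" means here, the sketch below times a few candidate tile sizes on the current machine and keeps the fastest. It is plain Python, not Mojo's autotune feature; blocked_sum, measure and autotune are hypothetical names made up for this example, and the candidate list is arbitrary.

```python
import time

def blocked_sum(values, tile_factor):
    """Sum `values` in chunks of `tile_factor` elements."""
    total = 0
    for start in range(0, len(values), tile_factor):
        total += sum(values[start:start + tile_factor])
    return total

def measure(func, *args):
    """Return the wall-clock time of one call to func(*args)."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start

def autotune(candidates, cost):
    """Return the candidate with the lowest cost (first one wins ties)."""
    return min(candidates, key=cost)

values = list(range(200_000))
best = autotune([16, 64, 256, 1024],
                lambda t: measure(blocked_sum, values, t))
print("chosen tile_factor:", best)
```

The chosen value depends on the machine and on timing noise, which is the point: the search has to happen where the code actually runs, so it needs a compiler or JIT available at run-time.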

Obviously LLVM ORC does not have to be used to implement a REPL or JIT. One could consider dynamically patching your binary with the code you wish to execute at run-time (which may be how LLVM ORC works under the hood). This is the approach Zig is taking for hot-reloading, which theoretically will enable them to implement a JIT (ship the compiler with the binary) and a REPL.

Alternatively, the easiest approach to a REPL would be to compile Nim to a VM-based language, specifically JavaScript or WASM, and embed the interpreter. A REPL over NimScript could also be used. However, this VM-based approach has the disadvantage of not being able to generate and execute native x86/ARM/etc. code directly, only through a run-time. This might be a good approach if e.g. LLVM output + ORC is relatively slow for the code you want to execute.

alexeypetrushin [2023-05-03T06:36:34+02:00]

> Nim at the moment has one big selling point: deployment of your application (small binaries) and mixing output targets.

Hmmm, I disagree. Language is a tool for the weak human brain to deal with complex abstractions and data. It's a tool for humans to improve thinking. Computers don't need language, it's humans who need it. So the most important feature of a language is that it fits well with the weak human brain. And so the most important feature of Nim is multiple dispatch and code analysis: allowing you to structure code in a way close to how humans think, and helping to validate it and find problems. All those "Python, C integrations", "deployment targets, WASM", "zero overhead" and any other fancy keywords are nice, but they are minor add-ons. And with AI/ML all those technical nuances are going to matter even less.

termer [2023-05-03T06:36:42+02:00]

If you're looking for a REPL, there's inim which is pretty adequate. It's not the same sort of REPL as you'd expect with a LISP lang where you can do everything in it, but it's more like the Python or Node.js REPL where you can write statements and evaluate them, keeping all of the previous code you wrote in the session active. I used it quite a bit when I was a novice in the language and needed to try things out to see how they worked. Every once in a while I still use it to test a self-contained proc or something.

It's relatively fast, using tcc under the hood for compilation, but it's not going to be anywhere near as fast as an interpreted or partially interpreted language. Its performance is similar to or slightly worse than Java's JShell REPL, but it's slightly more pleasant given that it has colors and the cursor doesn't glitch around as much.

Chronos [2023-05-03T09:57:57+02:00]

Afaik an LLVM backend is neither official nor planned, because you can use clang or even Zig CC. Automatic wrapping of C would require either clang (Futhark uses clang to do this) or a fully working C compiler reimplementation (and for C++, a C++ compiler, which is even harder than C). It's the same reason there's no official WASM backend: emscripten exists (even if WASM would be interesting, an LLVM backend would solve this too, which, again, can be done with clang).

turmoil [2023-05-03T19:15:51+02:00]

> If you're looking for a REPL, there's inim which is pretty adequate. It's not the same sort of REPL as you'd expect with a LISP lang where you can do everything in it, but it's more like the Python or Node.js REPL where you can write statements and evaluate them

It's not the same as a Python REPL. A Python REPL is similar to a LISP REPL. You can redefine methods.
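A small sketch of what redefinition looks like, using the standard library's code module, which emulates the interactive interpreter. The area function and its bodies are made up for this example.

```python
import code

# An interpreter session that keeps its namespace between inputs.
interp = code.InteractiveInterpreter()

# First definition, as if typed at the prompt.
interp.runsource("def area(r):\n    return 3 * r * r\n", symbol="exec")
first = interp.locals["area"](2)

# Redefining the same name simply rebinds it; nothing else is re-run.
interp.runsource("def area(r):\n    return 3.14159 * r * r\n", symbol="exec")
second = interp.locals["area"](2)

print(first, second)
```

The second definition replaces the first in the live session, and any state already built up in the session is left untouched.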

It's sad to say, but the implementation of inim seems to be a proof of concept / hack. Don't misinterpret me, I'm trying to provide constructive feedback here. In the context of learning the language, I agree that it's helpful: you can dump out code and iterate/learn. But for exploring ideas and iterating on them, inim is not as useful, at least not compared to ipython or Julia (comparing to Julia is a bit unfair, they designed their language around that).

The implementation seems to work like this: as you type into the REPL, an underlying virtual source file is appended to, and the compiler recompiles your code and runs it. Hence if you (1) redefine a proc, you get an error, and if you (2) provide a computationally expensive line to the REPL (e.g. one that takes on the order of seconds to minutes to complete), this line will be re-evaluated every time you input into the REPL. This can be demonstrated simply via:

proc calc(a: int64, b: int64): int64 =
  result = 0
  for i in a..b:
    result += i

var x = calc(1, 2000000000)
var y = calc(1, 2000000000)

x += y
echo x

If you run those line by line, then after var x and var y are defined, subsequent lines sent to the REPL are comparatively slower.
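The cost difference can be modelled in a few lines of Python. This is only my reading of the behaviour, not inim's actual implementation: the "replay" model re-runs the whole accumulated buffer on every input, while the "persistent" model runs only the new line in a namespace that is kept. The calls counter and make_calc are hypothetical helpers, and the range is shrunk so it finishes instantly.

```python
calls = {"replay": 0, "persistent": 0}

def make_calc(mode):
    def calc(a, b):
        calls[mode] += 1              # count how often the expensive proc runs
        return sum(range(a, b + 1))
    return calc

lines = [
    "x = calc(1, 2000)",
    "y = calc(1, 2000)",
    "x += y",
    "result = x",
]

# Replay model: every new input re-runs everything typed so far.
buffer = []
for line in lines:
    buffer.append(line)
    replay_ns = {"calc": make_calc("replay")}
    exec("\n".join(buffer), replay_ns)

# Persistent model: only the new input runs, state is kept between inputs.
persistent_ns = {"calc": make_calc("persistent")}
for line in lines:
    exec(line, persistent_ns)

print(calls)  # {'replay': 7, 'persistent': 2}
```

Both models end with the same result, but the replay model pays for calc again on every later input, which matches the slowdown described above.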

A proper REPL needs to be integrated into the compiler, or at least use the compiler's API. It needs to be able to append a diff of the target output to its own binary or the binary it's executing. It should allow you to redefine procedures. You can't implement these features without the suggestions I made above (at least not to my knowledge). Hot-code reloading is/was a promising alternative to my suggestions, but that feature simply doesn't work anymore (if anyone can get it working, let me know).

didlybom [2023-05-04T09:00:06+02:00]

For what it's worth, I'd love to have a proper Nim REPL. inim is nice for what it is, but it is pretty slow and limited in my experience. Maybe when (if?) incremental compilation arrives it will make it possible?

I had a look at Mojo and it is quite promising if they can deliver on their goals. There are a few things that seem a bit inspired by Nim (and they even mention Nim in their announcement). I would miss UFCS so much though, it's a pity they did not copy that! (Which makes sense, since one of their main goals is to stay pretty close to Python's syntax.)

The integration with MLIR seems particularly interesting…

shirleyquirk [2023-05-04T10:18:27+02:00]

in inim you can edit functions, you drop down into editor mode and you can see the whole file. I don't use this because it's not useful at all. Neither is the python repl. Redefining multiline functions after a typo in the python repl is annoying as all get out.

The Jupyter notebook model, however, is such a perfect impedance match for fast development, I long for it in Nim.

What I use instead, is a :! nim c -r % shortcut in vim.

didlybom [2023-05-04T11:36:23+02:00]

Yeah, when I said that I miss a REPL in Nim I was really thinking of a Nim Jupyter notebook kernel (but having a command-line REPL as well would be nice too).

junelac [2023-05-04T12:35:55+02:00]

I'm not very convinced by a language for improving Python that has mandatory manual memory management. I think the best bet for a "new" language to succeed is to be the glue between the big ecosystems (C, C++, Python, JavaScript, maybe also Java, C# and Rust).

This language must be much faster than high-level languages and easier than low-level languages but, even more importantly, easily interoperable with the big ecosystems.

Nim is well placed there: it satisfies the criteria of ease of use and good performance, and probably has the best system for interoperability with its C and JavaScript backends. However, the interoperability story is still too weak, and I think focusing more on it could be the way to success for Nim.

I don't know if it's possible (probably not), but the killer feature would be for language 1 => Nim => language 2 to be much easier than language 1 => language 2, where languages 1 and 2 are at minimum C, C++, Python and JavaScript.

And while the REPL is nice for data exploration or code experiment, I think a good development experience (IDE support, debugging) and even more a good deployment experience is much more important right now.

federico3 [2023-05-04T12:39:07+02:00]

Mojo sounds promising: there's a massive data science/AI community aching for a faster Python with a gentle adoption curve.

For example, the fn functions seem to go in that direction (https://docs.modular.com/mojo/programming-manual.html#fn-definitions), where people first port over regular Python code and then convert and optimize it as needed.

sls1005 [2023-05-05T13:45:19+02:00]

My understanding is that having dynamic features like a REPL comes at the cost of some static features. For example, Julia (currently) cannot be used to generate a standalone binary, despite being a compiled language. Thus I would give a (+) to Nim for being able to generate a binary executable with a small runtime.

didlybom [2023-05-06T00:19:36+02:00]

I believe that Mojo has a repl (and it even has a Jupyter kernel) and that it creates (supposedly) small, stand-alone compiled executables.

ElegantBeef [2023-05-06T01:00:10+02:00]

The major issue I see here is that Mojo's proposed memory/type model is confusing to anyone who does not do systems programming. They want to use manual move semantics, so fn doThing(owned a: MyType) requires you to call doThing(myData^) even though we annotated owned. Not only this, but you have to either implement hooks yourself or use annotations. For instance, the following type is invalid Mojo:


struct MyType:
  var x: int
  var y: int

Since there is no __init__ (with optional __copyinit__ or __moveinit__), it's an uninstantiable type. One needs to either declare those manually or use @value or another annotation:


@value
struct MyType:
  var x: int
  var y: int

For a language designed for Python programmers, they really introduced quite complex programming patterns. In my view they really should've copied Nim's move semantics, as they're very programmer friendly.

Mirror of forum.nim-lang.org

10159 :: Mojo Language: Similarities/Differences with Nim, Potential Lessons for Adoption