I didn't know about Nim until it popped up on Hacker News recently. I think it is a very good language, and it compares well with Julia, another new language that has gotten a lot of attention.
When it comes to language features, I think Nim stacks up very well against Julia for numeric programming. Nim also seems more mature.
I would like to encourage the language authors/maintainers to think about taking Nim in the direction of numeric computing, statistics, and scientific computing.
thx
Not just a matter of libs I think, or there would be little interest in Julia. I agree with gcoles that Nim as a language has great potential there. For one thing, scientific programmers generally like overloaded operators and procs, and aren't likely to want to write numerics code in, say, OCaml (which I generally quite like) where you have '+' for ints and '+.' for floats.
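To make that concrete, here is a toy sketch (not from this thread; `Vec2` is a made-up type) of how one overloaded `+` can serve built-in and user-defined numeric types alike in Nim, with no `+.`-style variants:

```nim
# One `+` symbol for ints, floats, and user types; `Vec2` is hypothetical.
type Vec2 = object
  x, y: float

proc `+`(a, b: Vec2): Vec2 =
  # User-defined overload; the compiler picks it by static type.
  Vec2(x: a.x + b.x, y: a.y + b.y)

let v = Vec2(x: 1.0, y: 2.0) + Vec2(x: 3.0, y: 4.0)
echo 1 + 2        # int +
echo 1.5 + 2.5    # float +, same symbol
echo v.x, " ", v.y
```

This is the property scientific programmers tend to expect, and it is one Nim shares with Julia.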
There was already a brief discussion here http://forum.nim-lang.org/t/589 so there is certainly interest in this.
I'm looking forward to seeing what you come up with, Mason. How are you planning on modeling multidimensional arrays in Nim? Nim arrays have compile-time bounds as part of their type, and seq is a growable vector, which is more than you need; I think you want a type which is not growable but gets its size at runtime. I guess you'll use seqs underneath, right?
Orion, you are right: a large number of scientists just want to explore data interactively, so a Nim REPL would help. One reason R is so popular, in spite of being a horrible language, is that its plotting libraries (ggplot2!) are wonderful. Same for MATLAB. Python is making big inroads there because it's a decent language, even though dynamically typed, and has extensive libraries. Julia's value proposition is supposed to be that you can have an interactive language that's C-fast.
I work more with signal processing than linear algebra, so I'm hoping I can provide a general purpose multidimensional grid library that can be the basis for an algebra library if someone's interested in writing that. The situation would be like NumPy's, only tensors and grids would be compatible, since it's much harder to run out of operators in Nim (* could mean elementwise multiplication and ** could mean tensor multiplication, or vice versa).
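As a sketch of that operator-richness point, here is what an elementwise `*` and a user-defined `**` contraction could look like on plain seqs (toy stand-ins for a real grid type, not a proposed API):

```nim
# Elementwise product; assumes the seqs have equal length.
proc `*`(a, b: seq[float]): seq[float] =
  result = newSeq[float](a.len)
  for i in 0 ..< a.len:
    result[i] = a[i] * b[i]

# `**` is a perfectly legal user-defined operator in Nim; here it is
# a dot product, standing in for a "tensor-style" contraction.
proc `**`(a, b: seq[float]): float =
  for i in 0 ..< a.len:
    result += a[i] * b[i]

echo @[1.0, 2.0] * @[3.0, 4.0]   # elementwise
echo @[1.0, 2.0] ** @[3.0, 4.0]  # contraction
```

Because new operators can be declared freely, the elementwise/tensor distinction doesn't have to consume the few symbols other languages offer.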
I come from 20 years of C/C++/IDL/Python programming in the domain of scientific computing, and after having got really bored with them (particularly C++ and IDL) I started looking at alternatives. So far the two languages that seem most interesting are Julia and Nim: the first one looks promising for doing REPL work and short scripts, while Nim might be a good choice for larger programs, where the ability to have static types checked at compile time helps a lot.
As others have already said, one of the missing parts of the language is some more versatility with arrays (statically sized arrays whose size is decided at runtime - after all, Ada has had them since 1983!) and ranges (e.g. passing a[4..7] to a function expecting an array reference). I don't think the lack of scientific libraries should be a problem: Nim is one of the languages with the easiest FFI I know. Of all the scientific libraries I use, I have ported only CFITSIO so far (http://forum.nimrod-lang.org/t/678), but the experience has been pretty pleasant. I might start writing some bindings to the HDF5 library pretty soon, too: if anybody is interested, I'll report my progress here.
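To illustrate how thin that FFI is, here is the whole ceremony for importing one C function. (`c_sqrt` is a made-up binding name, and `sqrt` from math.h is only a stand-in; wrapping something like CFITSIO uses the same mechanism, just with more declarations and a linker pragma.)

```nim
# A single pragma-annotated declaration is a complete C binding.
proc c_sqrt(x: cdouble): cdouble {.importc: "sqrt", header: "<math.h>".}

echo c_sqrt(2.0)
```

No wrapper generator or glue layer is strictly required, which is why porting a C scientific library is mostly a matter of transcribing its header.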
@mason_mcgill Sounds good. Any idea how to make multidimensional array access in Nim look like it does in other languages? For example, here's a rough cut of a seq-backed multidimensional array with the element type and dimensions encoded in the type (this isn't to be used for a real matrix library!)
import macros

type
  Matrix*[T, D] = ref object
    dims: D
    data: seq[T]

proc makeMatrix*[T, D](init: T, dims: D): Matrix[T, D] =
  ## Allocates a matrix with the given dimensions, filling it with `init`.
  var ndims = len(dims)
  var count = 1
  if ndims > 0:
    for i in 0 .. <ndims:
      count *= dims[i]
  else:
    count = 0
  new(result)
  result.dims = dims
  result.data = newSeq[T](count)
  for i in 0 .. <count:
    result.data[i] = init

proc mapIndexRowMajor[D](dims: D, indices: varargs[int]): int =
  ## Flattens a multidimensional index, last dimension varying fastest.
  result = 0
  for i in 0 .. <len(indices):
    var prod = 1
    for j in (i + 1) .. <len(indices):
      prod *= dims[j]
    result += prod * indices[i]

proc mapIndexColumnMajor[D](dims: D, indices: varargs[int]): int =
  ## Flattens a multidimensional index, first dimension varying fastest.
  result = 0
  for i in 0 .. <len(indices):
    var prod = 1
    for j in 0 .. <i:
      prod *= dims[j]
    result += prod * indices[i]

proc get[T, D](mat: Matrix[T, D], indices: varargs[int]): T =
  result = mat.data[mapIndexRowMajor(mat.dims, indices)]

proc put[T, D](value: T, mat: var Matrix[T, D], indices: varargs[int]) =
  mat.data[mapIndexRowMajor(mat.dims, indices)] = value

macro `[]`*(g: var Matrix, indicesAndValue: varargs[expr]): expr =
  # Forward all bracket arguments to `get`.
  var indicesTuple = newNimNode(nnkArgList)
  for child in indicesAndValue.children:
    indicesTuple.add child
  quote do:
    get(`g`, `indicesTuple`)

macro `[]=`*(g: var Matrix, indicesAndValue: varargs[expr]): stmt =
  # The last bracket argument is the value; the rest are indices.
  let value = indicesAndValue[indicesAndValue.len - 1]
  var indicesTuple = newNimNode(nnkArgList)
  var indices = indicesAndValue
  indices.del(indices.len - 1)
  for child in indices.children:
    indicesTuple.add child
  quote do:
    put(`value`, `g`, `indicesTuple`)
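For anyone puzzling over the index arithmetic, here is the row-major mapping from the code above in isolation, as a self-contained proc (`rowMajorOffset` is my name for it, not part of the library sketch):

```nim
# Row-major flattening: for dims (2, 3), index (i, j) maps to i*3 + j.
proc rowMajorOffset(dims, idx: openArray[int]): int =
  for i in 0 ..< idx.len:
    var prod = 1
    for j in (i + 1) ..< idx.len:
      prod *= dims[j]   # stride = product of the trailing dimensions
    result += prod * idx[i]

echo rowMajorOffset([2, 3], [1, 2])  # 1*3 + 2 = 5
```

The column-major variant is the mirror image: each stride is the product of the *leading* dimensions instead.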
Edit: I fixed embarrassingly buggy code and incorporated @mason_mcgill's suggestions. It's devilishly hard to figure out how to do some things from the docs. Thanks Mason!
@zio_tom78 Thanks for porting CFITSIO, that's a nice example to start from. I'd also be interested in seeing your bindings to HDF5. I was thinking that for Nim numerics binding to some good, widely used C (not C++!) libraries, like PETSc, would be a good start. Any opinions?
C++ libraries might be better to port over, but as you said, there are some areas where the Nim language could be improved, though I'm reluctant to suggest changes until I'm more fluent and have done some ports.
I'd still like to introduce the [. .] brackets for generic parameter lists in cases of ambiguity, but most people are not too fond of the idea.
I like it. There's a bit of resemblance to the pragma notation. I assume we'd only use them when we had to disambiguate the [] in UFCS, right? Who are these "most people" who aren't too fond of this?
Nimrod might be able to offer both ease of programming and speed to those in the scientific computing community.
Of course Julia seems to be really strong in this special area.
+1 for having more idiomatic support for multidimensional arrays in Nim.
com: There have also been a number of criticisms of the Julia benchmarks.
This is interesting; I thought Julia was quite fast. Do you have any specific links in mind, or can you summarize what the concerns are?
It's not that Julia isn't fast (or fast enough for most things), and there's more to a language than speed; it's just that there were some criticisms of their benchmarks. People made the point that the code in the other languages wasn't optimized the way someone normally would (e.g. using NumPy/SciPy in Python). I didn't look into it deeply, and it was a while ago that I looked at Julia; they may have newer benchmarks making such comparisons.
Some talk of it here
https://groups.google.com/forum/#!msg/julia-users/l-KXBX6327M/Cfcc9SDEHnsJ
https://github.com/JuliaLang/julia/issues/2412
There were other sites, but I can't remember what they were.
+1 for scientific programming in Nim
There are layers of functionality/complexity required (as I see it), based on comparison to Python and R:
1. numeric and linear algebra libraries
2. a dataframe mechanism
3. GroupBy on dataframes
4. plotting
5. handling datasets larger than memory
As I see it, the development and provision of libraries is at stage 1 (and is what you are primarily discussing, although NumPy has been mentioned a number of times). There are quite a few linear algebra libraries currently available via nimble.
To provide the equivalent of Numpy requires (among other things) a dataframe mechanism, which is more than hard-wiring your own seq[]. A dataframe needs to easily handle multiple fields of different types (dates, strings, ints, floats, ....), and easily display the data, .....
I have listed the third option (GroupBy) as distinct from a dataframe, because the result of grouping a dataframe is a dataframe on steroids (a superset of a dataframe). So I assume dataframes are made up of sequences, and GroupBy thingies are made up of dataframes.
sequence -> dataframe -> groupby
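A minimal sketch of one way heterogeneous columns could be represented in statically typed Nim: an object variant per column kind. (`ColKind`, `Column`, and `DataFrame` are made-up names for illustration, not an existing library.)

```nim
# Each column carries a tag saying which typed seq it actually holds.
type
  ColKind = enum ckInt, ckFloat, ckString
  Column = object
    name: string
    case kind: ColKind
    of ckInt: ints: seq[int]
    of ckFloat: floats: seq[float]
    of ckString: strs: seq[string]
  DataFrame = seq[Column]

let df: DataFrame = @[
  Column(name: "id", kind: ckInt, ints: @[1, 2, 3]),
  Column(name: "score", kind: ckFloat, floats: @[0.5, 0.7, 0.9])
]
echo df[1].name, ": ", df[1].floats
```

A real design would need date columns, missing values, and display code on top, but the variant approach shows mixed types don't require dynamic typing.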
Point 4 would be good if it were easy to port pyplot to Nim (but it is highly Python-dependent, IIRC). Someone has already provided a library for accessing gnuplot, so that might be an option (although its output is not quite as nice).
Point 5 is for handling "large" datasets. Not many people will take it seriously if the program dies because it couldn't fit the data into memory. It needs to seamlessly page data to disk as an option for analysing large datasets. The Spills library does this, but that functionality would need to be included in the dataframe and groupby functionality.
For example, in Go you can link with BLAS (the Fortran reference implementation), or you can use a pure-Go BLAS. That flexibility is important because some people want the speed and comfort of the old workhorses, but others want to avoid integration problems (missing libraries, etc.).
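For what the "link with BLAS" option would look like from Nim, here is a sketch binding CBLAS's `ddot`. The library name is system-dependent and assumed here (`libcblas.so`), so this will need adjusting to actually run on a given machine:

```nim
# Bind one CBLAS routine; the linker/loader resolves the real symbol.
# "libcblas.so" is an assumed library name and varies by platform.
proc cblas_ddot(n: cint; x: ptr cdouble; incx: cint;
                y: ptr cdouble; incy: cint): cdouble
  {.importc: "cblas_ddot", dynlib: "libcblas.so".}

var a = [1.0, 2.0, 3.0]
var b = [4.0, 5.0, 6.0]
echo cblas_ddot(3, addr a[0], 1, addr b[0], 1)  # 32.0, if the lib loads
```

A pure-Nim BLAS would expose the same procs without the `importc` pragma, giving exactly the choose-your-backend flexibility described above.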