While looking for different benchmarks, I found this one. It's supposed to be faster than Julia for many operations, though it doesn't have as many libraries:
https://gist.github.com/sdwfrost/7c660322c6c33961297a826df4cbc30d
julia_nim_cpp_r_sir.md: This gist compares the performance of Julia, Nim, C++ and R (the latter using either POMP or LibBi) in a simple simulation of an SIR epidemiological model. In addition to keeping track of susceptibles, infecteds, and recovereds, I also store the cumulative number of infections. Time moves in discrete steps, and the algorithm avoids language-specific syntax features to make the comparison as fair as possible, including using the same algorithm for generating binomial random numbers and the same random number generator. The exceptions are the R versions: POMP uses the standard R Mersenne Twister for the random number generator; I'm not sure what LibBi uses. The algorithm for generating random binomial numbers is only really suitable for small np.
Benchmarks were run on a Mac Pro (Late 2013), with a 3 GHz 8-core Intel Xeon E3, 64 GB 1866 MHz RAM, running OS X v10.11.3 (El Capitan), using Julia v0.6.1, Nim v0.17.3, clang v5.0.0, emscripten v1.37.34, node v8.9.1, and R v3.3.2.
I think it's great :)
I've modified your post to get rid of that gist you've embedded (a link is enough)
performance usually doesn't matter, and if it matters, is not because of the language, but because of the code written
I don't know why you write this statement -- it is obviously wrong, and I am sure you know that it is wrong.
Languages like Python, Ruby, LabVIEW, Octave, R, Processing and many more are generally very slow, as long as you cannot just call C libs. Try writing a Delaunay triangulation, an R-tree search, or a Dijkstra shortest-path search in one of these languages and show me how fast your smart code executes.
I came to Nim from Ruby some years ago. I was not too unhappy with Ruby, but knowing that my code could be 20 times faster if I rewrote it as a C routine did not make me happy.
Maybe what you wanted to say is that most modern compiled languages like Rust, D, Nim, and Crystal are all very close to C, so that benchmarks between them don't make much sense.
For me performance is everything and any language that is not as fast as hand-tuned C, Fortran or Assembly is a non-starter.
And the whole world of High-Performance Computing and Machine Learning is the same. When you train models for hours (or even days or weeks), a mere 20% speed difference is huge.
Of course it completely depends on the IT field you work in. I have experience working on mastodon apps with teams of more than 20 developers. In that kind of scenario, performance is the smallest problem you will have to handle.
And of course performance is very important, but after years of experience you see tons of people worried exclusively about execution performance, when there are hundreds of other metrics you should keep in mind too. Everybody compares languages based only on execution performance, when in the business field development performance usually has a higher value.
"performance usually doesn't matter, and if it matters, is not because of the language, but because of the code written"
I don't know why you write this statement -- it is obviously wrong, and I am sure you know that it is wrong.
It's not entirely wrong. For many applications, performance doesn't matter. For the applications where it does, algorithms typically matter a lot more than the choice of programming language.
For example, on a sufficiently large list (which probably isn't very large) a custom-written Bubble Sort in the C language (or Nim, or even assembly) will always lose to Python's built-in sort, because TimSort scales far, far better.
Things like this are not that uncommon. This is one reason higher-level languages can outperform hand-coded assembly in some cases: it's easier, and more efficient, to write good, correct algorithms in higher-level languages.
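To make the claim concrete, here is a minimal Python sketch (the helper `bubble_sort` is my own illustration, not from the gist): a hand-written O(n^2) bubble sort against the built-in Timsort on the same data. Even though the bubble sort could be ported line-for-line to C, the asymptotic difference dominates well before the list gets large.

```python
import random
import timeit

def bubble_sort(a):
    """O(n^2) bubble sort: repeatedly swap adjacent out-of-order pairs."""
    a = list(a)
    n = len(a)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:          # already sorted: stop early
            break
    return a

data = [random.randrange(10**6) for _ in range(5000)]
assert bubble_sort(data) == sorted(data)   # same result...

t_bubble = timeit.timeit(lambda: bubble_sort(data), number=1)
t_builtin = timeit.timeit(lambda: sorted(data), number=1)
print(f"bubble: {t_bubble:.3f}s, built-in Timsort: {t_builtin:.4f}s")
```

On any recent machine the built-in sort wins by several orders of magnitude at this size, and the gap only grows with n.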
Some people use C++ for speed and Python for rapid prototyping and glue-logic, but combining the two can be tricky. Nim replaces both. It's not quite as fast as C++ (unless you are very careful about memory) and not quite as agile as Python (imo), but when you need both, Nim is great.
Note that Nim iterators correspond to Python generators, an amazingly useful feature. Combined with templates, it swamps C++ in agility. And macros take it to a whole nother level.
Note that Nim iterators correspond to Python generators…
I would like to agree, but they are less powerful. Generators can be recursive in Python while iterators cannot. And in Python it's easy to use a generator without a for loop, thanks to the next function.
But, despite these limitations, iterators in Nim are very pleasant and easy to use, compared with the way to define them in other languages such as Julia. And, I agree with you, they are amazingly useful.
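The two Python capabilities mentioned above can be sketched in a few lines (the `flatten` helper is my own example name): a generator that recurses via `yield from`, and advancing it manually with `next` instead of a for loop.

```python
def flatten(x):
    """Recursively flatten nested lists -- a recursive generator."""
    for item in x:
        if isinstance(item, list):
            yield from flatten(item)   # the generator calls itself
        else:
            yield item

g = flatten([1, [2, [3, 4]], 5])
print(next(g))     # -> 1; advanced without any for loop
print(list(g))     # -> [2, 3, 4, 5], the rest of the values
```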
Nim iterators can be recursive, this is my iterator on arbitrarily nested arrays or sequences to construct tensors.
iterator flatIter*[T](s: openArray[T]): auto {.noSideEffect.} =
  ## Inline iterator on any-depth seq or array
  ## Returns values in order
  for item in s:
    when item is array|seq:
      for subitem in flatIter(item):
        yield subitem
    else:
      yield item
I do not agree that lazy linear lists are the "main example" of recursion. They may be the simplest example, thus allowing several easy workarounds. I mentioned "trees" right in my comment. Most of the (admittedly subjective) elegance of tree algorithms comes from the way recursive code structure mirrors recursive data structure. lib/pure/collections/critbits.nim:iterator leaves shows some of the ugliness required to work around this, for example.
I agree it may not be a high priority or even in the top 10. I do not think it would be considered a "no real need" extra/extraneous fluff/"feature bloat". I know you didn't say that, exactly, but I felt your post risked leaving that impression.
@cblake:
but I felt your post risked leaving that impression.
No, I didn't intend to leave that impression, and you make your point. My real point is that iterators are an abstraction to make the basic cases, enumeration over a range or a collection of some type, easy. I think they will always be of limited use for more complex cases such as recursion, and if they did have this capability added, they would then risk causing races. There are other ways to implement the effect of recursive iterators for those who really need them, with my lazy linear list example showing one of the ways.
There are always workarounds since CPUs are little state machines. That should be news to no one. Recursion working makes code look much nicer, IMO. For what it's worth, the example considered among the more compelling by the Python guys back in the Python 2.3 days when they added generators was in-order binary search tree traversal:
iterator inorder[T](t: Tree[T]): T =
  if t != nil:
    for x in inorder(t.left): yield x
    yield t.label
    for x in inorder(t.right): yield x
which parallels the usual recursive call structure quite well. I think it would be great if the above Nim actually worked. Equivalent state machines are often ugly, trickier to get right, and a pain point (and yeah, usually faster). But "expressive" has always been a main goal of Nim. :-)

I'm surprised everyone is treating owned and unowned refs as if they were something new. C++ has owned and unowned refs. In C++, they're called "std::unique_ptr" and "dumb pointers." The semantics of the C++ version are almost exactly the same as the semantics of the Nim version; the one difference is that in the newruntime, before you release the owned ref, you're required to manually clear out any dangling unowned refs, and it verifies that you have done that. In C++, you're not required to clear out the unowned refs. But that means that in C++, you can end up with dangling unowned refs.
It would actually be very easy to implement the newruntime's semantics in C++, because C++ smart pointers are very flexible. Anybody could have done this at any time. But anybody considering this would have immediately understood the cost tradeoffs: you pay the performance penalty of maintaining the reference counts, and you pay the price of doing the extra work to clear out the unowned refs. In exchange, any code that would have accessed a dangling ref will access a nil pointer instead - which is better, because it's a caught error. I think it's telling that this is not a common thing to do - perhaps this indicates that most programmers don't consider these tradeoffs worthwhile.
@cblake:
Recursion working makes code look much nicer, IMO.
IMO, too. Before coming to Nim, my previous favourite languages were Elm, F#, and Haskell (in no particular order), so that should tell you what I like.
Equivalent state machines are often ugly, trickier to get right, and a pain point...
Again, agreed. Unfortunately, even if the compiler allowed your desired code, it currently isn't extended to handle this case. Since {.inline.} iterators are just loops "under the covers", they would need rewrite rules that include nested loops (which is close to my desired feature of Tail Call Optimization, TCO). If the iterators were "first class" ({.closure.}) iterators, then they would be doing proc recursion, which again would require TCO so as not to grow the stack. Since Nim doesn't support Tail Call Optimization directly, and C/C++ compilers only MAY do it with full optimization turned on (it isn't guaranteed), if there are any concerns about call-stack depth, one will have to write the UGLY state machine code!
the example considered among the more compelling by the Python guys...
I don't think Python has TCO even now, so the depth of tree recursion could be a problem even with your example snippet, depending on the depth of the Tree.
Oh, I had meant to include {.closure.}. Oops. Fixed.
A stack (of something) is fundamentally required for general non-linear/tree cases. TCO can only help your linear cases (partly why I call them "easy"). Good trees have well bounded depth, but yeah, wildly unbalanced ones can cause trouble for any impl, recursive or not. A non-recursive one may minimize the per-stack entry storage a bit better, though.
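The explicit-stack workaround being discussed can be sketched in Python (the `Node`/`inorder` names are my own illustration): the same in-order traversal as the recursive version above, but the ancestor stack is an ordinary list, so recursion depth limits never apply and the per-entry storage is just a node reference.

```python
class Node:
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right

def inorder(t):
    """In-order traversal with an explicit stack instead of recursion."""
    stack = []
    while stack or t is not None:
        while t is not None:      # descend left, remembering ancestors
            stack.append(t)
            t = t.left
        t = stack.pop()           # visit the deepest unvisited node
        yield t.label
        t = t.right               # then traverse its right subtree

tree = Node(2, Node(1), Node(3))
print(list(inorder(tree)))        # -> [1, 2, 3]
```

It works, and it is fast, but compared with the five-line recursive version the control flow no longer mirrors the data structure, which is exactly the ugliness complained about above.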
@jyelon:
I'm surprised everyone is treating owned and unowned refs as if they were something new. C++ has owned and unowned refs....
I'm afraid you are far behind the curve here; Nim's owned/dangling refs aren't like those of C++ at all. See @Araq's original post on the forum with follow-up discussion, and another follow-up thread.
In short, in his research @Araq found a paper that describes a way of doing this that is not just pure reference counting, but rather more like C++ "unique_ptr", except better, since it can optionally verify correctness with reference counting. It is more like a cross between C++ "unique_ptr" and "shared_ptr", but not the same as either. Since then, we have been working at improving on the work in the original research paper to (hopefully) get something even better than that.
But anybody considering this would have immediately understood the cost tradeoffs: you pay the performance penalty of maintaining the reference counts, and you pay the price of doing the extra work to clear out the unowned refs.
In Nim's implementation, reference counts are optional: something one would turn on for debug/development to make sure the compiler and programmer are doing their job properly, then turn off for release, in which case no reference counts are kept and no checks are made. As to "clearing out unowned refs": if you are referring to dangling refs, they are actually something like C++ "weak_ptr", and thus there is no clearing out to be done; if checks are turned off and the compiler and programmer are mistaken, the program may fail by trying to reference a deallocated pointer. As to clearing out owned refs when they go out of scope, that is just a simple automatic deallocation, as for any pointer when it is deallocated.
Thus, the Nim version of owned/dangling ref's can be zero cost for release/danger mode.
When the ownership model fails as when one needs to have the equivalent of multiple owners of the same data, then it is proposed that one would implement an override that would do a deep copy just as one would implement a "Clone" trait in Rust for this case. When one wants to share the same data such as across threads, then one does the overrides to implement reference counted pointers that are basically the same as C++ "shared_ptr", but the need for those should be fairly rare and when required the overhead of their use will likely be much less than other overheads, such as those related to thread context switching.
@jyelon: A little further comment with respect to the advantages of Nim's owned ref...
I'm surprised everyone is treating owned and unowned refs as if they were something new. C++ has owned and unowned refs....
As explained in my previous post, Nim's `owned ref`s are not the same as C++'s `unique_ptr` or `shared_ptr` but combine the two: the Nim compiler does data-flow analysis, just as the C++ compiler does for `unique_ptr`, to determine when these references go out of scope and can be deallocated. However, Nim's version also has (originally intended to be optional) reference counting, as C++'s `shared_ptr` does, to verify the data-flow analysis and that the compiler/programmer are correct, which C++ does not. Thus, Nim's "dangling" refs are also better than C++'s `weak_ptr` in that in Nim they can be (maybe optionally, eventually) checked to ensure they go out of scope before the owned refs to which they refer.
So in addition to the advantage that these can be (once implemented) optional and then have zero run-time execution cost, what I didn't mention was that they can also prevent cyclic-data memory leaks, which both C++'s `shared_ptr` and Swift's automatic reference counting can suffer from.
In the current implementation, cyclic data causes a crash on the attempt to deallocate an `owned ref` that contains a cycle, which is better than an undetected memory leak.
There is a (currently unimplemented) proposed extension in the original Bacon/Dingle paper that uses a recursive "deepDestroy" technique to destroy cyclic data without data races. This is somewhat costly when there are many deeply nested refs, as they require extra nested destruction passes, but nesting isn't commonly that deep, so the cost should be minimal and only on deallocation. If a version of this were implemented in Nim, the `newruntime` would work much like the current garbage-collected Nim, other than needing a few owned's and unown's injected into the code in a few places, and as things progress, even some of those could possibly be inferred by the compiler.