Hi,
Can we remove the mapIt template in favor of mySeq.map(x=>x+1)? Let's discuss. There are many points here; feel free to question all of them.
I believe that map(x=>x+1) is more readable and easier for a newcomer to understand and remember. The map(x=>x+1) form does not work right now (the first glitch in this proposal), but it is doable, either by fixing auto or by modifying => to use generics instead of auto.
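For readers comparing the two spellings, here is a minimal sketch of both side by side. It assumes a recent Nim release in which => is provided by std/sugar and can infer the parameter type; on the compiler discussed in this thread the first form did not compile yet. The variable names are only illustrative.

```nim
import std/[sequtils, sugar]

let mySeq = @[1, 2, 3]

# Proc form: the lambda is an anonymous proc that is passed to map.
let viaMap = mySeq.map(x => x + 1)

# Template form: mapIt injects `it` and pastes the expression into its loop.
let viaMapIt = mySeq.mapIt(it + 1)

doAssert viaMap == @[2, 3, 4]
doAssert viaMapIt == viaMap
```

Both calls build a new seq; the difference under discussion is only whether the element expression goes through a proc call or is expanded in place.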
Regarding the speed concern: I think the C compiler should be smart enough to optimize this. I could only test it with gcc, and there it held (with my simple function). If you have other compilers (e.g., on Windows), please send me some results from running https://github.com/petermora/nimMapBenchmarks (There are so many ways to ruin a benchmark; please tell me if I did something wrong.)
I believe (not checked) that with the --opt:size flag the C compiler will produce a smaller binary if we have a proc rather than a template. Nim can be used on microcontrollers, where memory is limited.
There is a thread about possible lambda forms: https://github.com/nim-lang/Nim/issues/2179 Maybe we could have something more clever.
Side note: I also have a secret dream that nimfmt will be smart enough that every time I reformat my source code (by saving in vim), every mySeq.map(x=>x+1) is replaced by mySeq.map(x:int => x+1). This way I wouldn't have to write the types, but I could always read them.
Thanks, Peter
test0: call system.nim's map()
50024000000,3.457811117172241
test1: reimplement system.nim's map() here, and call that
50024000000,2.145495891571045
test2: reimplement system.nim's map() here with {.inline.}, and call that
50024000000,2.202944040298462
test3: map() where we iterate using it and not i
50024000000,1.920763969421387
test5: template map(), f is an expression (e.g., it+10+i), no type magic, hard coded types
50024000000,2.104836940765381
test8: template map(), f is an expression (e.g., it+10+i), iterating using index i and not it, no type magic
50024000000,1.928750038146973
test9: call test1's map() which is not {.inline.}
50024000000,3.367489099502563
test10: call test2's map() which is {.inline.}
50024000000,2.099898099899292
That said, the very fact that I can use functional programming idioms while still avoiding the overhead of calling functions inside inner loops is one of the strong reasons that drive me towards Nim. I agree that mapIt is less obvious, but I am very glad that it exists.
It's better to improve the type inference, the issue is here: https://github.com/nim-lang/Nim/issues/3127
On win7 64bit Nim 0.11.3 gcc version 4.8.1 (tdm64-2)
Note: test6 calls itself test5
Tests 4, 5, 7 and 11 failed with errors like: test7.nim(4, 20) Error: cannot instantiate: 'U'
test0: call system.nim's map()
50024000000,3.109207630157471
test1: reimplement system.nim's map() here, and call that
50024000000,2.390804290771484
test2: reimplement system.nim's map() here with {.inline.}, and call that
50024000000,2.357604265213013
test3: map() where we iterate using it and not i
50024000000,2.357604265213013
test5: template map(), f is an expression (e.g., it+10+i), no type magic, hard coded types
50024000000,1.82520341873169
test8: template map(), f is an expression (e.g., it+10+i), iterating using index i and not it, no type magic
50024000000,2.35560417175293
test9: call test1's map() which is not {.inline.}
50024000000,3.451606273651123
test10: call test2's map() which is {.inline.}
50024000000,2.375605344772339
Sorry for the compile error (my compiler is modified), I'll fix it tonight.
@andrea: If there is any example where the C compiler cannot inline the lambda function, then I think we should either keep mapIt or modify the Nim compiler to inline it for us.
I see. I get your point. Let me think about how we could ensure that inlining happens all the time, and still keep the syntax simple.
Thanks again for brainstorming on this.
Actually this is a very special case. The map function itself is very short (basically one allocation and one for loop), and the lambda function is created only for this parameter (e.g., map(x=>x+1)) and used nowhere else. In this special case everything should be inlined: the map call and also the lambda function. We all agree on this, and mapIt does exactly this.
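To make "one allocation and one for loop" concrete, here is a rough sketch of a mapIt-style template with hard-coded int types. The name mapItInt is hypothetical and not part of the standard library, and this is not the actual sequtils implementation; it only illustrates how the expression ends up inside the loop body without any proc call.

```nim
# Hypothetical template for illustration; not a standard-library name.
template mapItInt(s: seq[int], op: untyped): seq[int] =
  block:
    let src = s                      # evaluate the argument only once
    var res = newSeq[int](src.len)   # the one allocation
    var i = 0
    for it {.inject.} in src:        # the one loop; `it` is visible inside `op`
      res[i] = op                    # `op` is expanded here, no call involved
      inc i
    res

let xs = @[1, 2, 3]
doAssert xs.mapItInt(it + 1) == @[2, 3, 4]
```

Because the template is expanded at the call site, there is nothing left for the C or JS backend to inline.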
I did a test with NodeJS: all the function versions of map ran for 40 seconds, while the template versions ran for 6 seconds. So again, one point for mapIt.
However, let me share one example that can't be done nicely with mapIt. Let's assume we have a seq[seq[float]], that is, a seq of seqs of floats. We would like to create a new structure where each number is divided by the length of the seq it is placed in. That is:
echo mySeqSeq.map(row => row.map(x => x/row.len))
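A runnable variant of the line above could look like the sketch below, assuming a recent Nim with std/sugar. The parameter types are written out to stay independent of type inference, and row.len is converted with .float because Nim does not implicitly convert an int variable to float in a division. The sample data is made up for illustration.

```nim
import std/[sequtils, sugar]

let mySeqSeq = @[@[1.0, 3.0], @[2.0, 4.0, 6.0]]

# The inner lambda captures `row` from the outer one, which is the part
# that is awkward to express with a single injected `it`.
let normalized = mySeqSeq.map(
  (row: seq[float]) => row.map((x: float) => x / row.len.float))

doAssert normalized == @[@[0.5, 1.5], @[2.0 / 3.0, 4.0 / 3.0, 2.0]]
```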
I don't know how to implement this with two mapIt calls.
Let me give a quick update on this topic.
It looks like the C compiler optimizes the code correctly, but in NodeJS the template is much faster than the proc version.
I implemented the same code in Rust, and it is faster than Nim. In Nim the fastest code takes 2.77 sec (that's from the Linux command time: "real 2.77 sec", "user 1.22 sec" and "sys 1.5 sec"). The Rust version runs in 1.70 sec (that's "real 1.70 sec", "user 1.70 sec", and "sys 0.00 sec"). My guess is that the memory allocation is different and not optimized for this code.
Based on @andrea's feedback and the JS benchmark, I would say that map should be a template that is smart enough to recognize mySeq.map(x=>x+1) and inline it properly. The generated AST in this case should be the same as the one mySeq.mapIt(it+1) gives. What do you think?
Quick update.
First of all, I've just checked out the latest devel branch, and some of the map versions in my benchmark now have the same speed as Rust!!! Great job, guys!!! I love Nim!
I've updated the source files and the results here: https://github.com/petermora/nimMapBenchmarks
Based on the results I would say that we should have all the functions (map, filter, etc.) as templates, because the compilers are then more likely to inline the functions passed as parameters.
Araq said that we might be able to truly inline the function given as a parameter. I did something like that here: https://github.com/petermora/nimMapBenchmarks/blob/master/test12.nim The trick is that I modify the AST, converting the lambda function into a template, which is inlined by the compiler. This is a proof of concept and not an actual proposal; there are many potential problems with it (for example, correct handling of return in the lambda function). By the way, with this method and a slightly modified futures.nim, I could have a working version of mySeq.map(x=>x+1) for the very first time :)
It looks like it matters whether we iterate using for it in seqIn or by an index with for i in 0..<seqIn.len. I'll keep that in mind when implementing functions.
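For clarity, the two iteration styles being compared are sketched below as plain procs. The names mapByItem and mapByIndex are hypothetical and are not the code from the benchmark repository; the sketch only shows the structural difference, not which one is faster.

```nim
# Hypothetical names; both procs compute the same result.

# Style 1: iterate over the items directly ("for it in seqIn").
proc mapByItem[T, U](seqIn: openArray[T], f: proc (x: T): U): seq[U] =
  result = newSeq[U](seqIn.len)
  var i = 0
  for it in seqIn:
    result[i] = f(it)
    inc i

# Style 2: iterate over an index ("for i in 0..<seqIn.len").
proc mapByIndex[T, U](seqIn: openArray[T], f: proc (x: T): U): seq[U] =
  result = newSeq[U](seqIn.len)
  for i in 0 ..< seqIn.len:
    result[i] = f(seqIn[i])

let input = @[1, 2, 3]
doAssert mapByItem(input, proc (x: int): int = x + 10) == @[11, 12, 13]
doAssert mapByIndex(input, proc (x: int): int = x + 10) == @[11, 12, 13]
```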
In summary: what do you think about having map, filter, etc. as templates? These are short functions (an allocation and a for loop), so making them templates has little effect on the binary's size, but apparently it gives a significant speedup.
Thanks, Peter