OK, so I've developed a module 'basic2d.nim' which contains matrix/vector/point maths for 2d, as well as some basic 2d utilities commonly used in computational geometry. I will send a pull request if there is interest in merging this module into the standard library (that's why I'm writing here to find out, please let me know, Araq!).
If my 2d work is satisfactory, I will go on to create a 3d version of this module.
There are many 'styles' of working with vectors and matrices. I've done it the way I prefer, which is:
# create a matrix that first rotates, then scales and lastly moves
var m: TMatrix2d = rotate(1.5) & scale(2.0) & move(10.0, 20.0)
# create x-axis vector and transform in place by m
# The translational part of m is ignored when transforming vectors
var v: TVector2d = vector2d(1.0, 0.0)
var v2: TVector2d = v & m # concatenate vector and matrix
v &= m # concatenates v and m in place into v
# create a point and transform it in place by m
var p: TPoint2d = point2d(30.0, 25.0)
p &= m
There is a lot more in this module, the above is just some small examples of the style used.
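To make the vector/point distinction above concrete, here is a minimal standalone sketch in current Nim. The names Mat2, Vec2, Pt2, scaling and translation are made up for illustration and are not the basic2d API; the field layout and the row-vector convention are assumptions as well.

```nim
type
  Mat2 = object        # hypothetical 2x3 affine matrix, row-vector convention
    ax, ay: float      # where the x axis ends up
    bx, by: float      # where the y axis ends up
    tx, ty: float      # translation
  Vec2 = object
    x, y: float
  Pt2 = object
    x, y: float

proc scaling(s: float): Mat2 =
  Mat2(ax: s, ay: 0.0, bx: 0.0, by: s, tx: 0.0, ty: 0.0)

proc translation(dx, dy: float): Mat2 =
  Mat2(ax: 1.0, ay: 0.0, bx: 0.0, by: 1.0, tx: dx, ty: dy)

proc `&`(a, b: Mat2): Mat2 =
  ## Concatenate two transforms: apply a first, then b.
  Mat2(
    ax: a.ax*b.ax + a.ay*b.bx,
    ay: a.ax*b.ay + a.ay*b.by,
    bx: a.bx*b.ax + a.by*b.bx,
    by: a.bx*b.ay + a.by*b.by,
    tx: a.tx*b.ax + a.ty*b.bx + b.tx,
    ty: a.tx*b.ay + a.ty*b.by + b.ty)

proc `&`(v: Vec2, m: Mat2): Vec2 =
  ## Vectors are directions: the translation (tx, ty) is ignored.
  Vec2(x: v.x*m.ax + v.y*m.bx, y: v.x*m.ay + v.y*m.by)

proc `&`(p: Pt2, m: Mat2): Pt2 =
  ## Points are positions: the translation (tx, ty) is added.
  Pt2(x: p.x*m.ax + p.y*m.bx + m.tx, y: p.x*m.ay + p.y*m.by + m.ty)

# scale by 2, then move by (10, 20)
let m = scaling(2.0) & translation(10.0, 20.0)

# the x-axis vector is only scaled
doAssert (Vec2(x: 1.0, y: 0.0) & m) == Vec2(x: 2.0, y: 0.0)
# the point is scaled and then moved
doAssert (Pt2(x: 30.0, y: 25.0) & m) == Pt2(x: 70.0, y: 70.0)
```

Giving vectors and points separate types lets overload resolution decide whether the translation applies, so the caller never has to pass a flag for it.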
So my suggestion is to keep TMatrix2d as a 2d matrix, and implement a separate TMatrix3d. Later a general TMatrixT can be implemented, whose main purpose is not transformations, but linear algebra stuff.
Currently TMatrix2d is for floats only; do you want me to make it generic before I send a pull request?
** CAPLUT= Cut And Paste Loop Unrolling Technology
Hi MFlamer!
I've implemented 2d and 3d modules and sent a pull request for them, but they are not yet merged into Nimrod's master on GitHub. However, it seems like they will be accepted (by Araq), so it's very likely they will be in the standard library very soon.
I will accept it, but please ensure your way of unrolling everything explicitly really helps. Modern vectorization in C compilers may work much better with ordinary loops over arrays.
What about some feedback? ;-)
Excellent notation, man.
Can I have a pointer to your repo so I can start playing with it?
I feel you should first get it correct, and then get it fast.
steved => https://github.com/ventor3000/Nimrod/tree/master/lib/pure , files basic2d.nim and basic3d.nim
Araq => Modern vectorization in a modern C compiler makes the speed difference unnoticeable, if there is any; unmodern vectorization in a lousy C compiler (tinyc?) makes a huge difference. Do you dislike this coding style?
From http://llvm.org/docs/Vectorizers.html#the-slp-vectorizer :
"LLVM has two vectorizers: The Loop Vectorizer, which operates on Loops, and the SLP Vectorizer."
And in theory loops should be easier to vectorize.
In fact, I dislike this coding style for another reason too: It's incredibly error prone, and a typo can remain undetected for a long time since it only affects a single dimension. A typo in a loop produces consistently wrong results for all dimensions and so is much less likely to survive testing.
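One possible guard against that kind of single-component typo, while still keeping unrolled code, is to check the unrolled routine against a plain loop version on inputs where every element differs. A minimal sketch in current Nim; Mat3, mulLoop and mulUnrolled are made-up names for illustration, not the basic3d API, and the array layout is an assumption:

```nim
type
  Mat3 = array[3, array[3, float]]   # hypothetical array-based 3x3 matrix

proc mulLoop(a, b: Mat3): Mat3 =
  ## Reference version: a typo here breaks every element the same way.
  for i in 0..2:
    for j in 0..2:
      for k in 0..2:
        result[i][j] += a[i][k] * b[k][j]

proc mulUnrolled(a, b: Mat3): Mat3 =
  ## Unrolled version: one wrong index below would corrupt a single element.
  result[0][0] = a[0][0]*b[0][0] + a[0][1]*b[1][0] + a[0][2]*b[2][0]
  result[0][1] = a[0][0]*b[0][1] + a[0][1]*b[1][1] + a[0][2]*b[2][1]
  result[0][2] = a[0][0]*b[0][2] + a[0][1]*b[1][2] + a[0][2]*b[2][2]
  result[1][0] = a[1][0]*b[0][0] + a[1][1]*b[1][0] + a[1][2]*b[2][0]
  result[1][1] = a[1][0]*b[0][1] + a[1][1]*b[1][1] + a[1][2]*b[2][1]
  result[1][2] = a[1][0]*b[0][2] + a[1][1]*b[1][2] + a[1][2]*b[2][2]
  result[2][0] = a[2][0]*b[0][0] + a[2][1]*b[1][0] + a[2][2]*b[2][0]
  result[2][1] = a[2][0]*b[0][1] + a[2][1]*b[1][1] + a[2][2]*b[2][1]
  result[2][2] = a[2][0]*b[0][2] + a[2][1]*b[1][2] + a[2][2]*b[2][2]

# Asymmetric test data with small integer values, so the comparison is exact
# and a swapped or repeated index cannot cancel out by accident.
let a: Mat3 = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
let b: Mat3 = [[2.0, 0.0, 1.0], [1.0, 3.0, 0.0], [0.0, 1.0, 4.0]]

doAssert mulLoop(a, b) == mulUnrolled(a, b)
```

This does not settle which style is faster; it only means the loop version stays around as a test oracle even if the unrolled one ships.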
thanks, ventor3000. But I get a compile error with basic2D:
num/basic2d.nim(552, 22) Info: instantiation from here
num/basic2d.nim(125, 68) Error: undeclared identifier: 'a'
steved: I think I had the same error compiling with an older (current release?) version of Nimrod (because of some changes/bugfixes in the templating system, I think). Make sure you use the latest Nimrod from GitHub; if the problem persists, please let me know.
Araq: OK, I will do some benchmarks and rewrite it if loops give equal or better performance. Right now, however, I do not have the time to do this. Can you accept it as is for now if I insert a TODO in the code for this?
OK, so I've benchmarked matrix multiplication with the current basic3d implementation against a loop-based matrix multiplier. I did the performance test with a cascade multiply of one hundred million matrices. The interesting thing is that basic3d beats the shit out of a loop-based multiplier, being about twice as fast.
Code is here: https://github.com/ventor3000/nimrod-lab/blob/master/matrix_bench.nim
Output of my test program is: (times in seconds)
Time for loop matrix multiplication: 12.346
Time for unrolled direct access matrix multiplication (current basic3d): 6.628
This benchmark was done using 32-bit gcc, compiled with -d:release, under Windows 8.
So why is the basic3d implementation so much faster? I'm not an expert on code generation in C compilers, but I think the layout of basic3d's TMatrix3d has a huge advantage. It does not need to use arrays, which avoids indirect indexing of matrix elements (saves a lot of arithmetic operations for the CPU, I think?)
What do you think, Araq? Can you accept basic3d.nim as is, taking this info into account?
Yeah, basic3d.nim compiles fine on master.
It makes sense to specialize 2D and 3D matrices, etc., since matrix access involves not only indirect indexing but often cache misses due to non-locality.
This is why the serious boys like Fortran-style 2D arrays: big contiguous chunks of memory with stride calculations, even if done in C.
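For anyone who hasn't seen the Fortran-style layout, a minimal sketch in current Nim of what "one contiguous chunk plus a stride calculation" can look like. Grid, newGrid and columnSum are hypothetical names, and column-major order is chosen only because that is what Fortran uses:

```nim
type
  Grid = object          # hypothetical dense matrix stored in one block
    rows, cols: int
    data: seq[float]     # column-major: element (r, c) lives at c*rows + r

proc newGrid(rows, cols: int): Grid =
  Grid(rows: rows, cols: cols, data: newSeq[float](rows * cols))

proc `[]`(g: Grid, r, c: int): float =
  ## Stride calculation instead of following a pointer per row.
  g.data[c * g.rows + r]

proc `[]=`(g: var Grid, r, c: int, v: float) =
  g.data[c * g.rows + r] = v

proc columnSum(g: Grid, c: int): float =
  ## Walking down a column touches consecutive memory (stride 1).
  for r in 0 ..< g.rows:
    result += g[r, c]

var g = newGrid(2, 3)
g[0, 2] = 3.0
g[1, 2] = 7.0

doAssert g[1, 2] == 7.0
doAssert g.data[2 * 2 + 1] == 7.0   # same element, addressed by hand
doAssert columnSum(g, 2) == 10.0
```

For fixed 2D and 3D transform matrices the sizes are known at compile time, so this matters more for the general linear algebra case than for basic2d/basic3d.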