Hello, below is a zero-copy transpose for matrices. What I was trying to do is have a concept that is satisfied by Matrix or Transposed but also checks the element type, so that later I can have a Matrix[T] type. Any help would be appreciated.
type
  AnyMatrix = concept m, var mvar, type M
    M.rows is int
    M.cols is int
    m[int, int] is float
    mvar[int, int] = float
    type T = t(M)

  Matrix = object
    rows, cols: int
    data: seq[float]

  Transposed = distinct Matrix
proc `[]`(m: Matrix, i, j: int): float {.inline.} =
  m.data[i * m.cols + j]

proc `[]=`(m: var Matrix, i, j: int, v: float) {.inline.} =
  m.data[i * m.cols + j] = v

template t(m: Matrix): Transposed =
  Transposed(m)

proc `[]`(m: Transposed, i, j: int): float {.inline.} =
  Matrix(m)[j, i]

proc `[]=`(m: var Transposed, i, j: int, v: float) {.inline.} =
  Matrix(m)[j, i] = v

template t(m: Transposed): Matrix =
  Matrix(m)

template rows(m: Transposed): int =
  Matrix(m).cols

template cols(m: Transposed): int =
  Matrix(m).rows

proc matrix(data: seq[float], m: int): Matrix =
  result.rows = m
  result.cols = if m != 0: data.len div m else: 0
  result.data = data
proc `*`(a, b: AnyMatrix): Matrix =
  # standard product: an (m x n) times an (n x p) matrix gives an (m x p) result
  result.rows = a.rows
  result.cols = b.cols
  newSeq(result.data, result.rows * result.cols)
  for i in 0 ..< a.rows:
    for j in 0 ..< b.cols:
      var s = 0.0
      for k in 0 ..< a.cols:
        s += a[i, k] * b[k, j]
      result[i, j] = s

var a = matrix(@[1.0, 2, 3, 4, 5, 6, 7, 8], 2)                  # 2 x 4
var b = matrix(@[1.0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 2)   # 2 x 6
var c = a.t * b                                                  # (4 x 2) * (2 x 6)
Here you go:
type
  Matrix = object
    rows, cols: int
    data: seq[float]

  Transposed = distinct Matrix

  AnyMatrix = Matrix or Transposed
Note that instead of Transposed you probably want to use column-major or row-major (or C-order/Fortran-order), because "transposed" is ambiguous about the data representation.
proc op(a: AnyMatrix, b: AnyMatrix): Matrix
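As a sketch of what the column-major/row-major idea could look like (the Order enum and the at proc below are assumptions for illustration, not existing code from this thread):

type Order = enum rowMajor, colMajor

proc at(m: Matrix, order: Order, i, j: int): float {.inline.} =
  # element (i, j) under either storage order; the data itself never moves
  case order
  of rowMajor: m.data[i * m.cols + j]
  of colMajor: m.data[j * m.rows + i]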
Could concepts be useful in this case, so I can restrict to e.g. a: AnyMatrix[float] and b: AnyMatrix[float]? The manual mentions:
By default, during overload resolution each named type class will bind to exactly one concrete type. We call such type classes bind once types.
So what I am trying to do is not possible? Can someone confirm?
Generics are enough:
type
  Matrix[T: SomeReal] = object
    rows, cols: int
    data: seq[T]

  Transposed[T: SomeReal] = distinct Matrix[T]

  AnyMatrix[T: SomeReal] = Matrix[T] or Transposed[T]

proc op[T: SomeReal](a, b: AnyMatrix[T]): AnyMatrix[T] = ...
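A quick compile-time check (a sketch added here, not part of the original reply) that both concrete types satisfy the alias while mixed element types do not:

static:
  doAssert Matrix[float] is AnyMatrix[float]
  doAssert Transposed[float] is AnyMatrix[float]
  doAssert not (Matrix[float32] is AnyMatrix[float64])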
Make sure you read the code of either Neo (and the older linalg) or Arraymancer. Both Andrea and I like to constrain things at compile time as much as possible, and we have probably already met most of the struggles you will hit while writing a matrix library.
mratsim: the problem I'm having is:
Error: type mismatch: got <Transposed, Matrix>
when using the same signature as your last snippet. On the other hand, splitting the types in the proc parameters would allow Matrix[Complex] * Matrix[float], which we don't want.
The libraries you posted are very nice. I need LU factorization and eigenvalues, so I am porting Jama, which is old, but I don't know a better lib.
Note that instead of Transposed you probably want to use column-major or row-major (or C-order/Fortran-order), because "transposed" is ambiguous about the data representation.
In Neo, the transpose function copies the matrix. I was trying to imitate how gonum/matrix implements it.
This should work. If you write a, b: AnyMatrix[T], Nim expects both parameters to resolve to the same AnyMatrix and the same T; compare a, b: SomeInteger versus a: SomeInteger, b: SomeInteger.
proc op[T: SomeReal](a: AnyMatrix[T], b: AnyMatrix[T]): AnyMatrix[T] = ...
Alternatively you can use something like:
proc op[T, U: AnyMatrix](a: T, b: U): T or U = ...
but since Nim doesn't support higher-kinded types at the moment, you cannot access the subtype of AnyMatrix naturally, like:
proc op[T: SomeReal, U: AnyMatrix[T]](a: T, b: U): T or U = ...
It can be done with a macro though. But try the first solution first, I think.
Edit:
Regarding the transpose, if your data is stored in a ref object/array or a shallow seq, Nim will just copy the reference and not the whole data.
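As a sketch of that (the RefMatrix type below is an assumption for illustration, not code from the thread):

type
  RefMatrix = object
    rows, cols: int
    data: ref seq[float]  # the payload lives behind a ref

proc newRefMatrix(rows, cols: int): RefMatrix =
  result.rows = rows
  result.cols = cols
  new result.data
  result.data[] = newSeq[float](rows * cols)

var x = newRefMatrix(2, 3)
var y = x                    # copies two ints and a pointer, not the six floats
y.data[][0] = 1.0
doAssert x.data[][0] == 1.0  # both objects share the same buffer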
mratsim: try the code at the beginning and you will see the same error I'm getting. Jama actually uses seq[seq[float]] for storage. But the flat representation will be better, right?
Well, here are some benchmarks I made to test different data storage layouts and matrix multiplication algorithms.
type
  MatrixB = object
    data: seq[seq[float]]
    m, n: int

  Matrix = object
    data: seq[float]
    m, n: int

template newDataB() {.dirty.} =
  # allocate the m x n seq-of-seqs for `result`
  result.data = newSeq[seq[float]](result.m)
  for r in 0 ..< result.m:
    result.data[r] = newSeq[float](result.n)

proc stdMatrixProduct(a, b: MatrixB): MatrixB =
  result.m = a.m
  result.n = b.n
  newDataB()
  for i in 0 ..< a.m:
    for j in 0 ..< b.n:
      var s = 0.0
      for k in 0 ..< a.n:
        s += a.data[i][k] * b.data[k][j]
      result.data[i][j] = s

proc optMatrixProduct(a, b: MatrixB): MatrixB =
  result.m = a.m
  result.n = b.n
  newDataB()
  var b_colj = newSeq[float](a.n)
  for j in 0 ..< b.n:
    for k in 0 ..< a.n:
      b_colj[k] = b.data[k][j]        # cache column j of b once
    for i in 0 ..< a.m:
      let a_rowi = unsafeAddr a.data[i]
      var s = 0.0
      for k in 0 ..< a.n:
        s += a_rowi[][k] * b_colj[k]
      result.data[i][j] = s
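A minimal harness along these lines can take this kind of measurement (the newMatrixB constructor here is an assumed helper, not part of the original benchmark):

import times

proc newMatrixB(m, n: int): MatrixB =
  result.m = m
  result.n = n
  result.data = newSeq[seq[float]](m)
  for i in 0 ..< m:
    result.data[i] = newSeq[float](n)

let a = newMatrixB(10, 10)
let b = newMatrixB(10, 10)
let start = cpuTime()
discard stdMatrixProduct(a, b)
echo (cpuTime() - start) * 1e6, " us"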
results:
8.649 us  naive data storage - standard ijk algorithm
1.329 us  naive - algorithm
2.456 us  packed - standard
2.675 us  packed - optimized
2.441 us  packed - strided iteration
seq[seq[float]] is the fastest!
You allocated a wholly new seq in optMatrixProduct; no wonder it's slower than the unoptimized version. By the way: have you tried changing the representation of the matrix before multiplication, so that in for k in 0 ..< a.n both a.data[i][k] and b.data[k][j] are linear in memory? As far as I know it should give a nice boost.
seq[seq[float]] plays nicely with changing the matrix in place, e.g. reordering the rows (or columns, depending on interpretation). It can be pretty useful, I agree.
You shouldn't benchmark matrix multiplication with triple for-loops; that's the slowest way to do matrix multiplication.
In short, yes, the best memory representation for 2D-only matrices is a flat seq[float], like what Neo and you are doing.
If it's just to solve linear equations, Neo has a solve function, and I added a least_squares equation solver to Arraymancer.
If it's for learning, use seq[float] instead of the seq-of-seq double indirection. You can follow the JAMA code then.
If it's for learning how to write fast matrix libraries, use seq[float] too and have a look at BLAS and LAPACK and their Nim wrappers nimblas and nimlapack. You will end up writing a lot of glue code though.
Edit note: for a good boost while keeping a simple triple for-loop, you can transpose a matrix first, like in this Nim benchmark; see the sketch below.
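Something along these lines, using the packed Matrix type from the benchmark above (a sketch of the technique, not the linked benchmark's code):

proc transposeFirstProduct(a, b: Matrix): Matrix =
  result.m = a.m
  result.n = b.n
  result.data = newSeq[float](a.m * b.n)
  # bt holds b transposed, so the innermost loop walks both operands linearly
  var bt = newSeq[float](b.m * b.n)
  for k in 0 ..< b.m:
    for j in 0 ..< b.n:
      bt[j * b.m + k] = b.data[k * b.n + j]
  for i in 0 ..< a.m:
    for j in 0 ..< b.n:
      var s = 0.0
      for k in 0 ..< a.n:
        s += a.data[i * a.n + k] * bt[j * b.m + k]
      result.data[i * result.n + j] = s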
Long answer
Matrix multiplication is a very memory-intensive algorithm: n^3 operations on n^2 data, and what counts is keeping those n^2 elements in L1 and L2 cache as long as possible.
For speed, you should use a proper BLAS library to implement matrix multiplication; both Neo and Arraymancer use nimblas for that, a wrapper for OpenBLAS, Intel MKL or Apple Accelerate. If you're interested in performance for matrices, here is a good start on memory layout.
If you really want to implement fast matrix multiplication by hand, you can follow this tutorial in C and assembly. The main takeaway is that keeping your data hot in cache trumps everything; a minimal illustration follows.
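Loop tiling (blocking) is the textbook form of that idea. This sketch uses the packed Matrix type from the benchmark above and is not the tutorial's code:

const blk = 64  # tile size; pick it so three blk*blk float tiles fit in cache

proc blockedProduct(a, b: Matrix): Matrix =
  result.m = a.m
  result.n = b.n
  result.data = newSeq[float](a.m * b.n)  # zero-initialised accumulator
  for ii in countup(0, a.m - 1, blk):
    for kk in countup(0, a.n - 1, blk):
      for jj in countup(0, b.n - 1, blk):
        # accumulate one result tile from one tile of a and one tile of b
        for i in ii ..< min(ii + blk, a.m):
          for k in kk ..< min(kk + blk, a.n):
            let aik = a.data[i * a.n + k]
            for j in jj ..< min(jj + blk, b.n):
              result.data[i * b.n + j] += aik * b.data[k * b.n + j]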
I followed it to implement fast matrix multiplication on integers (BLAS libs are float-only) and achieved 5x the speed of Numpy (10x on dual core) and 2.2x the speed of Julia's triple for-loop (4.4x on dual core) on 1500x1500 matrices. This grows at a rate of n^3, so on 3000x3000 it would be 125x faster. Code is here.
Also, I had a look at the Jama code: it is good for teaching or as a reference implementation/specification, but it is not built for speed.
Udiknedormin: Actually this code is the fastest (when used with the right data structure, of course). You think the Jama creators didn't think about allocations? :P That's why it creates a temporary sequence only once instead of a whole new matrix like the kostya/matmul.nim mratsim posted.
mratsim: thanks for the links, I have already done a (bad, incomplete) port. I want to ask: do you know other modern libraries that deal with numerical computations and could be ported easily to Nim? I don't care about speed, I want the functionality.
@planhths
Yes, I actually overlooked where it happens. Now I see that what it does is essentially a partial transposition (I mentioned before that transposition can give nice results; it also plays nicely with parallelization).
But did you really call THAT one "naive - algorithm"? :-o It doesn't really look naive to me. Did you try the same method with the seq[float] representation? A sketch of that variant is below.
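For reference, the same column-caching method on the packed representation would look something like this (a sketch, reusing the flat Matrix type from the benchmark above):

proc optPackedProduct(a, b: Matrix): Matrix =
  result.m = a.m
  result.n = b.n
  result.data = newSeq[float](a.m * b.n)
  var b_colj = newSeq[float](a.n)   # reused buffer for column j of b
  for j in 0 ..< b.n:
    for k in 0 ..< a.n:
      b_colj[k] = b.data[k * b.n + j]
    for i in 0 ..< a.m:
      var s = 0.0
      for k in 0 ..< a.n:
        s += a.data[i * a.n + k] * b_colj[k]
      result.data[i * result.n + j] = s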