Introducing Platonic, a repo for developing a set of Concepts for core math types like Matrices, Vectors, and Tensors. Ideally, if it works well, something like this could be added to SciNim as a common interface.
Currently it's mostly experimental, to learn where the usability of concepts for basic math types stands. I'm also planning to try and provide a clean interface to Arraymancer Tensors and some graphics types.
One piece I'm really interested in is the static matrix types. The latest compiler releases have made some of the dependent-typing-style pieces more usable. Running an ML program only to discover an off by one issue after running for 10 minutes sucks. How can we do better?
Running an ML program only to discover an off by one issue after running for 10 minutes sucks. How can we do better?
Use slices and make them lenient, using 0.0 for missing values or similar. For small matrices like 4x4 or 3x3 it makes sense to encode the dimensions in the type and use the type system, but who cares about the difference between a 1001x1001 matrix and a 1000x1000? Also, the memory management changes: big matrices should prevent copying or use copy-on-write or atomic RC (or ...), whereas small matrices should be stored in place.
but who cares about the difference between a 1001x1001 matrix and a 1000x1000?
Actually that's precisely it. Those are incompatible matrices for most useful operations. It'd be much nicer to know upfront that you messed up a slice, rather than finding out at runtime.
Being able to lift that into the type system in a useful way would be helpful for writing programs. Currently dealing with matrices is largely "untyped": you have to run the program a few times to work out mismatched sizes. For some projects that can take a while.
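To make that concrete, here is a minimal sketch of what lifting the sizes into the type system could look like, using a hypothetical Matrix type with static dimensions (not Platonic's actual API, and recent compilers may still have rough edges here): a mismatched multiplication becomes a compile error instead of a crash ten minutes into a run.

# Hypothetical statically-sized matrix, for illustration only.
type
  Matrix[M, N: static int; T] = object
    data: array[M * N, T]

# The shared inner dimension N is part of the signature,
# so incompatible shapes are rejected at compile time.
proc `*`[M, N, K: static int; T](a: Matrix[M, N, T]; b: Matrix[N, K, T]): Matrix[M, K, T] =
  for i in 0 ..< M:
    for j in 0 ..< K:
      for k in 0 ..< N:
        result.data[i * K + j] = result.data[i * K + j] + a.data[i * N + k] * b.data[k * K + j]

var
  a: Matrix[3, 4, float64]
  b: Matrix[4, 2, float64]
  c: Matrix[5, 2, float64]

discard a * b    # fine: 3x4 * 4x2
# discard a * c  # compile error: inner dimensions 4 and 5 don't match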
I don't know how much is possible with Nim currently, but it's the most advanced non-dependently-typed language I know of. This sort of thing has largely been impossible before, so I'm curious about the ergonomics / possibilities.
use slices
The goal would also be to have a set of common concepts for dynamic vectors, matrices, and tensors. I mostly copied existing stuff to the dynamics.nim file.
Currently most of SciNim is tied to Arraymancer's Tensor[T], or sometimes just seqs, making it harder to reuse code with, say, standalone CUDA or other NN libraries. There are also multiple vector types floating around, since vectors need distinct semantics from matrices or tensors. It's confusing to know what to use when.
Even if this project doesn't establish that, it would be nice to get some discussion going on the topic. Everything needed for it is out there; it just needs to be gathered and curated.
0.0 for missing values
There's NaN for that.
the memory management changes
Ideally, useful concepts for matrices would let those optimizations be done per matrix library or type. That's a big part of what would need to be settled. Though I think adding operators that use var types with non-var fallbacks would be generally sufficient.
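As a sketch of that operator pattern (hypothetical Matrix type, not tied to any particular library): the library provides the in-place var form, and a non-var fallback copies and then delegates to it.

# Hypothetical dynamic matrix, for illustration only.
type
  Matrix = object
    data: seq[float64]

# In-place form: mutates `a`, no allocation, suited to big matrices.
proc `+=`(a: var Matrix; b: Matrix) =
  assert a.data.len == b.data.len
  for i in 0 ..< a.data.len:
    a.data[i] += b.data[i]

# Non-var fallback: copies, then reuses the in-place operator.
proc `+`(a, b: Matrix): Matrix =
  result = a
  result += b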
Using static dimensions generates significant friction for both the developer and the user. There are very annoying or insurmountable challenges, like:
- Let's say you want to implement a flatten procedure that deletes dimensions of size 1. You can't, because the final dimensions depend on runtime data.
- Killer problem: how do you load a custom model? You can't do let model = load("path/to/model.onnx"); you would need the end-user to precisely describe all the dimensions inside the model.
Furthermore, Tensorflow is a static framework, as in everything needs to be declared statically (the computation graph in particular) and then compiled. With the advent of PyTorch, which is dynamic (see https://www.machinelearningplus.com/deep-learning/tensorflow1-vs-tensorflow2-vs-pytorch), all research switched to PyTorch; Google backpedaled and offered Tensorflow Eager, which does not build computation graphs ahead of time, and then created Jax.
Running an ML program only to discover an off by one issue after running for 10 minutes sucks. How can we do better?
Verifying that the sizes match can be done ahead of time (still at runtime): each layer of a NN provides its output dimensions, for example https://github.com/mratsim/Arraymancer/blob/e297e6d/src/arraymancer/nn/layers/conv2D.nim#L184-L202
func outShape*[T](self: Conv2D[T]): seq[int] =
  assert self.weight.value.shape.len == 4
  template kH(): int = self.weight.value.shape[2]
  template kW(): int = self.weight.value.shape[3]
  template pH(): int = self.padding.height
  template pW(): int = self.padding.width
  template sH(): int = self.stride.height
  template sW(): int = self.stride.width
  template iH(): int = self.inShape[1]
  template iW(): int = self.inShape[2]
  template dH(): int = 1 # dilation # TODO
  template dW(): int = 1 # dilation
  @[
    self.weight.value.shape[0],                   # C
    1 + (iH + 2*pH - (((kH-1) * dH) + 1)) div sH, # H
    1 + (iW + 2*pW - (((kW-1) * dW) + 1)) div sW, # W
  ]
And you just check at runtime that the output is compatible with the next layer's input.

@mratsim thanks for the info dump! And I'd agree about the friction with static types.
Using static dimensions generates significant friction for both the developer and the user. There are very annoying or insurmountable challenges, like:
Definitely, static types aren't doable (yet). Though the recent slate of bug fixes has improved some of the issues I'd seen with Neo and trying to use its statics even 6 months ago.
Let's say you want to implement a flatten procedure that deletes dimensions of size 1. You can't, because the final dimensions depend on runtime data.
In that case you'd need the ability to do arithmetic on type-level variables, e.g. proc halve(m1: Matrix[M, N, T]): Matrix[M div 2, N div 2, T], or variations of that. Which has got me thinking about toying with DrNim, where the dimensions are purely extra metadata.
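A rough sketch of that kind of dimension arithmetic (again a hypothetical statically-sized Matrix, and the compiler may still balk at some variations), with the halved dimensions appearing directly in the return type:

type
  Matrix[M, N: static int; T] = object
    data: array[M * N, T]

# Dimension arithmetic in the signature: keep every second row and column.
proc halve[M, N: static int; T](m: Matrix[M, N, T]): Matrix[M div 2, N div 2, T] =
  for i in 0 ..< M div 2:
    for j in 0 ..< N div 2:
      result.data[i * (N div 2) + j] = m.data[(2 * i) * N + 2 * j]

var m: Matrix[6, 4, float64]
let h = halve(m)   # h is a Matrix[3, 2, float64]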
Killer problem: how do you load a custom model? You can't do let model = load("path/to/model.onnx"); you would need the end-user to precisely describe all the dimensions inside the model.
Yeah, there needs to be a way to wire a Tensor[F] into a Tensor[F, M, N, O], say. Neo has a dyn operator, but I don't think that'd scale.
Again, that makes me think even more that the dimensions would need to be "optional" attributes of the type. Then one could use DrNim, or even a macro, to verify them.
Though my context is graphics/game engineering, not ML/numeric processing, so my comments are not going to be very helpful here.
That's what I was trying to say: these are just too different domains and require different types, even though in math they use the same types.
For actual usable pieces, a set of concepts for dynamic math types should be doable, I'd think.
Though I'm not sure how you'd do a non-generic Tensor/Matrix/Vector type. How would you implement, say, a basic sum operator without knowing the data type?
proc sum*[T](m1: Matrix[T]): T =
  result = T.default() # <-- how does the compiler know what to do?
  for v in m1:
    result += v
I would even remove the T in Tensor[T], making the backend even more useful for datamancer and saving end users from having to tell me the type when loading a .npy file; it's already there after all.
I'm not sure I follow. Under the covers numpy has to have a switch on dtype where it calls into specialized functions for each data type it supports right? At some point you need functions that operate on a concrete float or int type.
Wouldn't a non-generic Tensor type limit users to only the types supported by that library? That'd be fine for ML libraries but less useful for generic scientific / numerical code.
Though the non_generic_generics.nim file is interesting. I've only read a bit of it.
Though I'm not sure how you'd do a non-generic Tensor/Matrix/Vector type. How would you implement, say, a basic sum operator without knowing the data type?
I'm not sure I follow. Under the covers numpy has to have a switch on dtype where it calls into specialized functions for each data type it supports right? At some point you need functions that operate on a concrete float or int type.
The inner sum implementation is generic, but on the outside you have an object variant.
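A minimal sketch of that pattern (hypothetical names, not Arraymancer's or Datamancer's actual API): the type-erased tensor carries its dtype as an object variant, and the outer proc dispatches into a generic inner implementation.

# Hypothetical type-erased tensor, for illustration only.
type
  DType = enum
    kFloat32, kFloat64, kInt64

  DynTensor = object
    case kind: DType
    of kFloat32: f32Data: seq[float32]
    of kFloat64: f64Data: seq[float64]
    of kInt64:   i64Data: seq[int64]

# The inner implementation is generic...
proc sumImpl[T](data: seq[T]): T =
  for v in data:
    result += v

# ...and the outer, type-erased entry point dispatches on the runtime dtype.
proc sum(t: DynTensor): float64 =
  case t.kind
  of kFloat32: float64(sumImpl(t.f32Data))
  of kFloat64: sumImpl(t.f64Data)
  of kInt64:   float64(sumImpl(t.i64Data))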
For example a PyTorch tensor is associated with the following metadata that arguably can be static:
type
  DeviceIndex = int16
  DeviceKind* {.importc: "c10::DeviceType",
                size: sizeof(int16).} = enum
    kCPU = 0
    kCUDA = 1
    kMKLDNN = 2
    kOpenGL = 3
    kOpenCL = 4
    kIDEEP = 5
    kHIP = 6
    kFPGA = 7
    kMSNPU = 8
    kXLA = 9
    kVulkan = 10

  Device* {.importc: "c10::Device", bycopy.} = object
    kind: DeviceKind
    index: DeviceIndex

type
  ScalarKind* {.importc: "torch::ScalarType",
                size: sizeof(int8).} = enum
    kUint8 = 0       # kByte
    kInt8 = 1        # kChar
    kInt16 = 2       # kShort
    kInt32 = 3       # kInt
    kInt64 = 4       # kLong
    kFloat16 = 5     # kHalf
    kFloat32 = 6     # kFloat
    kFloat64 = 7     # kDouble
    kComplexF16 = 8  # kComplexHalf
    kComplexF32 = 9  # kComplexFloat
    kComplexF64 = 10 # kComplexDouble
    kBool = 11
    kQint8 = 12      # Quantized int8
    kQuint8 = 13     # Quantized uint8
    kQint32 = 14     # Quantized int32
    kBfloat16 = 15   # Brain float16
Wouldn't a non-generic Tensor type limit users to only the types supported by that library? That'd be fine for ML libraries but less useful for generic scientific / numerical code.
Generic scientific/numerical libraries mostly use float64 or float32. Maybe int64 for constraint programming or integer linear programming.
Otherwise, the non-numerical types that do get used are basically strings.
So in 6 years of Arraymancer's existence, no one has mentioned using something that was not numerical or strings. I'd rather optimize for the ergonomics and usability of the 99.99% in that case.
One big limitation in the Python data science ecosystem is fragmentation between projects.
Is it? It's more like an advantage: multiple competing projects independently exploring and trying different paths and ideas.
multiple competing projects independently exploring and trying different paths and ideas.
Until you get to the "coloring" problem and you have to decide between library ecosystems. We can see the dangers in Rust's async-std vs tokio ecosystems, and the attempts to reconcile the two.
Additionally it's also great to see how some Julia projects combine diffeq libraries with ML libraries and such, which is only possible if they can share data types.
@alexeypetrushin but yeah, competition and variety can be good too. Concepts give a chance to achieve both. For example, let's say I write an FFT routine using the Vector concept and use Arraymancer for my project; someone else could swap in a JS-based vector library and reuse the FFT routine.
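A tiny sketch of that kind of reuse (hypothetical Vector concept, pinned to float64 and with a dot product standing in for the FFT for brevity): any backend whose type satisfies the concept can be passed to the same routine.

# Hypothetical concept, not Platonic's actual definition.
type
  Vector = concept v
    v.len is int
    v[0] is float64

proc dot(a, b: Vector): float64 =
  ## Works for any type fulfilling the concept, whether it wraps an
  ## Arraymancer tensor, a plain seq, or a JS typed array.
  assert a.len == b.len
  for i in 0 ..< a.len:
    result += a[i] * b[i]

# A plain seq[float64] already satisfies the concept:
echo dot(@[1.0, 2.0, 3.0], @[4.0, 5.0, 6.0])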
Additionally it's also great to see how some Julia projects combine diffeq libraries with ML libraries and such, which is only possible if they can share data types.
nitpick: they can share data types or interfaces/concepts.
Note that PyTorch, Tensorflow, ONNX and friends can load a model from a YAML, JSON or serialized file without recompilation, so layers at least need to be type-erased.
Nice! That makes sense, especially for end users.
Ideally there’d be a few possible levels of coding style. "Dynamic" is an overloaded name, so I just made them the standard top-level module names, i.e. “platonic/vector” is the non-static concept.
I did remove the generic number type from the concepts. Overall I think it’s cleaner and makes it possible to use runtime or concrete number types.
As you mentioned, there’s a small set of common scientific types, and just defining VectorF64 and VectorF32 concrete types should work fine for when you want to compile with typed arrays. Also, IIRC, Julia uses that naming pattern.
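For instance, the concrete variants might look roughly like this (a sketch, not the actual platonic definitions):

type
  VectorF64 = concept v      # element type pinned to float64
    v.len is int
    v[0] is float64

  VectorF32 = concept v      # element type pinned to float32
    v.len is int
    v[0] is float32

# Plain seqs already satisfy these:
doAssert @[1.0, 2.0] is VectorF64
doAssert @[1.0'f32, 2.0'f32] is VectorF32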
So I believe the non-static concepts should reduce the need to write code specific to either numpy-style generic arrays or compile-time typed arrays, hopefully avoiding the two-language problem that faces both PyTorch and TensorFlow!
Oh I plan to do the same for the statically sized concepts too.