Introducing Platonic, a repo for developing a set of Concepts for core math types like Matrices, Vectors, and Tensors. Ideally, if it works well, something like this could be added to SciNim as a common interface.
Currently it's mostly experimental, to learn where the usability of concepts for basic math types stands. I'm also planning to try and provide a clean interface to Arraymancer Tensors and some graphics types.
One piece I'm really interested in is the static matrix types. The latest compiler releases have made some of the dependent-typing-style pieces more usable. Running an ML program only to discover an off by one issue after running for 10 minutes sucks. How can we do better?
Running an ML program only to discover an off by one issue after running for 10 minutes sucks. How can we do better?
Use slices and make them lenient, using 0.0 for missing values or similar. For small matrices like 4x4 or 3x3 it makes sense to encode the dimensions in the type and use the type system, but who cares about the difference between a 1001x1001 matrix and a 1000x1000? Also, the memory management changes: big matrices should prevent copying or use copy-on-write or atomic RC (or ...), whereas small matrices should be stored in place.
but who cares about the difference between a 1001x1001 matrix and a 1000x1000?
Actually that's precisely it. Those are incompatible matrices for most useful operations. It'd be much nicer to know upfront that you messed up a slice, rather than finding out at runtime.
Being able to lift that into the type system in a useful way would be helpful for writing programs. Currently dealing with matrices is largely "untyped": you have to run the program a few times to work out mismatched sizes. For some projects that can take a while.
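To make that concrete, here is a minimal sketch of what lifting the sizes into the type system could look like, using a hypothetical Matrix type with static dimensions (not Platonic's actual API, and recent compilers may still have rough edges here): a mismatched multiplication becomes a compile error instead of a crash ten minutes into a run.

# Hypothetical statically-sized matrix, for illustration only.
type
  Matrix[M, N: static int; T] = object
    data: array[M * N, T]

# The shared inner dimension N is part of the signature,
# so incompatible shapes are rejected at compile time.
proc `*`[M, N, K: static int; T](a: Matrix[M, N, T]; b: Matrix[N, K, T]): Matrix[M, K, T] =
  for i in 0 ..< M:
    for j in 0 ..< K:
      for k in 0 ..< N:
        result.data[i * K + j] = result.data[i * K + j] + a.data[i * N + k] * b.data[k * K + j]

var
  a: Matrix[3, 4, float64]
  b: Matrix[4, 2, float64]
  c: Matrix[5, 2, float64]

discard a * b    # fine: 3x4 * 4x2
# discard a * c  # compile error: inner dimensions 4 and 5 don't match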
I don't know how much is possible with Nim currently, but it's the most advanced non-dependently-typed language I know of. This sort of thing has largely been impossible before, so I'm curious about the ergonomics / possibilities.
use slices
The goal would also be to have a set of common concepts for dynamic vectors, matrices, and tensors. I mostly copied existing stuff to the dynamics.nim file.
Currently most of SciNim is tied to Arraymancer's Tensor[T], or sometimes just seqs, making it harder to reuse code with, say, standalone CUDA or other NN libraries. There are also multiple vector types floating around, since vectors need distinct semantics from matrices or tensors. It's confusing to know what to use when.
Even if this project doesn't establish that, it would be nice to get some discussion going on the topic. Everything needed for it is out there; it just needs to be gathered and curated.
0.0 for missing values
There's NaN for that.
the memory management changes
Ideally, useful concepts for matrices would let those optimizations be done per matrix library or type. That's a big part of what would need to be settled. Though I think adding operators that use var types with non-var fallbacks would be generally sufficient.
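As a sketch of that operator pattern (hypothetical Matrix type, not tied to any particular library): the library provides the in-place var form, and a non-var fallback copies and then delegates to it.

# Hypothetical dynamic matrix, for illustration only.
type
  Matrix = object
    data: seq[float64]

# In-place form: mutates `a`, no allocation, suited to big matrices.
proc `+=`(a: var Matrix; b: Matrix) =
  assert a.data.len == b.data.len
  for i in 0 ..< a.data.len:
    a.data[i] += b.data[i]

# Non-var fallback: copies, then reuses the in-place operator.
proc `+`(a, b: Matrix): Matrix =
  result = a
  result += b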
Using static dimensions generates significant friction for both the developer and the user. There are very annoying or insurmountable challenges, like:
- Let's say you want to implement a flatten procedure that deletes dimensions of size 1. You can't, because the final dimensions depend on runtime data.
- Killer problem: how do you load a custom model? You can't do let model = load("path/to/model.onnx"); you would need the end-user to precisely describe all the dimensions inside the model.
Furthermore, Tensorflow is a static framework, as in everything needs to be declared statically (the computation graph in particular) and then compiled. With the advent of PyTorch, which is dynamic (see https://www.machinelearningplus.com/deep-learning/tensorflow1-vs-tensorflow2-vs-pytorch), all research switched to PyTorch; Google backpedaled and offered Tensorflow Eager, which does not build computation graphs ahead of time, and then created Jax.
Running an ML program only to discover an off by one issue after running for 10 minutes sucks. How can we do better?
Verifying that the sizes match can be done ahead of time (still at runtime): each layer of a NN provides its output dimensions, for example https://github.com/mratsim/Arraymancer/blob/e297e6d/src/arraymancer/nn/layers/conv2D.nim#L184-L202
func outShape*[T](self: Conv2D[T]): seq[int] =
  assert self.weight.value.shape.len == 4
  template kH(): int = self.weight.value.shape[2]
  template kW(): int = self.weight.value.shape[3]
  template pH(): int = self.padding.height
  template pW(): int = self.padding.width
  template sH(): int = self.stride.height
  template sW(): int = self.stride.width
  template iH(): int = self.inShape[1]
  template iW(): int = self.inShape[2]
  template dH(): int = 1 # dilation # TODO
  template dW(): int = 1 # dilation
  @[
    self.weight.value.shape[0],                   # C
    1 + (iH + 2*pH - (((kH-1) * dH) + 1)) div sH, # H
    1 + (iW + 2*pW - (((kW-1) * dW) + 1)) div sW, # W
  ]
And you just check at runtime that the output is compatible with the next layer's input.

@mratsim thanks for the info dump! And I'd agree about the friction with static types.
Using static dimensions generates significant friction for both the developer and the user. There are very annoying or insurmountable challenges, like:
Definitely, static types aren't doable (yet). Though the recent slate of bug fixes has improved some of the issues I'd seen with Neo and trying to use its statics even 6 months ago.
Let's say you want to implement a flatten procedure that deletes dimensions of size 1. You can't, because the final dimensions depend on runtime data.
In that case you'd need the ability to do arithmetic on type-level variables, e.g. proc halve(m1: Matrix[M, N, T]): Matrix[M div 2, N div 2, T], or variations of that. Which has got me thinking about toying with DrNim, where the dimensions are purely extra metadata.
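A rough sketch of that kind of dimension arithmetic (again a hypothetical statically-sized Matrix, and the compiler may still balk at some variations), with the halved dimensions appearing directly in the return type:

type
  Matrix[M, N: static int; T] = object
    data: array[M * N, T]

# Dimension arithmetic in the signature: keep every second row and column.
proc halve[M, N: static int; T](m: Matrix[M, N, T]): Matrix[M div 2, N div 2, T] =
  for i in 0 ..< M div 2:
    for j in 0 ..< N div 2:
      result.data[i * (N div 2) + j] = m.data[(2 * i) * N + 2 * j]

var m: Matrix[6, 4, float64]
let h = halve(m)   # h is a Matrix[3, 2, float64]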
Killer problem: how do you load a custom model? You can't do let model = load("path/to/model.onnx"); you would need the end-user to precisely describe all the dimensions inside the model.
Yeah, there needs to be a way to wire a Tensor[F] into a Tensor[F, M, N, O], say. Neo has a dyn operator, but I don't think that'd scale.
Again, that makes me think even more that the dimensions would need to be "optional" attributes of the type. Then one could use DrNim, or even a macro, to verify them.
Though my context is graphics/game engineering, not ML/numeric processing, so my comments are not going to be very helpful here.
That's what I was trying to say: these are just too different domains and require different types, even though in math they use the same types.
For actual usable pieces, a set of concepts for dynamic math types should be doable, I'd think.
Though I'm not sure how you'd do a non-generic Tensor/Matrix/Vector type. How would you implement, say, a basic sum operator without knowing the data type?
proc sum*[T](m1: Matrix[T]): T =
  result = T.default() # <-- how does the compiler know what to do?
  for v in m1:
    result += v
I would even remove the T in Tensor[T], making the backend even more useful for datamancer and saving end users from having to tell me the type when loading a .npy file; it's already there after all.
I'm not sure I follow. Under the covers numpy has to have a switch on dtype where it calls into specialized functions for each data type it supports right? At some point you need functions that operate on a concrete float or int type.
Wouldn't a non-generic Tensor type limit users to only the types supported by that library? That'd be fine for ML libraries but less useful for generic scientific / numerical code.
Though the non_generic_generics.nim file is interesting. I've only read a bit of it.
Though I'm not sure how you'd do a non-generic Tensor/Matrix/Vector type. How would you implement, say, a basic sum operator without knowing the data type?
I'm not sure I follow. Under the covers numpy has to have a switch on dtype where it calls into specialized functions for each data type it supports right? At some point you need functions that operate on a concrete float or int type.
The inner sum implementation is generic, but on the outside you have an object variant.
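A minimal sketch of that pattern (hypothetical names, not Arraymancer's or Datamancer's actual API): the type-erased tensor carries its dtype as an object variant, and the outer proc dispatches into a generic inner implementation.

# Hypothetical type-erased tensor, for illustration only.
type
  DType = enum
    kFloat32, kFloat64, kInt64

  DynTensor = object
    case kind: DType
    of kFloat32: f32Data: seq[float32]
    of kFloat64: f64Data: seq[float64]
    of kInt64:   i64Data: seq[int64]

# The inner implementation is generic...
proc sumImpl[T](data: seq[T]): T =
  for v in data:
    result += v

# ...and the outer, type-erased entry point dispatches on the runtime dtype.
proc sum(t: DynTensor): float64 =
  case t.kind
  of kFloat32: float64(sumImpl(t.f32Data))
  of kFloat64: sumImpl(t.f64Data)
  of kInt64:   float64(sumImpl(t.i64Data))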
For example a PyTorch tensor is associated with the following metadata that arguably can be static:
type
  DeviceIndex = int16
  DeviceKind* {.importc: "c10::DeviceType",
                size: sizeof(int16).} = enum
    kCPU = 0
    kCUDA = 1
    kMKLDNN = 2
    kOpenGL = 3
    kOpenCL = 4
    kIDEEP = 5
    kHIP = 6
    kFPGA = 7
    kMSNPU = 8
    kXLA = 9
    kVulkan = 10

  Device* {.importc: "c10::Device", bycopy.} = object
    kind: DeviceKind
    index: DeviceIndex

type
  ScalarKind* {.importc: "torch::ScalarType",
                size: sizeof(int8).} = enum
    kUint8 = 0       # kByte
    kInt8 = 1        # kChar
    kInt16 = 2       # kShort
    kInt32 = 3       # kInt
    kInt64 = 4       # kLong
    kFloat16 = 5     # kHalf
    kFloat32 = 6     # kFloat
    kFloat64 = 7     # kDouble
    kComplexF16 = 8  # kComplexHalf
    kComplexF32 = 9  # kComplexFloat
    kComplexF64 = 10 # kComplexDouble
    kBool = 11
    kQint8 = 12      # Quantized int8
    kQuint8 = 13     # Quantized uint8
    kQint32 = 14     # Quantized int32
    kBfloat16 = 15   # Brain float16
Wouldn't a non-generic Tensor type limit users to only the types supported by that library? That'd be fine for ML libraries but less useful for generic scientific / numerical code.
Generic scientific/numerical libraries mostly use float64 or float32. Maybe int64 for constraint programming or integer linear programming.
Otherwise, the non-numerical types that do get used are basically strings.
So in 6 years of Arraymancer's existence, no one has mentioned using something that was not numerical or strings. I'd rather optimize for the ergonomics and usability of the 99.99% in that case.
One big limitation in the Python data science ecosystem is fragmentation between projects.
Is it? It's more like an advantage: multiple competing projects independently exploring and trying different paths and ideas.
multiple competing projects independently exploring and trying different paths and ideas.
Until you get to the "coloring" problem and you have to decide between library ecosystems. We can see the dangers in Rust's async-std vs tokio ecosystems, and the attempts to reconcile the two.
Additionally it's also great to see how some Julia projects combine diffeq libraries with ML libraries and such, which is only possible if they can share data types.
@alexeypetrushin but yeah, competition and variety can be good too. Concepts give a chance to achieve both. For example, let's say I write an FFT routine using the Vector concept and use Arraymancer for my project; someone else could swap in a JS-based vector library and reuse the FFT routine.
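A tiny sketch of that kind of reuse (hypothetical Vector concept, pinned to float64 and with a dot product standing in for the FFT for brevity): any backend whose type satisfies the concept can be passed to the same routine.

# Hypothetical concept, not Platonic's actual definition.
type
  Vector = concept v
    v.len is int
    v[0] is float64

proc dot(a, b: Vector): float64 =
  ## Works for any type fulfilling the concept, whether it wraps an
  ## Arraymancer tensor, a plain seq, or a JS typed array.
  assert a.len == b.len
  for i in 0 ..< a.len:
    result += a[i] * b[i]

# A plain seq[float64] already satisfies the concept:
echo dot(@[1.0, 2.0, 3.0], @[4.0, 5.0, 6.0])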
Additionally it's also great to see how some Julia projects combine diffeq libraries with ML libraries and such, which is only possible if they can share data types.
nitpick: they can share data types or interfaces/concepts.
Note that PyTorch, Tensorflow, ONNX and friends can load a model from a YAML, JSON or serialized file without recompilation, so layers at least need to be type-erased.
Nice! That makes sense, especially for end users.
Ideally there’d be a few possible levels of coding style. "Dynamic" is an overloaded name, so I just made them the standard top-level module names, i.e. “platonic/vector” is the non-static concept.
I did remove the generic number type from the concepts. Overall I think it’s cleaner and makes it possible to use runtime or concrete number types.
As you mentioned, there’s a small set of common scientific types, and just defining VectorF64 and VectorF32 concrete types should work fine for when you want to compile with typed arrays. Also, IIRC, Julia uses that naming pattern.
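For instance, the concrete variants might look roughly like this (a sketch, not the actual platonic definitions):

type
  VectorF64 = concept v      # element type pinned to float64
    v.len is int
    v[0] is float64

  VectorF32 = concept v      # element type pinned to float32
    v.len is int
    v[0] is float32

# Plain seqs already satisfy these:
doAssert @[1.0, 2.0] is VectorF64
doAssert @[1.0'f32, 2.0'f32] is VectorF32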
So I believe the non-static concepts should reduce the need to write code specific to either numpy-style generic arrays or compile-time typed arrays, hopefully avoiding the two-language problem that faces both PyTorch and TensorFlow!
Oh I plan to do the same for the statically sized concepts too.