Hi, I have a test in my linalg library that has been commented out for too long. This is annoying, and I have been trying to debug the error for a while without figuring out what the issue is. It seems to be memory-related.
In short: matrices are represented as follows:
    type
      Vector32[N: static[int]] = ref array[N, float32]
      Matrix32[M, N: static[int]] = object
        order: OrderType
        data: ref array[N * M, float32]
where OrderType is either rowMajor (the matrix is laid out row after row) or colMajor (column after column). I have iterators over rows and columns that copy the data into a new vector.
For efficiency reasons, I introduced unsafe iterators over rows (for rowMajor matrices) or columns (for colMajor matrices). These do not perform a copy, but of course the rows and columns thus obtained are bound to the lifetime of the matrix (hence unsafe).
I tried the obvious cast to a ref
    iterator rowsUnsafe*[M, N: static[int]](m: Matrix32[M, N]): Vector32[N] {.inline.} =
      if m.order == rowMajor:
        for i in 0 ..< M:
          yield cast[ref array[N, float32]](addr(m.data[i * N]))
      else:
        raise newException(AccessViolationError, "Cannot access rows in an unsafe way")
and it mostly looks like it works.
But, in the commented test above, I get memory violation errors. I guess it has to do with the fact that by casting to a ref I am deliberately confusing the garbage collector.
Is there a way to investigate exactly what happens here?
I am open to improving the design of the library, but I am not sure what to do.
On the one hand, I would like vectors to be garbage collected - the library ought to be as easy to use as numpy.
On the other hand, I would avoid a copy while scanning a matrix - since vectors are garbage collected, the cost of allocations here may dwarf the cost of doing the actual operations.
Any idea what to do?
I would certainly use ptr if I found how! :-)
The issue is that clients iterating over rows need to get instances of Vector (in order to apply the existing operations). So, using ptr here would force me to define Vector using a ptr. That would mean that vectors - even in other contexts - would not be garbage collected.
I still have to find a design that accommodates all use cases...
In any case, thank you for the pointers to -d:useSysAssert -d:useGcAssert!
A ref type assumes a reference to GCed memory (and needs the reference count and type information that are stored alongside the actual data). You cannot cast a pointer to the middle of a memory area to a ref type [1]. You can use a ptr type instead (assuming you don't mind it being unsafe), but that will obviously result in a type that's different from Vector32. You could define an UnsafeVector32 type as ptr array[N, float32], but that would obviously require you to duplicate all operations on Vector32.
What you need in practice is automatic dereferencing (currently available only through {.experimental.} mode, and then only for the first argument of a procedure). This way you could define operations on array[N, float32] that would transparently work for both ref and ptr types. There's nothing you can do about that other than bugging Araq, though. :)
(Also, the Boehm GC should make this problem go away, but obviously isn't a solution for a portable library.)
[1] Well, you can if you don't mind it being non-portable and you never ever – explicitly or implicitly – store the reference on the heap (note that storing a reference on the heap can even happen as a side effect of having it in a closure environment, so you have to be very careful about controlling where such fake references go).
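For what it's worth, the ptr-based alternative might look roughly like the sketch below. To keep it short and self-contained it uses runtime dimensions and a flat seq rather than the static[int] generics from the original post; Matrix, RowView, newMatrix and rowSum are made-up names for illustration, not part of the library.

```nim
type
  Matrix = object
    rows, cols: int
    data: ref seq[float32]   # row-major storage, owned by the GC

  # Hypothetical non-owning view of one row. Being a ptr, it carries
  # no GC header, so it may legally point into the middle of a buffer.
  RowView = ptr UncheckedArray[float32]

proc newMatrix(rows, cols: int): Matrix =
  result.rows = rows
  result.cols = cols
  new result.data
  result.data[] = newSeq[float32](rows * cols)

iterator rowsUnsafe(m: Matrix): RowView =
  # No copy is made; each view is only valid while m is alive
  # and its seq is not resized.
  for i in 0 ..< m.rows:
    yield cast[RowView](addr m.data[][i * m.cols])

proc rowSum(v: RowView, n: int): float32 =
  # The view does not know its own length, so the caller passes it.
  for j in 0 ..< n:
    result += v[j]

var m = newMatrix(2, 3)
for k in 0 ..< 6:
  m.data[][k] = float32(k)

var sums: seq[float32] = @[]
for row in m.rowsUnsafe:
  sums.add rowSum(row, m.cols)
echo sums
```

The downside is exactly the one described above: rowSum has to be written against the ptr type, so every operation that already exists for the ref-based vector would need a second version.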
    type
      FloatStorage = ref seq[float32]
      Vector32[N: static[int]] = object
        storage: FloatStorage
        data: ptr float32
      Matrix32[M, N: static[int]] = object
        storage: FloatStorage
        data: ptr float32
So these are basically smart pointers. The storage reference always points to the base of the heap-allocated object, and the data field points to some arbitrary location inside that storage. You could convert a matrix to a vector without copying by keeping the same storage reference.
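A minimal sketch of how that could work, under some assumptions: sizes are runtime values instead of static[int] parameters, data is a ptr UncheckedArray[float32] rather than a plain ptr float32 so it can be indexed directly, and the matrix is non-empty and row-major. The procs newMatrix and row and the accessors are hypothetical names.

```nim
type
  FloatStorage = ref seq[float32]

  Vector = object
    storage: FloatStorage               # keeps the whole buffer alive
    data: ptr UncheckedArray[float32]   # first element of this vector
    len: int

  Matrix = object
    storage: FloatStorage
    data: ptr UncheckedArray[float32]
    rows, cols: int

proc newMatrix(rows, cols: int): Matrix =
  # Assumes rows * cols > 0 and that the seq is never resized afterwards.
  new result.storage
  result.storage[] = newSeq[float32](rows * cols)
  result.data = cast[ptr UncheckedArray[float32]](addr result.storage[][0])
  result.rows = rows
  result.cols = cols

proc row(m: Matrix, i: int): Vector =
  # No copy: the vector shares m's storage and points at row i.
  Vector(storage: m.storage,
         data: cast[ptr UncheckedArray[float32]](addr m.data[i * m.cols]),
         len: m.cols)

proc `[]`(v: Vector, j: int): float32 = v.data[j]
proc `[]=`(v: Vector, j: int, x: float32) = v.data[j] = x

var r: Vector
block:
  let m = newMatrix(2, 3)
  r = m.row(1)
  r[0] = 42.0
  doAssert m.data[3] == 42.0   # the write went through to the matrix

# m is out of scope here, but r.storage still references the buffer.
echo r[0], " ", r.len
```

Since every view carries the storage reference, a row obtained this way stays valid after the matrix itself is gone, which is what the raw ptr approach cannot offer.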
After thinking about it, I believe that generics via type classes may offer an alternative. For example:
    type
      Vector32Array[N: static[int]] = array[N, float32]
      Vector32[N: static[int]] = ref Vector32Array[N]
      ArgVector32[N: static[int]] =
        ref Vector32Array[N] | ptr Vector32Array[N]

    proc fill[N: static[int]](v: ArgVector32[N], f: float32) =
      for i in 0..N-1:
        v[i] = f

    proc `$`[N: static[int]](v: ArgVector32[N]): string =
      result = "[ "
      for i in 0..N-1:
        if i > 0:
          add result, ", "
        add result, $v[i]
      add result, ']'

    proc `+`[N: static[int]](v1: ArgVector32[N], v2: ArgVector32[N]): Vector32[N] =
      new result
      for i in 0..N-1:
        result[i] = v1[i] + v2[i]

    var v: Vector32Array[3]
    let x = addr v
    var y: Vector32[3]
    new y
    fill(x, 1.0); echo x
    fill(y, 2.0); echo y
    fill(addr y[], 3.0); echo y
    echo x + x
    echo y + y
    # echo x + y # this doesn't work
However, this (currently) has a few problems. One, if two or more arguments (as in the + operator above) are generic parameters that depend on the same type class, they cannot be different types. Hence echo x + y doesn't work (this is arguably a bug). Two, it would be nice to be able to write ref T | ptr T | var T, but the var part goes away during generic instantiation (hence we can't write stuff like fill(v, 1.0) above). Three, it really only works for function arguments, not for variables or return values [1] (similar to how openarray is a unifying interface for array and seq, but only if used as an argument).
[1] Though you can do proc example[N: static[int]](v: ArgVector32[N]): type(v) = v
@Jehan: I was curious so I changed this:
    proc `+`[N: static[int]](v1: ArgVector32[N], v2: ArgVector32[N]): Vector32[N] =
      ...
to this instead:
    type
      ...
      ArgVector32[N: static[int]] = ref Vector32Array[N] | ptr Vector32Array[N]
      ArgVector32b[N: static[int]] = ref Vector32Array[N] | ptr Vector32Array[N]

    proc `+`[N: static[int]](v1: ArgVector32[N], v2: ArgVector32b[N]): Vector32[N] =
      ...
And then the echo x + y line works just fine. Surely that's considered a bug, and it's good to be aware of (both the bug and the workaround). I think I've scratched my head over that in the past without realizing it.