I would like to use cudaHostAlloc as a custom allocator.
It's a replacement for C's malloc that allocates page-locked (pinned) host memory.
Here is the Nim binding:
proc cudaHostAlloc*(pHost: ptr pointer;
                    size: csize;
                    flags: cuint): cudaError_t
  {.cdecl, importc: "cudaHostAlloc", dynlib: "libcudart.so".}
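For context, here is a minimal sketch of how a binding like this might be wrapped into a malloc-style pair of procs. It assumes the CUDA runtime is installed and loadable as "libcudart.so" and a recent Nim (csize_t instead of csize). The helper names pinnedAlloc and pinnedFree are hypothetical, and cudaError_t is simplified to a cint rather than the full C enum.

```nim
# Sketch only: needs the CUDA runtime at run time.
const libcudart = "libcudart.so"  # assumption: adjust to your platform

type cudaError_t = cint  # simplification: the real type is a C enum

const
  cudaSuccess = cudaError_t(0)
  cudaHostAllocDefault = cuint(0)

proc cudaHostAlloc(pHost: ptr pointer; size: csize_t; flags: cuint): cudaError_t {.
  cdecl, importc: "cudaHostAlloc", dynlib: libcudart.}
proc cudaFreeHost(p: pointer): cudaError_t {.
  cdecl, importc: "cudaFreeHost", dynlib: libcudart.}

proc pinnedAlloc(size: Natural): pointer =
  ## Hypothetical helper: returns page-locked host memory or raises.
  let err = cudaHostAlloc(addr result, csize_t(size), cudaHostAllocDefault)
  if err != cudaSuccess:
    raise newException(OSError, "cudaHostAlloc failed with code " & $err)

proc pinnedFree(p: pointer) =
  ## Hypothetical helper: releases memory obtained from pinnedAlloc.
  let err = cudaFreeHost(p)
  if err != cudaSuccess:
    raise newException(OSError, "cudaFreeHost failed with code " & $err)
```

Memory obtained this way has to be released with cudaFreeHost, not dealloc, so the two helpers should always be used as a pair.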
Questions
I had a look at Nim's memory regions. I have a feeling that they may help, but I don't understand how to use them in practice.
type
  UncheckedArray {.unchecked.}[T] = array[0..100_000_000, T]
  PinnedArray[T] = ref object
    len: int
    data: UncheckedArray[T]

# Allocate 50 kB
# this will be replaced by cudaHostAlloc code
var memRegion = alloc(50_000)

# Doesn't compile
var foo: PinnedArray[int] ptr memRegion
# I get "Error: type expected"
# Nimsuggest also says "region needs to be an object type"
Well, just following that manual section and the compiler messages: replace memRegion (a variable) in the last line with a type, namely the type you want to point to (PinnedArray[int] in your case), and to the left of ptr put another object type that serves to distinguish your memory region, like:
type Cuda = object # nothing needed inside, serves just as a mark
var foo: Cuda ptr PinnedArray[int] # read this as "Cuda-pointer to PinnedArray[int]"
Nimsuggest also says "region needs to be an object type"
That's explicit in that manual section.
UPD
Something that compiles, though with too many casts, and it may not be the best fit for your needs:
type
  UncheckedArray {.unchecked.}[T] = array[0..100_000_000, T]
  PinnedArray[T] = object
    len: int
    data: ptr UncheckedArray[T]
  Cuda = object

var foo: Cuda ptr PinnedArray[int]
foo = cast[ptr[Cuda, PinnedArray[int]]](alloc sizeOf(PinnedArray[int]))
# room for 50_000 ints (not 50_000 bytes), so that len matches the buffer
foo.data = cast[ptr UncheckedArray[int]](alloc(50_000 * sizeOf(int)))
foo.len = 50_000
foo.data[][2] = 7
echo foo.data[][2]
Thanks, that's really cool.
If I can't use a custom allocator or a custom memory region with "new" or "newSeq" in the next couple of months, I will probably go with that.
While direct support for seqs and strings is not here (it's planned, according to the manual), you can wrap them in objects.
type
  MySeq[T] = object
    data: seq[T]
  Cuda = object

# alloc0 rather than alloc: the seq field must start out zeroed (nil)
# before it is assigned to
var foo = cast[ptr[Cuda, MySeq[int]]](alloc0 sizeOf(MySeq[int]))
foo.data = newSeq[int]()
foo.data.add 7
echo foo.data[0]
Or with a constructor and wrapped procs:
proc newMySeq[T](size = 0.Natural): Cuda ptr MySeq[T] =
  # alloc0 so that the seq field starts out zeroed
  result = cast[ptr[Cuda, MySeq[T]]](alloc0 sizeOf(MySeq[T]))
  result.data = newSeq[T](size)

proc add[T](s: Cuda ptr MySeq[T], v: T) = s.data.add v
proc `[]`[T](s: Cuda ptr MySeq[T], i: int): T = s.data[i]
for more convenient usage:
var s = newMySeq[float]()
s.add 5
echo s[0]
Interesting, however if I understand this correctly:
proc newMySeq[T](size = 0.Natural): Cuda ptr MySeq[T] =
  result = cast[ptr[Cuda, MySeq[T]]](alloc sizeOf(MySeq[T]))
  result.data = newSeq[T](size)
`result = ...` uses the custom allocator for the object that holds result.data, but result.data itself is still allocated via the default allocator within newSeq.
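One way around this, as a rough sketch rather than a definitive design, is to drop the seq and keep the payload behind a raw pointer, so that every byte comes from the allocator you choose. The names PinnedSeq, customAlloc, customFree, newPinnedSeq and release are hypothetical. The sketch assumes a recent Nim where UncheckedArray[T] is built in, and customAlloc/customFree simply wrap alloc0/dealloc as stand-ins for cudaHostAlloc/cudaFreeHost.

```nim
type
  PinnedSeq[T] = object
    len: int
    data: ptr UncheckedArray[T]  # payload is not managed by the GC

proc customAlloc(size: Natural): pointer =
  # Stand-in allocator: swap the body for a cudaHostAlloc call.
  alloc0(size)

proc customFree(p: pointer) =
  # Stand-in deallocator: swap the body for a cudaFreeHost call.
  dealloc(p)

proc newPinnedSeq[T](n: Natural): PinnedSeq[T] =
  ## Allocates room for n zero-initialized elements via customAlloc.
  result.len = n
  if n > 0:
    result.data = cast[ptr UncheckedArray[T]](customAlloc(n * sizeof(T)))

proc release[T](s: var PinnedSeq[T]) =
  ## Must be called manually: nothing frees the payload automatically.
  if s.data != nil:
    customFree(s.data)
    s.data = nil
  s.len = 0

proc `[]`[T](s: PinnedSeq[T]; i: int): T =
  assert i >= 0 and i < s.len, "index out of bounds"
  s.data[i]

proc `[]=`[T](s: var PinnedSeq[T]; i: int; v: T) =
  assert i >= 0 and i < s.len, "index out of bounds"
  s.data[i] = v

var s = newPinnedSeq[float](4)
s[2] = 7.0
echo s[2]
s.release()
```

The trade-off is that the container has a fixed size and needs an explicit release call, and it is only safe for element types that hold no GC-managed fields (plain numbers, for instance).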