I am currently looking for ways to generate CUDA C code.
For example, consider the following fake Hello World (yes, the GPU kernel is an empty function):
#include <stdio.h>
#include <iostream>

__global__ void kernel(void) {
}

int main(void) {
    kernel<<<1,1>>>();
    printf("Hello, World!\n");
    return 0;
}
For the __global__ qualifier in the proc declaration, I can create a new pragma with codegenDecl, similar to the following one that requests alignment of a variable:
{.pragma: align16, codegenDecl: "$# $# __attribute__((aligned(16)))".}
var foo {.align16.}: array[100, int]
I just have to figure out the $# placeholders.
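For what it's worth, my reading of the Nim manual is that the $# placeholders are positional: for a variable they stand for the C type and then the C name, and for a proc they stand for the return type, the name, and the parameter list. Below is a minimal sketch under that assumption. The pragma name gpuKernel is made up for illustration, and __attribute__((noinline)) is only a stand-in for __global__ so that the snippet builds with plain gcc or clang; swapping in __global__ would additionally require compiling the generated code with nvcc.

```nim
# Variable form: the first $# is the C type, the second $# is the C name.
{.pragma: align16, codegenDecl: "$# $# __attribute__((aligned(16)))".}
var foo {.align16.}: array[100, int]

# Proc form: the three $# are return type, name and parameter list.
# gpuKernel is a hypothetical pragma name. noinline stands in for
# __global__ (an assumption) so this builds without nvcc.
{.pragma: gpuKernel, codegenDecl: "__attribute__((noinline)) $# $#$#".}
proc kernel(x: int): int {.gpuKernel.} =
  x * x

echo kernel(4)
# Remainder of the address of foo modulo 16 (0 when the attribute is honoured).
echo cast[uint](addr foo) mod 16
```

Inspecting the generated C file in the nimcache directory is probably the quickest way to confirm where each placeholder ends up.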
Now, for the chevron notation in the function call, where should I start? Are the mixins mentioned in the macros module the way to generate custom C?
Note: I'm aware of cudanim. However, instead of generating
kernel<<<1,1>>>();
it generates something like the following, which is a functionally equivalent alternative:
cudaLaunchKernel(1,1, kernel)
There is little you cannot do with emit:
template notSure(x, y) =
  {.emit: "kernel<<<", x, ", ", y, ">>>();".}

notSure(1, 1)
After testing, the proper syntax is:
template squareCuda(bpg, tpb: int, y: var GpuArray, x: GpuArray) =
  ## Compute the square of x and store it in y
  ## bpg: BlocksPerGrid
  ## tpb: ThreadsPerBlock
  ## Output square<<<bpg, tpb>>>(y,x)
  {.emit: ["""square<<<""", bpg.cint, """,""", tpb.cint, """>>>(""", y.data[], """,""", x.data[], """);"""].}
The triple quotes are a bit verbose. I tried with backticks as well: "kernel<<<`bpg`,`tpb`>>>(`y`,`x`);"
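To compare the two emit styles side by side, here is a small sketch that builds with a plain C compiler. A C addition stands in for the kernel launch, and the proc names addArrayForm and addBacktickForm are made up for illustration. My understanding, which I have not verified against the compiler source, is that backticks only resolve plain identifiers, so expressions such as bpg.cint or y.data[] would need the array form. Ordinary double-quoted strings also appear to be enough for the fragments when they contain no quotes themselves.

```nim
# Plain C addition stands in for the kernel launch so this builds without nvcc.

proc addArrayForm(a, b: cint): cint =
  # Array form: string fragments interleaved with Nim expressions.
  {.emit: [result, " = ", a, " + ", b, ";"].}

proc addBacktickForm(a, b: cint): cint =
  # Backtick form: Nim identifiers are referenced inside a single string.
  {.emit: "`result` = `a` + `b`;".}

echo addArrayForm(1, 2)
echo addBacktickForm(3, 4)
```

Inside a template the parameters are not procs' locals but substituted expressions, which may be why the backtick version behaves differently there; wrapping the emit in a proc, as above, is one way to sidestep that.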