What would be the equivalent of the following code in Nim (especially the rigidbody_t structure and physics_integrate function)?
I want to create job functions running on long-lived worker threads (number of cores - 1) that act on batches of data (Structure of Arrays, or Array of Structures aligned to a cache line) to reduce cache misses.
Is there an easy way to do this in Nim?
Sure.
type
  vector4_t* {.bycopy.} = object
    x*: cfloat
    y*: cfloat
    z*: cfloat
    w*: cfloat

const
  RIGIDBODY_SIZE* = 64

type
  rigidbody_t* {.bycopy.} = object
    len*: int
    position*: array[RIGIDBODY_SIZE, vector4_t]
    velocity*: array[RIGIDBODY_SIZE, vector4_t]

proc vector4_add*(a, b: vector4_t): vector4_t =
  ## Component-wise addition (assumed to match the C helper of the same name).
  vector4_t(x: a.x + b.x, y: a.y + b.y, z: a.z + b.z, w: a.w + b.w)

proc physics_integrate*(rigidbody: var rigidbody_t) =
  ## `var` is required so the stores below write back to the caller's batch.
  for i in 0 ..< rigidbody.len:
    # Load
    let position = rigidbody.position[i]
    let velocity = rigidbody.velocity[i]
    # Transform
    let sum = vector4_add(position, velocity)
    # Store
    rigidbody.position[i] = sum
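One detail worth checking is that the proc really mutates the caller's batch, which needs a `var` parameter and stores that go to the batch itself rather than to a local copy of the arrays. Below is a minimal sketch that restates the types compactly so it compiles on its own; the `batch` variable and the sample values are made up for illustration, and `vector4_add` is assumed to be a plain component-wise add.

```nim
type
  vector4_t = object
    x, y, z, w: cfloat

const RIGIDBODY_SIZE = 64

type
  rigidbody_t = object
    len: int
    position: array[RIGIDBODY_SIZE, vector4_t]
    velocity: array[RIGIDBODY_SIZE, vector4_t]

proc vector4_add(a, b: vector4_t): vector4_t =
  # Assumed behaviour: component-wise addition.
  vector4_t(x: a.x + b.x, y: a.y + b.y, z: a.z + b.z, w: a.w + b.w)

proc physics_integrate(rigidbody: var rigidbody_t) =
  # Only the first `len` slots are live; the rest of the batch is untouched.
  for i in 0 ..< rigidbody.len:
    rigidbody.position[i] = vector4_add(rigidbody.position[i],
                                        rigidbody.velocity[i])

# Illustrative data: two live bodies in a 64-slot batch.
var batch: rigidbody_t
batch.len = 2
batch.position[0] = vector4_t(x: 1.0, y: 2.0, z: 3.0, w: 0.0)
batch.velocity[0] = vector4_t(x: 0.5, y: 0.5, z: 0.5, w: 0.0)
batch.velocity[1] = vector4_t(x: 1.0, y: 0.0, z: 0.0, w: 0.0)

physics_integrate(batch)
echo batch.position[0]
echo batch.position[1]
```

Passing the batch without `var` would make it immutable inside the proc, so the store would not compile.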
A type-level alignment pragma is under active development in https://github.com/nim-lang/Nim/pull/11077
Until that lands, you have to annotate every variable individually, like so:
{.pragma: align16, codegenDecl: "$# $# __attribute__((aligned(16)))".}
let a {.align16.} = [float32 1.0, 2.0, 3.0, 4.0]
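To confirm that the attribute actually took effect, you can inspect the variable's address at run time. This is only a sketch: it assumes a GCC- or Clang-compatible C backend (the `__attribute__` syntax is compiler-specific), and `isAligned` is a hypothetical helper written for this example, not a standard library proc.

```nim
# Assumes GCC or Clang as the C backend.
{.pragma: align16, codegenDecl: "$# $# __attribute__((aligned(16)))".}

proc isAligned(p: pointer, boundary: uint): bool =
  ## Hypothetical helper: true when the address is a multiple of `boundary`.
  cast[uint](p) mod boundary == 0

var a {.align16.} = [float32 1.0, 2.0, 3.0, 4.0]

doAssert isAligned(addr a, 16)
echo "a is 16-byte aligned"
```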
I suggest you use one of the following to enforce alignment at the moment:
# X86-only
when defined(vcc):
  {.pragma: x86_type, byCopy, header:"<intrin.h>".}
  {.pragma: x86, noDecl, header:"<intrin.h>".}
else:
  {.pragma: x86_type, byCopy, header:"<x86intrin.h>".}
  {.pragma: x86, noDecl, header:"<x86intrin.h>".}

type
  vector4_t* {.importc: "__m128", x86_type.} = array[4, float32]

template x(v: vector4_t): float32 =
  v[0]
template y(v: vector4_t): float32 =
  v[1]
template z(v: vector4_t): float32 =
  v[2]
template w(v: vector4_t): float32 =
  v[3]

template `x=`(v: vector4_t, val: float32) =
  v[0] = val
template `y=`(v: vector4_t, val: float32) =
  v[1] = val
template `z=`(v: vector4_t, val: float32) =
  v[2] = val
template `w=`(v: vector4_t, val: float32) =
  v[3] = val

var a: vector4_t
a.x = 1
a.y = 2
a.z = 3
a.w = 4

echo a.x
echo a.y
echo a.z
echo a.w
# Note: "echo a" seems to be broken
Or
# GCC/Clang only, but works on x86, ARM, MIPS ...
{.emit: """
typedef float vector4_t __attribute__ ((vector_size (16)));
""".}

type vector4_t {.bycopy, importc.} = array[4, float32]

template x(v: vector4_t): float32 =
  v[0]
template y(v: vector4_t): float32 =
  v[1]
template z(v: vector4_t): float32 =
  v[2]
template w(v: vector4_t): float32 =
  v[3]

template `x=`(v: vector4_t, val: float32) =
  v[0] = val
template `y=`(v: vector4_t, val: float32) =
  v[1] = val
template `z=`(v: vector4_t, val: float32) =
  v[2] = val
template `w=`(v: vector4_t, val: float32) =
  v[3] = val

var a: vector4_t
a.x = 1
a.y = 2
a.z = 3
a.w = 4

echo a.x
echo a.y
echo a.z
echo a.w
# Note: "echo a" seems to be broken
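For the worker-thread part of the question, one possible shape is a fixed set of long-lived threads that pull batch pointers from a shared channel. This is a rough sketch rather than a tuned job system: `Batch`, `Job`, `worker`, and `integrateOnWorkers` are hypothetical names, the batch holds plain `float32` values to keep the example short, and it relies on the built-in `Channel` and `Thread` types (compile with `--threads:on` on Nim versions where that is not the default).

```nim
import std/cpuinfo

const BatchSize = 64

type
  Batch = object
    len: int
    position: array[BatchSize, float32]
    velocity: array[BatchSize, float32]
  Job = object
    batch: ptr Batch            # nil is used as the "shut down" signal

var jobs: Channel[Job]          # main thread -> workers
var done: Channel[bool]         # workers -> main thread, one message per batch

proc integrate(b: var Batch) =
  for i in 0 ..< b.len:
    b.position[i] += b.velocity[i]

proc worker() {.thread.} =
  # Long-lived loop: block on the queue until a nil job arrives.
  while true:
    let job = jobs.recv()
    if job.batch == nil: break
    integrate(job.batch[])
    done.send(true)

proc integrateOnWorkers(batches: var seq[Batch]) =
  let nWorkers = max(1, countProcessors() - 1)
  jobs.open()
  done.open()
  var threads = newSeq[Thread[void]](nWorkers)
  for i in 0 ..< nWorkers:
    createThread(threads[i], worker)
  # Each job is just a pointer, so no batch data is copied.
  for i in 0 ..< batches.len:
    jobs.send(Job(batch: addr batches[i]))
  for i in 0 ..< batches.len:
    discard done.recv()         # wait until every batch has been processed
  for i in 0 ..< nWorkers:
    jobs.send(Job(batch: nil))  # one shutdown signal per worker
  for i in 0 ..< nWorkers:
    joinThread(threads[i])
  jobs.close()
  done.close()

var batches = newSeq[Batch](8)
for b in batches.mitems:
  b.len = BatchSize
  for i in 0 ..< BatchSize:
    b.position[i] = 1.0
    b.velocity[i] = 0.5

integrateOnWorkers(batches)
echo batches[0].position[0]
```

In a real engine the threads would be created once at startup and kept alive across frames instead of being joined after a single pass; the sketch joins them only so that it terminates cleanly.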