nimforum mirror - Data-oriented design, interleaved/noninterleaved data batches (AoS, SoA), worker threads etc.

deorder [2019-05-11T03:58:59+02:00]

What would be the equivalent of the following code in Nim (especially the rigidbody_t structure and physics_integrate function)?

https://pastebin.com/gXkBtHCE

I want to create job functions running on (long-living) worker threads (#cores - 1) that can act on batches of data (Structure of Arrays or Array of Structures aligned to cacheline) to reduce cache misses.

Is there an easy way to do this in Nim?

Araq [2019-05-11T10:29:36+02:00]

Sure.

type
  vector4_t* {.bycopy.} = object
    x*: cfloat
    y*: cfloat
    z*: cfloat
    w*: cfloat


const
  RIGIDBODY_SIZE* = 64

type
  rigidbody_t* {.bycopy.} = object
    len*: int
    position*: array[RIGIDBODY_SIZE, vector4_t]
    velocity*: array[RIGIDBODY_SIZE, vector4_t]


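proc vector4_add*(a, b: vector4_t): vector4_t =
  ##  Component-wise add (assumed to mirror the C helper of the same name)
  vector4_t(x: a.x + b.x, y: a.y + b.y, z: a.z + b.z, w: a.w + b.w)

proc physics_integrate*(rigidbody: var rigidbody_t) =
  ##  `var` is required so the stores below reach the caller's batch.
  ##  Nim arrays are value types, so the batch is indexed in place
  ##  instead of being copied into locals first.
  for i in 0 ..< rigidbody.len:
    ##  Load
    let position = rigidbody.position[i]
    let velocity = rigidbody.velocity[i]
    ##  Transform
    let sum = vector4_add(position, velocity)
    ##  Store
    rigidbody.position[i] = sum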
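
The object above keeps all positions in one array and all velocities in another, which is the noninterleaved (SoA) layout from the thread title. For contrast, here is a minimal sketch of both layouts side by side. The names `Vec4`, `Body`, `BodiesAoS`, `BodiesSoA` and the batch size `N` are made up for this illustration and do not come from the pastebin.

```nim
const N = 4   # illustrative batch size

type
  Vec4 = object
    x, y, z, w: float32

  # Interleaved (AoS): position and velocity of one body sit next to each other.
  Body = object
    position, velocity: Vec4
  BodiesAoS = array[N, Body]

  # Noninterleaved (SoA): all positions first, then all velocities.
  BodiesSoA = object
    position, velocity: array[N, Vec4]

proc `+`(a, b: Vec4): Vec4 =
  Vec4(x: a.x + b.x, y: a.y + b.y, z: a.z + b.z, w: a.w + b.w)

proc integrate(bodies: var BodiesAoS) =
  for i in 0 ..< N:
    bodies[i].position = bodies[i].position + bodies[i].velocity

proc integrate(bodies: var BodiesSoA) =
  for i in 0 ..< N:
    bodies.position[i] = bodies.position[i] + bodies.velocity[i]

var aos: BodiesAoS
var soa: BodiesSoA
for i in 0 ..< N:
  let v = Vec4(x: float32(i), y: 1.0, z: 0.0, w: 0.0)
  aos[i].velocity = v
  soa.velocity[i] = v

integrate(aos)
integrate(soa)

# Both layouts compute the same positions.
for i in 0 ..< N:
  doAssert aos[i].position == soa.position[i]

# Consecutive positions are sizeof(Body) bytes apart in the AoS layout
# and sizeof(Vec4) bytes apart in the SoA layout.
doAssert sizeof(Vec4) == 16
doAssert sizeof(Body) == 32
```

Which layout causes fewer cache misses depends on the access pattern: a loop that reads position and velocity together touches the same amount of memory either way, while a loop that reads only positions skips the velocities entirely in the SoA layout.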

deorder [2019-05-11T12:45:34+02:00]

Nice. That looks really good :D
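
The worker-thread half of the question can be sketched with the `Thread` and `Channel` types from Nim's standard library. This is only one possible arrangement, not something proposed in the thread: the names `Batch`, `Job`, `worker`, `runFrame`, the batch count, and the job size are all assumptions made for the example, and the scalar `posX`/`velX` arrays stand in for the vector fields above.

```nim
# Build with --threads:on (already the default on Nim 2.x).
import std/cpuinfo

const
  BatchSize = 64
  NumBatches = 16

type
  Batch = object
    len: int
    posX, velX: array[BatchSize, float32]  # noninterleaved components
  Job = object
    first, stop: int   # half-open range of batch indices; stop < 0 means "quit"

var
  batches: array[NumBatches, Batch]  # plain data without GC'd fields, shared by all workers
  jobs: Channel[Job]                 # main thread -> workers
  done: Channel[int]                 # workers -> main thread, one message per finished job

proc integrate(b: var Batch) =
  for i in 0 ..< b.len:
    b.posX[i] += b.velX[i]

proc worker(id: int) {.thread.} =
  # Long-living: keeps taking jobs until it receives the quit sentinel.
  while true:
    let job = jobs.recv()
    if job.stop < 0: break
    for i in job.first ..< job.stop:
      integrate(batches[i])
    done.send(id)

proc runFrame(jobSize: int) =
  ## Splits all batches into jobs and blocks until every job has finished.
  var pending = 0
  var first = 0
  while first < NumBatches:
    jobs.send(Job(first: first, stop: min(NumBatches, first + jobSize)))
    inc pending
    first += jobSize
  for _ in 1 .. pending:
    discard done.recv()

for b in 0 ..< NumBatches:
  batches[b].len = BatchSize
  for i in 0 ..< BatchSize:
    batches[b].velX[i] = float32(b)

jobs.open()
done.open()
let numWorkers = max(1, countProcessors() - 1)   # #cores - 1, at least one
var workers = newSeq[Thread[int]](numWorkers)
for id in 0 ..< numWorkers:
  createThread(workers[id], worker, id)

runFrame(jobSize = 4)   # frame 1
runFrame(jobSize = 4)   # frame 2 reuses the same worker threads

for _ in 0 ..< numWorkers:
  jobs.send(Job(first: 0, stop: -1))   # one quit sentinel per worker
for id in 0 ..< numWorkers:
  joinThread(workers[id])
jobs.close()
done.close()

doAssert batches[3].posX[0] == 6.0'f32   # two frames at velocity 3
```

Each job covers a disjoint range of batches, so no two workers write to the same batch, and the `done` channel is what lets the main thread know a frame is complete before it reads the results.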

mratsim [2019-05-11T14:03:44+02:00]

The alignment pragma at type level is in heavy development at https://github.com/nim-lang/Nim/pull/11077

Otherwise you have to declare all your variables like so:

{.pragma: align16, codegenDecl: "$# $# __attribute__((aligned(16)))".}
let a {.align16.} = [float32 1.0, 2.0, 3.0, 4.0]
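
For readers on a newer compiler: the Nim manual now documents an `align` pragma for variables and object fields. Below is a minimal sketch, assuming a compiler recent enough to support it, of aligning a batch to a cache line and checking the result at run time. The 64-byte line size and the names `CacheLine` and `Batch` are assumptions for the example.

```nim
const
  CacheLine = 64   # assumed cache-line size in bytes
  BatchSize = 64

type
  Batch = object
    len: int
    # Each array starts on its own cache line.
    posX {.align(CacheLine).}: array[BatchSize, float32]
    velX {.align(CacheLine).}: array[BatchSize, float32]

var batch: Batch
var v {.align(16).}: array[4, float32] = [float32 1.0, 2.0, 3.0, 4.0]

# The addresses must be multiples of the requested alignment.
doAssert cast[uint](addr batch.posX) mod uint(CacheLine) == 0
doAssert cast[uint](addr batch.velX) mod uint(CacheLine) == 0
doAssert cast[uint](addr v) mod 16'u == 0
```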

I suggest you use one of the following to enforce alignment at the moment:

# X86-only
when defined(vcc):
  {.pragma: x86_type, byCopy, header:"<intrin.h>".}
  {.pragma: x86, noDecl, header:"<intrin.h>".}
else:
  {.pragma: x86_type, byCopy, header:"<x86intrin.h>".}
  {.pragma: x86, noDecl, header:"<x86intrin.h>".}
type
  vector4_t* {.importc: "__m128", x86_type.} = array[4, float32]

template x(v: vector4_t): float32 =
  v[0]
template y(v: vector4_t): float32 =
  v[1]
template z(v: vector4_t): float32 =
  v[2]
template w(v: vector4_t): float32 =
  v[3]
template `x=`(v: vector4_t, val: float32) =
  v[0] = val
template `y=`(v: vector4_t, val: float32) =
  v[1] = val
template `z=`(v: vector4_t, val: float32) =
  v[2] = val
template `w=`(v: vector4_t, val: float32) =
  v[3] = val


var a: vector4_t

a.x = 1
a.y = 2
a.z = 3
a.w = 4
echo a.x
echo a.y
echo a.z
echo a.w

## Note "echo a" seems to be broken

# GCC/Clang only, but works on x86, ARM, MIPS ...
{.emit: """
typedef float vector4_t __attribute__ ((vector_size (16)));
""".}

type vector4_t {.bycopy, importc.} = array[4, float32]

template x(v: vector4_t): float32 =
  v[0]
template y(v: vector4_t): float32 =
  v[1]
template z(v: vector4_t): float32 =
  v[2]
template w(v: vector4_t): float32 =
  v[3]
template `x=`(v: vector4_t, val: float32) =
  v[0] = val
template `y=`(v: vector4_t, val: float32) =
  v[1] = val
template `z=`(v: vector4_t, val: float32) =
  v[2] = val
template `w=`(v: vector4_t, val: float32) =
  v[3] = val


var a: vector4_t

a.x = 1
a.y = 2
a.z = 3
a.w = 4
echo a.x
echo a.y
echo a.z
echo a.w


## Note "echo a" seems to be broken

deorder [2019-05-11T15:52:56+02:00]

Yes, I figured that out, tnx. Nice that I can actually do this in Nim :D

Mirror of forum.nim-lang.org