nimforum mirror - Help with generics and typeclass problem

void09 [2023-05-17T06:06:14+02:00]

type
  Units* = SomeUnsignedInt
  BitVecS*[S:static int, B:Units = uint] = object
    Base*: array[S div (sizeof(B) * 8) + int(S mod (sizeof(B) * 8) != 0), B] #8,16,32,64
  BitVec*[B:Units] = object
    Base: seq[B]
    size: int
  AnyBitVec[S:static int, B:Units] = (BitVecS[S, B] | BitVec[B])

proc `[]`*[S,B](bv:AnyBitVec[S, B], i:int): int =
  return (bv.Base[i div (B.sizeof * 8)] shr (i and (B.sizeof * 8 - 1)) and 1).int

func newBitVector*[B](size: int, init = 0): BitVec[B] {.inline.} =
  var blocks = size div (sizeof(B) * 8) + int(size mod (sizeof(B) * 8) != 0)
  result.Base = newSeqOfCap[B](blocks)
  result.Base.setlen(blocks)
  result.size = size

var bv1: BitvecS[20240001]

var y = newBitVector[uint32](1000)
discard y[0] #this fails

I am trying to write a bitvector lib that has a common implementation for static (array) and dynamic (seq) vectors, with a user-definable base unit of storage. For the static one I need S (static size) and B (base unit), while for the dynamic one I only need B. I tried it as in the code above, but when calling any proc on the dynamic one (BitVec), I get "Error: cannot instantiate: 'S'". I thought this would work since BitVec doesn't have or need S, but no.

I got around it by adding the S parameter to the BitVec type and passing an arbitrary number as the generic parameter when calling procs on it. While this works, it's obviously not pretty, and it makes me believe there has to be a way to do this cleanly. I was told to use concepts, but I couldn't make that work either, and I really don't understand how they work even after reading the docs. I'd rather not use them unless absolutely necessary.

Also, on a side note, the default generic parameter type I have on BitVecS (B:Units = uint) seems to actually work, although I was told it shouldn't. Any idea why it works (in this context)?

demotomohiro [2023-05-17T10:49:55+02:00]

This code compiles:

type
  Units* = SomeUnsignedInt
  BitVecS*[S:static int, B:Units = uint] = object
    Base*: array[S div (sizeof(B) * 8) + int(S mod (sizeof(B) * 8) != 0), B] #8,16,32,64
  BitVec*[B:Units] = object
    Base: seq[B]
    size: int
  AnyBitVec = BitVecS or BitVec

proc `[]`*(bv:AnyBitVec, i:int): int =
  type B = bv.B
  return (bv.Base[i div (B.sizeof * 8)] shr (i and (B.sizeof * 8 - 1)) and 1).int

func newBitVector*[B](size: int, init = 0): BitVec[B] {.inline.} =
  var blocks = size div (sizeof(B) * 8) + int(size mod (sizeof(B) * 8) != 0)
  result.Base = newSeqOfCap[B](blocks)
  result.Base.setlen(blocks)
  result.size = size

var y = newBitVector[uint32](1000)
echo y[0]

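This works because AnyBitVec no longer takes generic parameters of its own: BitVecS and BitVec are used as type classes, and B is read from the matched type via bv.B. From the manual: "Furthermore, every generic type automatically creates a type class of the same name that will match any instantiation of the generic type."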

https://nim-lang.org/docs/manual.html#generics-type-classes

mratsim [2023-05-17T12:29:45+02:00]

Your implementation procs should take openArray[byte], and the public procs should pass the Base field to them.

void09 [2023-05-17T12:48:49+02:00]

mratsim, that might be true in general, but I want to have a user-pickable uint base type for the array/seq, so openArray[byte] won't do. Also I want to have max performance, and one extra proc call is not ideal.

mratsim [2023-05-17T20:29:48+02:00]

Use openArray[T] then.

> Also I want to have max performance, and one extra proc call is not ideal.

If your public proc is tagged inline there is no extra cost.

The tradeoffs are:

if you use openArray, both static/dynamic will use the same code

if you don't, you generate twice as much code

if you don't use openArray, the compiler has an easier time for loop unrolling

if you do, it will only be able to unroll loops if you tag everything inline

Furthermore all the procedures for bit vectors are very small so tagging all inline makes sense.
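
The openArray layout described above could be sketched roughly as follows. This is a minimal illustration, not code from the thread: the helper name getBit is hypothetical, and BitVecS is simplified so that S counts storage words rather than bits.

```nim
type
  Units = SomeUnsignedInt
  BitVecS[S: static int, B: Units] = object
    Base: array[S, B]   # simplification (assumption): S counts words, not bits
  BitVec[B: Units] = object
    Base: seq[B]
    size: int

# Shared implementation: it only sees the storage words, so the same
# proc serves both the array-backed and the seq-backed vector.
func getBit[T: Units](words: openArray[T], i: int): int {.inline.} =
  const bits = sizeof(T) * 8
  int((words[i div bits] shr (i and (bits - 1))) and T(1))

# Thin public wrapper; {.inline.} is intended to remove the forwarding call.
func `[]`(bv: BitVecS or BitVec, i: int): int {.inline.} =
  getBit(bv.Base, i)

var fixed: BitVecS[2, uint32]
fixed.Base[1] = 0b100'u32                                   # bit 34 overall
var dyn = BitVec[uint8](Base: @[0'u8, 0b10'u8], size: 16)   # bit 9 overall
echo fixed[34], " ", dyn[9]
```

With this split, getBit is instantiated once per base type rather than once per container type and size, which is the code-size side of the tradeoff listed above.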

Lastly, an integer division or modulo operation takes about 55 cycles, while an addition, shift, or bitwise-and instruction takes 1 cycle at most. Since every divisor here is a power of 2, use "shr log2(n)" for division and "and (n-1)" for modulo.
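
A possible way to write the shift-and-mask indexing, assuming the base type's bit width is a power of two (true for uint8 through uint64). The names log2Exact, wordIndex and bitIndex are made up for this sketch.

```nim
func log2Exact(n: int): int =
  ## Exponent k with 2^k == n; n is assumed to be a power of two.
  var v = n
  while v > 1:
    v = v shr 1
    inc result

func wordIndex[T: SomeUnsignedInt](i: int): int {.inline.} =
  # The shift amount is computed at compile time for each base type T.
  const shift = log2Exact(sizeof(T) * 8)
  i shr shift                       # same as i div (sizeof(T) * 8) for i >= 0

func bitIndex[T: SomeUnsignedInt](i: int): int {.inline.} =
  const mask = sizeof(T) * 8 - 1
  i and mask                        # same as i mod (sizeof(T) * 8) for i >= 0

echo wordIndex[uint32](70), " ", bitIndex[uint32](70)
```

C compilers usually perform this strength reduction themselves when the divisor is a compile-time constant, so it may be worth measuring before and after.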

As mentioned on Discord, use ceil_division for sizing your bitvector:

# Integer division rounded up; assumes a >= 0 and b > 0.
proc ceilDiv(a, b: int): int =
  (a+b-1) div b

You can adapt it to the case where b is a power of 2.
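
One way the power-of-two adaptation might look, as a sketch: the name ceilDivPow2 is hypothetical, and the divisor is passed as its base-2 logarithm (an assumption made here to avoid computing it at run time).

```nim
func ceilDivPow2(a: int, log2b: int): int {.inline.} =
  ## Ceiling of a / 2^log2b, assuming a >= 0 and log2b >= 0.
  (a + (1 shl log2b) - 1) shr log2b

# Number of 32-bit words needed to hold 1000 bits (log2(32) == 5).
let blocks = ceilDivPow2(1000, 5)
echo blocks
```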

void09 [2023-05-19T23:32:41+02:00]

So that I don't have to open another topic: is it possible to use std/math ceilDiv inside a (generic) type definition, like in this playground snippet (https://play.nim-lang.org/#ix=4wdy)? It does not compile, even though const a = ceildiv(..) works, so ceildiv is fine at compile time. If I write my own ceildiv without generics and use that, it works.

mratsim [2023-05-20T10:17:34+02:00]

> this compiles but I get SIGSEGV on executing

It's strange; in my experience it's the Nim compiler itself that crashes with SIGSEGV.

Anyway, it's a bug and should be reported in the tracker.

void09 [2023-05-21T17:19:12+02:00]

You are right, mratsim, it's the Nim compiler that segfaults, which is obviously a bug, no doubt about it :) I will report it.

Speaking of bugs and types, I almost forgot this little code snippet from when I was trying to figure out concepts: https://play.nim-lang.org/#ix=4wl7

The code behaves correctly, but for some reason the [] proc is very slow. I get ~45 seconds of runtime vs. the ~4 ms it should normally take for 4 million operations, roughly 10,000x slower, while the []= proc is as fast as it's supposed to be.

I lack the skills to figure out the root cause of this. Maybe you or someone else with some time can check whether this is a Nim bug or whether there is something wrong with my code in this context. It looks like a bug to me. I'm using nim-devel.

mratsim [2023-05-21T18:44:37+02:00]

> Code behaves correctly but for some reason the [] proc is very slow. I get ~45 second runtime vs ~4ms what it normally should be, for 4 million operations. 10,000x slower. While the []= proc is as fast as it's supposed to be.

The code compiles to

typedef NU8 tyArray__3Ihb9ak9b9bUiqPPTJrVEaTqQ[500001];
struct tyObject_BitVecS__46rKYwOEk3TCNKA4N4J2aQ {
tyArray__3Ihb9ak9b9bUiqPPTJrVEaTqQ Base;
};

N_LIB_PRIVATE N_NIMCALL(void, X5BX5Deq___test95bitv_123)(tyObject_BitVecS__46rKYwOEk3TCNKA4N4J2aQ* bv, NI i, NI value) {
        NU8* w;
        w = (&(*bv).Base[((NI)(i / ((NI) 8)))- 0]);
        {
                if (!(value == ((NI) 0))) goto LA3_;
                (*w) = (NU8)((*w) & (NU8)((NU8) ~((NU8)((NU64)(((NU8) 1)) << (NU64)((NI)(i & ((NI) 7)))))));
        }
        goto LA1_;
        LA3_: ;
        {
                (*w) = (NU8)((*w) | (NU8)((NU64)(((NU8) 1)) << (NU64)((NI)(i & ((NI) 7)))));
        }
        LA1_: ;
}
N_LIB_PRIVATE N_NIMCALL(NI, X5BX5D___test95bitv_201)(tyObject_BitVecS__46rKYwOEk3TCNKA4N4J2aQ bv, NI i) {
        NI result;
        result = (NI)0;
        result = ((NI) ((NU8)((NU8)((NU8)(bv.Base[((NI)(i / ((NI) 8)))- 0]) >> (NU64)((NI)(i & ((NI) 7)))) & ((NU8) 1))));
        return result;
}

So tyObject_BitVecS__46rKYwOEk3TCNKA4N4J2aQ, a 500 kB object, is copied by value on every read access (X5BX5D___test95bitv_201), whereas it is passed by pointer for mutation in X5BX5Deq___test95bitv_123.

It's a codegen bug. However, I already raised a similar one here, https://github.com/nim-lang/Nim/issues/16897, which was supposed to be fixed.

Anyway, you can report the bug.
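
Until such a bug is fixed, one thing that might be worth trying is the byref pragma, which asks the compiler to pass a type by hidden pointer. The sketch below is an assumption, not something confirmed in this thread: the type name BigVec is made up, and whether the pragma avoids this particular copy has not been verified against the issue above.

```nim
type
  BigVec {.byref.} = object          # request pass-by-hidden-pointer for this type
    Base: array[500_001, uint8]      # same size as the array in the generated C

func `[]`(bv: BigVec, i: int): int {.inline.} =
  # Read bit i: pick the byte, shift the bit down, mask it.
  int((bv.Base[i shr 3] shr (i and 7)) and 1'u8)

proc `[]=`(bv: var BigVec, i: int, value: int) {.inline.} =
  # Set or clear bit i depending on value.
  let mask = 1'u8 shl (i and 7)
  if value == 0:
    bv.Base[i shr 3] = bv.Base[i shr 3] and not mask
  else:
    bv.Base[i shr 3] = bv.Base[i shr 3] or mask

var v: BigVec
v[12345] = 1
echo v[12345]
```

Inspecting the generated C for the [] proc, as done above, would show whether the parameter is now a pointer.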

Mirror of forum.nim-lang.org