I have to walk through a tree toward the root, looking for identifiers (with a specific scope). There are different types of nodes, but only one (or two) types contain the required identifiers (packed in seq types). I could use methods, but I want to avoid them (performance). I can't use the inferred type (via a declaration in the function head), so I have to avoid getTypeInfo in the searching proc. Instead, I get the information from the stored object with a cast (revealing the type). The type info is compared with the type info of an "example object". If they match, the search continues in the dynamic array (seq). If not, the loop iterates further toward the root.
It seems to work... But I wonder: might it be done better (and perhaps more efficiently) with special Nim functions for RTTI? I could not find any appropriate functions in marshal.nim.
type
  vkpair = tuple[nm, nsym: int]
  rgnode = ref gnode
  pgnode = ptr gnode
  gnode = object of RootObj
    top: pgnode
  rlnode = ref lnode
  plnode = ptr lnode
  lnode = object of gnode
    cn: int
    llist: seq[vkpair]
  tgetret = tuple[cn: int, index: int]
  tcast = tuple[a: int]

let xlnode = lnode(cn: 0)
let reft = cast[int](xlnode.getTypeInfo)

proc getnm(vp: pgnode, nm: int): tgetret =
  var p = vp
  while p != nil:
    let tref = cast[tcast](p[])
    if tref.a != reft:
      p = p.top
      continue
    let q = cast[plnode](p)
    if q.llist.len == 0:
      p = p.top
      continue
    for i in 0 ..< q.llist.len:
      if q.llist[i].nm == nm:
        return (cn: q.cn, index: i + 1)
    p = p.top
  return (cn: 0, index: 0)

proc gbind(ptop, p: rgnode) =
  p.top = cast[pgnode](ptop)

var x1 = rlnode(cn: 7)
var x2 = rlnode(cn: 9)
var x3 = rgnode()
var x4 = rlnode(cn: 23)
x1.llist = newSeq[vkpair](8)
x2.llist = newSeq[vkpair](3)
gbind(x1, x2); gbind(x2, x3); gbind(x3, x4)
x1.llist[4] = (99, 0)
x1.llist[3] = (42, 0)
x2.llist[1] = (99, 0)
echo getnm(cast[pgnode](x4), 99)
echo getnm(cast[pgnode](x4), 42)
Any help or further information welcome.
This is, I think, a case where composition is to be favored over inheritance. Instead of having all those different node types, why not have a single node type with variant branches, and use case statements to decide what to do?
type
  NodeKind = enum
    nkG, nkL
  VkPair = tuple[nm, nsym: int]
  Node = object
    case kind: NodeKind
    of nkG:
      top: ref Node
    of nkL:
      cn: int
      llist: seq[VkPair]
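For what it's worth, here is one possible sketch of the lookup against such a variant type. Note that I've moved `top` out of the nkG branch, since in the original code every node (including the lnode-style ones) carries a parent link; all names here are illustrative:

```nim
type
  NodeKind = enum nkG, nkL
  VkPair = tuple[nm, nsym: int]
  Node = object
    top: ref Node            # parent link, common to all kinds
    case kind: NodeKind
    of nkG: discard
    of nkL:
      cn: int
      llist: seq[VkPair]

proc getnm(start: ref Node, nm: int): tuple[cn, index: int] =
  var p = start
  while p != nil:
    if p.kind == nkL:          # only nkL nodes carry identifiers
      for i in 0 ..< p.llist.len:
        if p.llist[i].nm == nm:
          return (p.cn, i + 1)
    p = p.top                  # keep walking toward the root
  (0, 0)
```

No casts and no type-info comparison needed; the discriminator check replaces both.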
I'll type more in the morning, when I'm not hampered by a phone.
The tree represents the data structure of a type-inference machine (polymorphic, but not Hindley-Milner). I need a small memory footprint. The variant-record model (introduced by Pascal) is nice, but it allocates the maximum required record size on the heap.
So my first question was: how to use the Nim typeinfo instead of this little hack for the concrete type.
The hack was:

  let reft = cast[int](xlnode.getTypeInfo)  # we get the type of lnode
  let tref = cast[tcast](p[])               # we get the type of p[] via the "hack"
  # matching against the type of the given node
  if tref.a != reft:
Ugly, but I am new to Nim and didn't know better.
What I am really looking for is a variant object that combines a fixed-size head (some pointers) with an array whose size is fixed (so no reallocation is necessary) but not known at compile time. Instead, the size of the array is fixed at creation time on the heap, defined by mem_obj - mem_head: mem_obj is available at any time (it is unavoidably needed for allocation and deallocation), and mem_head is constant and known at compile time.
How would one implement such a variant object in Nim? Since the GC can handle seqs, I think it could also handle such vobjects.
I'm not sure exactly what you're going for, but have you tried using the of operator? Example:
type
  Pet = ref object of RootObj
    name: string

type
  Cat = ref object of Pet
    lives: int
  Dog = ref object of Pet
    owner: string

var pets: seq[Pet] = @[
  Cat(name: "Mitten", lives: 9),
  Dog(name: "Sparky", owner: "Joe")
]

for pet in pets:
  if pet of Cat:
    let cat = Cat(pet)
    echo "Cat: ", cat.name, ", ", cat.lives
  elif pet of Dog:
    let dog = Dog(pet)
    echo "Dog: ", dog.name, ", ", dog.owner
if not (p of plnode):
  # ...
let q = plnode(p)
# ...
That did it, thank you. However, the question about the "vobject" things, i.e. objects with variable size, like

  vobj = object
    fx: fixdata
    va: array[0 .. (*), fixdata2]

remains.
Nice. Where is unsafeNew implemented? :D
I could only find:

  # 'genNew' also handles 'unsafeNew':

So it is hidden elsewhere in the compiler...
EDIT: And unsafeNew works too. VERY nice. Now, I have to find the "real" size in the data structure of the ref pointer....
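For the record, the pattern looks roughly like this. This is a sketch: the type and proc names are made up, and I'm assuming a modern Nim with UncheckedArray (older versions used a trailing `array[0..0, T]` field with the `{.unchecked.}` pragma instead). `unsafeNew` itself is declared in system.nim as a compiler magic, which is why only that `genNew` comment shows up in the compiler sources:

```nim
type
  FixData = int                   # illustrative element type
  VObj = object
    n: int                        # fixed-size head: remember the length
    va: UncheckedArray[FixData]   # trailing part, size fixed at creation

proc newVObj(n: int): ref VObj =
  # allocate the head plus n trailing elements in one heap block
  unsafeNew(result, sizeof(VObj) + n * sizeof(FixData))
  result.n = n

let v = newVObj(5)
for i in 0 ..< v.n:
  v.va[i] = FixData(i * i)
```

The GC sees one block of the requested size, so allocation and deallocation work as usual; bounds checking on `va` is of course up to you.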
Okay, there are basically two ways to do polymorphism with late binding in Nim. You can either use object variants or you can use inheritance. Both have their pros and cons and there are implementation details that affect either that you should be aware of.
As you correctly note, object variants may waste memory. This is because the discriminator field is mutable and (at least hypothetically) the different branches can be switched between at any time, so you need to reserve the maximum amount of memory needed. This could in principle be avoided by having a pragma that makes the discriminator immutable.
However, the memory wasted is generally very little and nothing to worry about. If you do have a type where the different branches have vastly different requirements, a solution that does not waste memory (assuming the underlying structure is a tree, not a DAG) is a value type with ref fields (or single word fields where that's appropriate), e.g.
type
  Node = object
    case kind: SomeEnum
    of kind1: value1: int
    of kind2: value2: ref array[0..1023, int]
    of kind3: value3: seq[Node]
    of kind4: value4: ref SomeOtherType
    of kind5: value5: string
    of kind6: nil
This type will always fit into two words of memory (well, on all architectures that I know of).
This can even be more efficient, as it sometimes avoids memory allocations (at the cost of the CPU having to shuffle two words around instead of one, though you can use the {.byRef.} pragma to minimize that if it turns out to be a problem for a given architecture).
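One way to check the two-word claim on a concrete target is simply to ask sizeof; a minimal sketch (the enum and branch names are illustrative, and the exact figure depends on the discriminator's alignment):

```nim
type
  SomeEnum = enum kind1, kind2
  Node = object
    case kind: SomeEnum
    of kind1: value1: int
    of kind2: value2: ref array[0..1023, int]

# On a typical 64-bit target this prints 16: one word for the
# discriminator (plus padding) and one word for the largest branch.
echo sizeof(Node)
```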
Alternatively, you can use inheritance. This allows you to use either methods or the of operator for dispatch. I wouldn't personally be too concerned about the performance of method dispatch, but you may want to benchmark it yourself if you are unsure (I doubt it's going to be critical except for the most performance-sensitive work).
Unlike, say, C++, Nim does not (currently) use vtables to implement method dispatch. Rather, it does something like (the following is pseudocode):
if obj.type == subtype1: f_for_subtype1(obj)
elif obj.type == subtype2: f_for_subtype2(obj)
...
Depending on your processor and the number of subtypes, this can be either slower or faster than vtables (it's practically never hugely slower or faster). It basically depends on how good the processor is at predicting computed gotos; the most recent Intel processors have become pretty good at doing that, but even on some other fairly recent CPUs, the branching approach can be faster.
Also, if you know what types are most common, you can use of to optimize dispatch further.
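As a sketch of that last point (the types and the constant here are made up for illustration): if one subtype is known to dominate at runtime, an explicit of test with a direct call can short-circuit the generic dispatch:

```nim
type
  Shape = ref object of RootObj
  Circle = ref object of Shape
    r: float
  Square = ref object of Shape
    s: float

method area(x: Shape): float {.base.} = 0.0
method area(c: Circle): float = 3.14159 * c.r * c.r
method area(q: Square): float = q.s * q.s

proc areaFast(x: Shape): float =
  # direct, statically bound (and inlinable) call for the common case
  if x of Circle: 3.14159 * Circle(x).r * Circle(x).r
  else: area(x)   # fall back to method dispatch for the rest
```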
Note that the underlying reason for not using vtables is that Nim supports multiple dispatch; vtables (at least in their basic form) are insufficient for implementing multiple dispatch. This could be optimized, however, by implementing something like Scala's sealed classes (which cannot be inherited from outside the current module). Sealed classes allow for exhaustiveness checks of types and more efficient code generation.
I recommend against using type information explicitly. It will just be a slower way of doing what method dispatch and the of operator already do.
Thank You for your explanations, Jehan.
The vtables (introduced by Simula, if I remember correctly), overly used by C++, were a mistake. It took a long time to recognize that.
Sixte: The vtables (introduced by Simula, if I remember correctly), overly used by C++, were a mistake.
That's not what I meant to say. Vtables are an implementation detail, not a language feature. They are not always the best choice of implementation to implement dynamic dispatch, but they're definitely not a mistake. On my current laptop, for example, they'd outperform Nim's current dispatch mechanism.
This was not my point. I think it depends on the number of comparisons that have to be performed: 2 or 3 is nice, but 10 or more will slow down the dispatch critically. But there are other advantages of dynamic dispatch via RTTI:
In a language that may become a serious competitor to Nim in the distant future (presuming some major overhauls are performed), you can write:
use std::f64;

struct Point { x: f64, y: f64 }

enum Shape {
    Circle { center: Point, radius: f64 },
    Rectangle { top_left: Point, bottom_right: Point }
}

fn area(sh: Shape) -> f64 {
    match sh {
        Shape::Circle { radius, .. } => f64::consts::PI * radius * radius,
        Shape::Rectangle { top_left, bottom_right } =>
            (bottom_right.x - top_left.x) * (top_left.y - bottom_right.y)
    }
}
(a cumbersomely verbose language..., it's so sad (and it becomes even sadder when we add lifetime variables)...)
But anyway: we match different object (sub)types against the incoming supertype, and then we can use specific procs with different type signatures. This allows inlining too, and one could extend it further to types/objects with type parameters. (Furthermore, the types of the locally introduced identifiers in the match arms are inferred automatically, not bad at all!)
In a vtable, all functions with the same name need to have the same type signature. They hide the concrete abstract type perfectly. We can instantiate objects with type parameters, but we can't use the type parameters in the (vtable) interface. Unfortunately, I don't know of a simple solution for that...