nimforum mirror - Which is the preferred way to define a generic type?

gcao [2020-08-02T17:00:29+02:00]

Hi,

I'm new to Nim and am working on a parser and interpreter for a pet language. I wonder how I should define the generic type that can represent all possible values my language supports. It should satisfy a few criteria:

Low overhead when accessing its value

Possible to define different behavior for different type branches (int vs string etc)

Please let me know your thoughts or suggestions. Thank you very much!

import tables

type
  # Option 1: a type class (a compile-time union of types)
  Gene =
    int or
    string or
    seq[Gene] or
    Table[string, Gene]

  GeneKind* = enum
    GeneInt
    GeneString
    GeneSeq
    GeneMap

  # Option 2: an object variant, tagged by kind at run time
  Gene2* {.acyclic.} = ref object
    case kind*: GeneKind
    of GeneInt:
      num*: int
    of GeneString:
      str*: string
    of GeneSeq:
      vec*: seq[Gene2]
    of GeneMap:
      map*: Table[string, Gene2]

treeform [2020-08-03T01:30:07+02:00]

I would do Gene2.

jxy [2020-08-03T22:48:06+02:00]

Gene is unboxed. Any int behaves like Gene. An array of Gene is tightly packed. Any proc of Gene will instantiate at compile time to the concrete types of Gene. The seq[Gene] or the table must contain one concrete type. There is zero runtime overhead. You have zero type safety.
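To illustrate the compile-time instantiation described above, here is a minimal sketch. It assumes a non-recursive type class named IntOrString, which is made up for this example and is not the Gene from the question.

```nim
# IntOrString is a hypothetical, non-recursive type class used only for illustration.
type IntOrString = int | string

# One generic proc; the compiler emits a separate instantiation
# for each concrete type it is called with.
proc describe(x: IntOrString): string =
  when x is int:        # resolved at compile time, no run-time tag
    "int: " & $x
  else:
    "string: " & x

echo describe(42)
echo describe("abc")

# A seq still needs one concrete element type, so mixing does not compile.
doAssert not compiles(@[1, "two"])
```

The `when` branch is selected per instantiation, which is why there is no run-time check and also no way to hold an int and a string in the same container.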

Gene2 is boxed. Any x of type int will have to be wrapped Gene2(kind:GeneInt, num:x). An array of Gene2(kind:GeneInt,...) is much larger than an array of int. Any proc of Gene2 will need to check the kind at runtime. The seq or the table can contain mixed types, int, string, seq, or table wrapped under Gene2. That contributes runtime overhead. You have type safety.
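A short sketch of the boxed approach, repeating the Gene2 definition from the question so the block is self-contained. The describe proc and the sample values are assumptions added for illustration.

```nim
import tables

type
  GeneKind = enum
    GeneInt, GeneString, GeneSeq, GeneMap
  Gene2 {.acyclic.} = ref object
    case kind: GeneKind
    of GeneInt: num: int
    of GeneString: str: string
    of GeneSeq: vec: seq[Gene2]
    of GeneMap: map: Table[string, Gene2]

# describe is a hypothetical helper; the branch is chosen at run time from kind.
proc describe(v: Gene2): string =
  case v.kind
  of GeneInt: "int: " & $v.num
  of GeneString: "string: " & v.str
  of GeneSeq: "seq of " & $v.vec.len
  of GeneMap: "map of " & $v.map.len

# Once wrapped, ints and strings can share one seq.
let mixed = @[Gene2(kind: GeneInt, num: 1), Gene2(kind: GeneString, str: "two")]
let outer = Gene2(kind: GeneSeq, vec: mixed)

for v in mixed:
  echo describe(v)
echo describe(outer)
```

Every value carries its kind tag and lives behind a ref, which is the run-time cost mentioned above, but it is also what makes the heterogeneous seq possible.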

There is also one middle ground. You get type safety, without heterogeneity.

type
  GeneInt = distinct int
  GeneString = distinct string
  GeneSeq = seq[Gene3]
  GeneMap = Table[string, Gene3]
  Gene3 = GeneInt | GeneString | GeneSeq | GeneMap
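As a rough sketch of what the distinct types buy you, the following uses only the two scalar types and a non-recursive type class. GeneScalar, describe, and show are hypothetical names for this example.

```nim
type
  GeneInt = distinct int
  GeneString = distinct string
  GeneScalar = GeneInt | GeneString   # hypothetical, non-recursive type class

# Concrete procs first, one per type.
proc describe(x: GeneInt): string = "int: " & $int(x)
proc describe(x: GeneString): string = "string: " & string(x)

# The generic interface just calls the concrete procs.
proc show(x: GeneScalar): string =
  "<" & describe(x) & ">"

echo show(GeneInt(42))
echo show(GeneString("abc"))

# A plain int is not a GeneInt, so it is rejected at compile time.
doAssert not compiles(show(42))
```

The explicit conversions GeneInt(42) and int(x) are the price of the extra type safety; everything is still resolved at compile time.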

I have the feeling, possibly biased, that you don't really understand what you are asking for. It will be really useful if you start with concrete types, and write all the procs for each type separately. You can design the generic interface later by calling the concrete procs.

TLDR: If you only want a uniform interface, but not type safety, go with Gene. If, in addition to a uniform interface, you want type safety, go with Gene3. If you really want heterogeneous arrays and don't care about the runtime overhead, go with Gene2.

jibal [2020-08-04T05:48:51+02:00]

Gene will not work, nor will Gene3 above ... you cannot build a runtime dynamic type out of generics, which are a compile-time feature. I advise reading and understanding what the manual says about type classes:

Whilst the syntax of type classes appears to resemble that of ADTs/algebraic data types in ML-like languages, it should be understood that type classes are static constraints to be enforced at type instantiations. Type classes are not really types in themselves, but are instead a system of providing generic "checks" that ultimately resolve to some singular type. Type classes do not allow for runtime type dynamism, unlike object variants or methods.

There you have it: object variants (Gene2) or methods ... i.e., define a base class derived from RootObj, and then define your various object types derived from the base; you then specify different behaviors for the different object types by defining methods (like proc but with the method keyword instead) taking the different types as arguments ... this is the standard object-oriented approach.
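A minimal sketch of the method-based alternative, assuming a made-up hierarchy (Value, IntValue, StrValue) rather than anything from the original question:

```nim
type
  Value = ref object of RootObj      # hypothetical base type
  IntValue = ref object of Value
    num: int
  StrValue = ref object of Value
    str: string

# The base method is the fallback; subtypes override it.
method describe(v: Value): string {.base.} =
  "unknown"

method describe(v: IntValue): string =
  "int: " & $v.num

method describe(v: StrValue): string =
  "string: " & v.str

# A seq of the base type can hold any subtype.
var values: seq[Value] = @[]
values.add IntValue(num: 1)
values.add StrValue(str: "two")

# Dispatch happens at run time on the dynamic type of v.
for v in values:
  echo describe(v)
```

Compared with an object variant, adding a new kind of value means adding a new subtype and its methods instead of extending an enum and every case statement.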

xigoi [2020-08-04T21:29:20+02:00]

I think you want a sum type (not a generic type), and Gene2 is the idiomatic way to make sum types in Nim.

gcao [2020-08-05T15:39:47+02:00]

Thank you all for the recommendations and the explanations. As @jxy said, I don't know a lot about the different options I put in the original question. I'll go back to read the documentation and your posts and see which way to go. I already wrote a lot of code using option 2, but if it's not the right approach, I'm open to starting from scratch. Suppose you were the one implementing a dynamic general-purpose language in Nim and needed a way to represent all possible values (literal values, arrays, maps, native values, etc.): what would you pick?

gcao [2020-08-05T16:19:32+02:00]

@jibal, if I create a class hierarchy and use a child class to represent int values etc., will the performance be comparable to Gene2?

jibal [2020-08-06T22:21:25+02:00]

I suspect it would be close, since the Nim compiler works hard to do things efficiently, but you would have to benchmark it. However, @treeform, @xigoi, and I have all recommended Gene2 (object variants), so maybe start with that and see if it works for you. :-)

shirleyquirk [2020-08-07T01:32:43+02:00]

FWIW, both Nim itself and the Nim make-a-lisp implementation use Gene2.
