Hi,
I'm new to Nim and am working on a parser and interpreter for a pet language. I wonder how I should define the generic type that can represent all possible values my language supports. It should satisfy a few criteria:
Please let me know your thoughts or suggestions. Thank you very much!
import std/tables

type
  Gene =
    int or
    string or
    seq[Gene] or
    Table[string, Gene]

type
  GeneKind* = enum
    GeneInt
    GeneString
    GeneSeq
    GeneMap

  Gene2* {.acyclic.} = ref object
    case kind*: GeneKind
    of GeneInt:
      num*: int
    of GeneString:
      str*: string
    of GeneSeq:
      vec*: seq[Gene2]
    of GeneMap:
      map*: Table[string, Gene2]
Gene is unboxed. Any int behaves like a Gene. An array of Gene is tightly packed. Any proc taking a Gene is instantiated at compile time for each concrete type it is called with. The seq[Gene] or the table must contain one concrete type. There is zero runtime overhead. You have zero type safety.
Gene2 is boxed. Any x of type int has to be wrapped as Gene2(kind: GeneInt, num: x). An array of Gene2(kind: GeneInt, ...) is much larger than an array of int. Any proc taking a Gene2 needs to check the kind at runtime. The seq or the table can contain mixed types: int, string, seq, or table, each wrapped in a Gene2. All of that adds runtime overhead. You have type safety.
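To make the compile-time instantiation concrete, here is a minimal sketch. It uses a non-recursive type class so that it stands alone; the names Scalar and describe are hypothetical and not from the original post.

```nim
# Scalar and describe are hypothetical names used only for illustration.
type Scalar = int | string   # a type class: a compile-time constraint

proc describe(x: Scalar): string =
  # `when` is resolved at compile time, once per concrete instantiation.
  when x is int:
    result = "int: " & $x
  else:
    result = "string: " & x

echo describe(42)      # instantiated for int
echo describe("abc")   # instantiated for string

# A seq still has to settle on one concrete element type:
let xs = @[1, 2, 3]        # seq[int]
echo xs.len
# let ys = @[1, "two"]     # does not compile: mixed element types
```

Each call site picks its own instantiation, so no kind tag is stored or checked at runtime.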
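A small sketch of what the boxed variant looks like in use. The type is the Gene2 from the question (without export markers); the show proc is a hypothetical helper added only for illustration.

```nim
import std/tables

type
  GeneKind = enum
    GeneInt, GeneString, GeneSeq, GeneMap

  Gene2 {.acyclic.} = ref object
    case kind: GeneKind
    of GeneInt:
      num: int
    of GeneString:
      str: string
    of GeneSeq:
      vec: seq[Gene2]
    of GeneMap:
      map: Table[string, Gene2]

# Hypothetical helper: renders a value by inspecting its kind at runtime.
proc show(g: Gene2): string =
  case g.kind
  of GeneInt:
    result = $g.num
  of GeneString:
    result = "\"" & g.str & "\""
  of GeneSeq:
    result = "["
    for i, item in g.vec:
      if i > 0: result.add ", "
      result.add show(item)
    result.add "]"
  of GeneMap:
    result = "{map with " & $g.map.len & " entries}"

# A heterogeneous sequence: every element is wrapped in a Gene2.
let mixed = Gene2(kind: GeneSeq, vec: @[
  Gene2(kind: GeneInt, num: 1),
  Gene2(kind: GeneString, str: "two"),
  Gene2(kind: GeneSeq, vec: @[Gene2(kind: GeneInt, num: 3)])
])
echo show(mixed)
```

The case statement is where the runtime cost shows up: every proc that touches a Gene2 has to branch on the tag first.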
There is also one middle ground. You get type safety, without heterogeneity.
type
  GeneInt = distinct int
  GeneString = distinct string
  GeneSeq = seq[Gene3]
  GeneMap = Table[string, Gene3]
  Gene3 = GeneInt | GeneString | GeneSeq | GeneMap
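Here is a rough sketch of how the distinct-type approach plays out, restricted to the two scalar cases so that it compiles on its own. The procs describe and report are hypothetical names, and which operations to borrow is an assumption about what the language needs.

```nim
type
  GeneInt = distinct int
  GeneString = distinct string
  Gene3 = GeneInt | GeneString   # scalar cases only in this sketch

# Borrow only the operations that make sense for the distinct type.
proc `+`(a, b: GeneInt): GeneInt {.borrow.}

# One concrete proc per type, written first.
proc describe(x: GeneInt): string = "int " & $(int(x))
proc describe(x: GeneString): string = "string " & string(x)

# The uniform interface comes later and just calls the concrete procs.
proc report(x: Gene3): string =
  "<" & describe(x) & ">"

let n = GeneInt(2) + GeneInt(3)
echo report(n)
echo report(GeneString("hi"))
# discard GeneInt(1) + 1   # does not compile: a plain int is not a GeneInt
```

The commented-out line is the type safety in question: a plain int no longer passes silently as a GeneInt.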
I have the feeling, possibly biased, that you don't really understand what you are asking for. It would be really useful to start with concrete types and write all the procs for each type separately. You can design the generic interface later by calling the concrete procs.
TLDR: If you only want a uniform interface, but not type safety, go with Gene. If, in addition to a uniform interface, you want type safety, go with Gene3. If you really want heterogeneous arrays and don't care about the runtime overhead, go with Gene2.
Whilst the syntax of type classes appears to resemble that of ADTs/algebraic data types in ML-like languages, it should be understood that type classes are static constraints to be enforced at type instantiations. Type classes are not really types in themselves, but are instead a system of providing generic "checks" that ultimately resolve to some singular type. Type classes do not allow for runtime type dynamism, unlike object variants or methods.
There you have it: object variants (Gene2) or methods. For the latter, define a base type derived from RootObj, and then define your various object types derived from that base. You then specify different behaviors for the different object types by defining methods (like a proc, but with the method keyword instead) that take the different types as arguments. This is the standard object-oriented approach.
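For comparison, a minimal sketch of the method-based approach. All type and method names here (Value, IntValue, StrValue, show) are hypothetical and not from the original question.

```nim
type
  Value = ref object of RootObj      # hypothetical base type
  IntValue = ref object of Value
    num: int
  StrValue = ref object of Value
    str: string

# The base method is the fallback for subtypes without an override.
method show(v: Value): string {.base.} =
  result = "<unknown value>"

method show(v: IntValue): string =
  result = $v.num

method show(v: StrValue): string =
  result = v.str

# A seq of the base type can hold any subtype.
var values: seq[Value]
values.add IntValue(num: 1)
values.add StrValue(str: "two")

for v in values:
  echo show(v)   # dispatched at runtime on the dynamic type of v
```

Compared with the object variant, new value types can be added without touching existing procs, at the cost of dynamic dispatch on every call.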