While Nim has in general a pleasant syntax, one obvious area where it could improve is the tagged enum syntax. Right now it is somewhat complicated and goes against Nim's ethos of concise syntax.
This post explains it quite well.
https://forum.nim-lang.org/t/904#5441
As you can see the current syntax is a bit too close "to the metal" and a more abstract syntax would be nice. Rust popularized this, Swift made it more usable and tagged enums are not going away.
An example would be:
variant Foo:
  Alpha(i: int)
  Beta(f: float)
  Gamma(s: string)

let x = Foo.Beta(f: 1.0) # or just Beta if no collisions

case x
of Alpha(i: i):
  echo(i)
of Beta(f: f):
  echo(f)
of Gamma(s: s):
  echo(s)
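For comparison, here is a rough sketch of how the same Foo has to be written with today's case objects. The names FooKind, kind and describe are my own choices for illustration, not anything from the proposal:

```nim
type
  # The tag enum has to be declared separately from the object.
  FooKind = enum
    Alpha, Beta, Gamma

  Foo = object
    case kind: FooKind
    of Alpha:
      i: int
    of Beta:
      f: float
    of Gamma:
      s: string

# Branching happens on the discriminator; fields are accessed by name.
func describe(x: Foo): string =
  case x.kind
  of Alpha: "Alpha " & $x.i
  of Beta: "Beta " & $x.f
  of Gamma: "Gamma " & x.s

# Construction names both the object type and the discriminator.
let x = Foo(kind: Beta, f: 1.0)
echo describe(x)
```

The enum, the discriminator field and the explicit kind: argument are the parts the proposed sugar would hide.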
Another possible improvement is adding lowering for the optional type, just like in Swift.
var x: string?
becomes
var x: Option[string]
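For reference, a small sketch of what the lowered form looks like in use with std/options as it exists today. The variable names are just for illustration:

```nim
import std/options

var x: Option[string]          # default value is none(string)
doAssert x.isNone

x = some("hello")
if x.isSome:
  echo x.get                   # unwrap after checking

# map transforms the wrapped value and keeps none as none.
let length = x.map(proc (s: string): int = s.len)
doAssert length == some(5)

# get with a fallback avoids raising when the value is missing.
echo none(string).get("fallback")
```

The string? spelling would only change the declaration; everything after it would stay as above.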
Optional chaining similar to Swift and C#:
https://docs.swift.org/swift-book/LanguageGuide/OptionalChaining.html
In Swift, optional chaining is also just lowering.
Any plans for this?
Before someone gives the stock answer of "you can just use macros": I agree that this stuff should be easier to write. And there is no popular or standard macro library that lets you do this, despite not being very hard to implement. However the existing way of doing it in the language should remain intact and distinct from the sugar. For example case, which allows constant expressions in its of branches, should not intersect with whatever is done here, where a "pattern" expression is required.
Beyond that, there is a lot of discussion on this. Unfortunately I don't have a compilation of it, but you can look for it at https://github.com/nim-lang/RFCs/issues?q=is%3Aissue+is%3Aopen+pattern+matching or https://github.com/nim-lang/RFCs/issues?q=is%3Aissue+is%3Aopen+variant
Sidenote: Even if object variants are too "low level" for people, there's nothing wrong with the syntax IMO, and fits the idea pretty well. Type sections in general though do have a dissonant syntax with the rest of the language, like how object needs to be indented but doesn't allow a colon, or doesn't allow semicolons between fields like named tuple types and proc arguments do, or how enums allow mixing between whitespace and commas to separate enum fields (which is nice but not pretty).
I don't mind the syntax of case objects and it works better than Rust's and Swift's solutions, all things considered.
A macro could be added to sugar.nim for the people who disagree and want more sugar.
I'm not a fan of a macro for this because my code simply contains too few object cases for it to matter. Syntax shortcuts should exist for common things.
There is std / wrapnils for chaining nullables etc and it works better than Swift's solution IMHO.
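A minimal sketch of nil-safe chaining with std/wrapnils, assuming a recent Nim release where the module and its `?.` operator are available (the module is marked experimental). The Person type is made up for the example:

```nim
import std/wrapnils

type
  Person = ref object
    name: string
    boss: Person

let alice = Person(name: "Alice")            # alice.boss stays nil
let bob = Person(name: "Bob", boss: alice)

# No nil on the way: behaves like bob.boss.name.
doAssert ?.bob.boss.name == "Alice"

# alice.boss is nil, so the chain yields the default value of string
# instead of raising.
doAssert ?.alice.boss.name == ""
```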
I'm generally satisfied with the patty macro. It's in the first nimble query you gave and provides both the declaration and a match syntax:
https://github.com/andreaferretti/patty#constructing-variant-objects
variant Shape:
  Circle(r: float)
  Rectangle(w: float, h: float)
  UnitCircle

let coord = match c:
  Circle(x: x, y: y, r: r):
    x
  Rectangle(w: w, h: h):
    h
I'd prefer something that can integrate with the existing syntax instead of requiring a standalone block.
type
  Foo {.variant.} = object
    case kind
    of Alpha:
      i: int
    of Beta:
      f: float
    of Gamma:
      s: string
I don't mind the syntax of case objects and it works better than Rust's and Swift's solutions, all things considered.
There is a significant downside in Nim that comes up frequently when working with case objects: you cannot initialize a case object with the case data only, you need to instantiate the "shell" type:
type X = object
  case x: enum
  of valueA: a: int
  of valueB: b: int

let x = valueB(b: 42) # doesn't work - needs `X(x: valueB, b: 42)`
In the above, there's no way to create an X by referring only to the enum value and the "members" it has. You have to involve X, which prevents things like myVariant == valueA(a: 42). This is a significant problem when X is generic, for example Result[T, E].
Consider:
func f(): Result[int, string] =
  return Result[int, string](isOk: true, value: 42)
In the above example, it's uninteresting when returning an "ok" value what the "error" branch is - ditto comparisons and other frequently hit use cases of variant objects (the same applies to Optional in std).
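To make the complaint concrete, here is a sketch of the usual workaround in current Nim: a hand-written helper that takes the result type as a typedesc argument. The Result definition, the ok helper and the IntResult alias below are assumptions for illustration, not a standard library API:

```nim
type
  # Hypothetical Result type written as a plain case object.
  Result[T, E] = object
    case isOk: bool
    of true:
      value: T
    of false:
      error: E

# Hand-written constructor: the caller still has to name the full
# result type (or an alias of it), even though E is irrelevant here.
func ok[T, E](R: typedesc[Result[T, E]], v: T): Result[T, E] =
  Result[T, E](isOk: true, value: v)

type IntResult = Result[int, string]

func f(): IntResult =
  IntResult.ok(42)

echo f().value
```

This shortens the call site a little, but the "shell" type still has to be spelled out somewhere, which is exactly the point being made.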
This isn't "solveable" with case variants objects simply because they are overly loose: they allow members "outside" of the case, or indeed multiple case sections whereas a "pure" enum object has only one "selection point" and therefore can afford a more pleasant experience when using it.
my code simply contains too few object cases for it to matter.
This is a signal: the current case objects are not that useful due to their inherent limitations - that's why you don't see them used very often. A tagged enum like proposed above would likely see a lot more use.
you cannot initialize a case object with the case data only, you need to instantiate the "shell" type
On the other hand the advantage of Nim case objects is that tags are first class. So you can pass only a tag to a function or change a tag if you remain in the same branch.
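A small sketch of what "first class tags" means in practice, using made-up names (OpKind, Expr, symbol). The fields are kept as plain ints to keep the example simple:

```nim
type
  OpKind = enum
    opAdd, opSub, opLit

  Expr = object
    case kind: OpKind
    of opAdd, opSub:      # two tags share one branch
      lhs, rhs: int
    of opLit:
      value: int

# The tag is an ordinary enum value and can be passed around on its own.
func symbol(k: OpKind): string =
  case k
  of opAdd: "+"
  of opSub: "-"
  of opLit: "lit"

var e = Expr(kind: opAdd, lhs: 1, rhs: 2)
echo symbol(e.kind)

# Allowed: opAdd and opSub belong to the same branch, so the active
# branch does not change.
e.kind = opSub
echo symbol(e.kind)
```

Assigning opLit to e.kind instead would change the active branch, which Nim rejects.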
A tagged enum like proposed above would likely see a lot more use.
Why is that a good thing? Tag usually means you need branching => so it will be slow if you have lots of them.
This is a signal: the current case objects are not that useful due to their inherent limitations - that's why you don't see them used very often.
No, it's because a type section is used 1390 times in the stdlib whereas proc is used 10741 times.
This isn't "solveable" with case variants objects simply because they are overly loose
Well, this is solvable: if you assign a field that can only occur in a single delimited branch, then the compiler can infer the discriminator value you want to supply. Otherwise it could report an error listing the possible values. This is also not commentary on the actual types but on how they're constructed, which I would argue discredits your point.
that's why you don't see them used very often
That seems like a purposely biased sentiment, that has no evidence. Almost every complex library will use them at least once, sometimes even more!
I do think object variants have an ergonomics issue, but it's mainly on the declaration. Manually creating an enum per branch and not just emitting an Enum like @xigoi has demonstrated is the main crux in my view (Ostensibly a NodeKind should be declared with a Node type...).
therefore limit the ability to reason about them in generic code, macros, etc. This is where the lack of ergonomics comes from
I still think this is putting the cart before the horse. It's not hard to imagine a world where what you want works with Nim object variants. With a change to the compiler, all of the following could be valid:
type X = object
  case x: enum
  of valueA: a: int
  of valueB: b: int

match X()
of X(@a):
  echo a
of X(@b):
  echo b

let x = X(b: 42)

type MyResult = Result[int, string]
func f(): MyResult = MyResult(value: 42)

type MyComplexType = object
  a, b: string
  case c: bool
  of true:
    d, e: int
  of false:
    case otherField: 0..3 # Too lazy for an enum here
    of 0, 1:
      f, g: float
    of 2:
      h: string
    else:
      discard

match MyComplexType(h: "hello") # Hey the compiler can reason this!
of MyComplexType(@h):
  echo h
of it = MyComplexType(@f, myField = @g):
  echo it, " ", f, " ", myField, " ", it.otherField
else: discard

var a = MyComplexType(f: 0) # Error 'otherField' can be `0` or `1`
Note that constraining objects to a single case and no "extra" fields (like tagged enums do) leads to no loss of generality in what you can express
It does lead to a loss of efficiency though as in many important cases the discriminator is a single byte that can be combined into a word with some "flags" field. But that is not possible when the object is deconstructed into a tuple of sum types.
It does lead to a loss of efficiency though as in many important cases the discriminator is a single byte that can be combined into a word with some "flags" field.
How do you mean? Because of alignment, or by doing magic optimizations? The latter would have ABI implications, and there are two cases:
I have a preference for the former, in general - it would be nice if objects could be tagged "abi: c" in which case they follow the (fairly) well-established C ABI, otherwise leaving the compiler to reorder and optimize as it sees fit - this ABI freedom would be a huge benefit to nlvm when it comes to efficiency tricks like this - the C backend could also do many of them.
Due to alignment. And it's hard to gain it back because the sum type is reified and can be used independently from where it is embedded. Consider:
type
  SomeEnum = enum
    strVal, intVal, nothing
  Node = object # size: 3 words
    flags: uint8
    case e: SomeEnum # merged with flags into a machine word
    of strVal:
      s: string
    of intVal:
      i: int
    else:
      discard
vs.
type
  Branches = enum
    strVal(s: string)
    intVal(i: int)
    nothing
  Node = object # size: 4 words
    flags: uint8
    b: Branches
In theory you can flatten it. In practice there will be code that uses the Branches type which implies you have to unflatten it sometimes which makes the optimization much less useful.
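The size difference can be checked with today's Nim by emulating the second layout with a separate case object embedded in the node. This is only a sketch: MergedNode, Branches and SplitNode are names I made up, and the exact byte counts depend on the platform and on how string is represented in your Nim version, so none are quoted here:

```nim
type
  SomeEnum = enum
    strVal, intVal, nothing

  # Discriminator sits next to flags and shares their machine word.
  MergedNode = object
    flags: uint8
    case e: SomeEnum
    of strVal:
      s: string
    of intVal:
      i: int
    else:
      discard

  # The sum type as a standalone object, usable on its own.
  Branches = object
    case e: SomeEnum
    of strVal:
      s: string
    of intVal:
      i: int
    else:
      discard

  # flags and the embedded discriminator each get padded separately.
  SplitNode = object
    flags: uint8
    b: Branches

echo "merged: ", sizeof(MergedNode), " bytes"
echo "split:  ", sizeof(SplitNode), " bytes"
```

Without field reordering or flattening by the compiler, the split version comes out larger, because the embedded object has to keep its own aligned layout.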
Sorry to dredge up a two-week old post, just wanted to say I think this is a very important topic
Particularly for anyone who wants to write things like interpreters, expression/query languages, etc. Having an ergonomic representation for ADT's makes a world of difference there.
In theory you can flatten it.
Not only flatten, but also reorder the fields (roughly by size). These are two "common" optimizations outside of C/C++ that I think we could adopt in Nim, but that would require said ABI feature.
In practice there will be code that uses the Branches type
Compared to the status quo, this is a new capability that you gain when you have split the type. Existing code cannot do this simply because it is not factored that way, and if you do factor the code this way (because you want to be able to write functions for the "branches" part alone), you already have to create a separate type, and thus run into the same problem.
Basically, we can add tagged types to the language without removing case objects - the latter would serve for the special case that you outline, until we get ABI flexibility.
Particularly for anyone who wants to write things like interpreters, expression/query languages, etc. Having an ergonomic representation for ADT's makes a world of difference there.
Well I'm one who has written things like interpreters, expression/query languages and not one "who wants to". And let me tell you: No. It does not make a world of difference when you already have case objects.