nimforum mirror - Nim language aspects, that I don't learn to like

Krux02 [2016-03-09T23:24:28+01:00]

In the past months, I've done some serious Nim programming, including writing a lot of macros in Nim, and if you are interested in what I am working on, here is a link. Don't get me wrong because of this list: I really like Nim, it does a lot of things right, and I will continue using Nim. The list should rather show that I do care about how Nim develops:

Identifier equality

I am already not a fan of case insensitive languages, because the style of using capital letters and lower case letters varies very much from programmer to programmer, and things can become hard to read for no good reason, but style insensitivity takes this to another level. First of all, it creates name conflicts in areas where it's not necessary. For example, in OpenGL there is the type GLfloat and the constant GL_FLOAT; these names can't be forwarded to Nim, because they are equal under the rules of identifier equality. The second reason I do not like this design decision is that it breaks compatibility with already existing general-purpose tools like grep. For grep a solution has been introduced, called nimgrep. But this doesn't solve the problem: I have already got used to the program called ag, and ag has not been fixed, so I have to get used to another tool depending on whether I am in a Nim project or not. Also, the search function in common editors doesn't work that well anymore. As a recap, this is the identifier equality function from the documentation:

proc sameIdentifier(a, b: string): bool =
  a[0] == b[0] and
    a.replace(re"_|–", "").toLower == b.replace(re"_|–", "").toLower
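
To make the GLfloat / GL_FLOAT clash concrete, here is a minimal sketch of the rule quoted above, assuming a recent Nim where strutils provides toLowerAscii. The proc name sameIdent is made up for this illustration, and the regular expression is replaced by a plain underscore removal, so the en-dash alternative is left out:

```nim
import strutils

# Hypothetical helper for illustration only; not the compiler's implementation.
proc sameIdent(a, b: string): bool =
  # The first character is compared exactly; the rest ignores case and '_'.
  a[0] == b[0] and
    a.replace("_", "").toLowerAscii == b.replace("_", "").toLowerAscii

doAssert sameIdent("GLfloat", "GL_FLOAT")   # the OpenGL type and constant collide
doAssert sameIdent("fooBar", "foo_bar")
doAssert not sameIdent("Foo", "foo")        # only the first character is case sensitive
```

A wrapper generator therefore has to rename one of the two OpenGL names, since both normalize to the same identifier.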

move semantics/destructors/RAII

I come from modern C++, and I really like the unique_ptr type and all the stuff related to move semantics. It fits nicely into the language, and it allows me to describe the ownership of objects with the support of the language. All my objects get removed, with no memory leaks and no garbage collection overhead; all I need to do is understand the concept and add a few std::move here and there, and that's it. For C library functions that return pointers with ownership, I can simply write a small wrapper that returns std::unique_ptr, and from there on I can't forget to free the resources anymore.

Nim has destructors, but this quote in the documentation makes me hesitate to use them:

Destructors are still experimental and the spec might change significantly in order to incorporate an escape analysis.

the void type

I've been programming in Scala for quite some time, and the Unit type really made a lot of things nicer. Everything has a result type and is therefore an expression. Every loop or if statement has a type and can therefore be used in another expression. This makes generic programming much easier, because there is no such thing as a procedure without a return value. There is only the return value Unit, which doesn't hold any information. This makes it possible to program without treating a void return type as a special case.

proc foo( s : string ) : int = s.len
proc foo( a : float ) :  string = $a
proc foo( i : int   ) : void =
  echo "doing nothing in foo"

proc bar[T](arg : T) : void =
  echo arg.foo

bar("hallo")
bar(123.456)
bar(7) # doesn't work

proc default to void, not auto

proc square(x : float) = x * x          # doesn't work
proc square(x : float) : float = x * x  # does work

This is quite contrary to variable initialization, where type inference is used when the type is left out:

let square = x * x           # does work
let square : float = x * x   # does work

Just for comparison, Scala does default to type inference:

def square(x : Float) = x * x           // does work
def square(x : Float) : Float = x * x   // does work, too

loops and iterators, algorithms for collections

Nim has quite a few algorithms for the seq type in sequtils, but they are hard-wired to the seq data type; I can't use these procedures with any other type of container. In contrast, the C++ algorithm header provides generic procedures that work on any collection, as long as it provides compatible begin and end iterators. This is by far not perfect either, but at least the algorithms provided work on basically every collection.
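
As a point of comparison, a generic proc in Nim can iterate over anything that provides an items iterator, so container-independent algorithms are at least expressible. A minimal sketch; the names Countdown and total are invented for this example and are not part of sequtils:

```nim
type Countdown = object
  start: int

# A user-defined container only needs an `items` iterator to take part.
iterator items(c: Countdown): int =
  var i = c.start
  while i > 0:
    yield i
    dec i

# Works for any type C whose `items` yields int.
proc total[C](c: C): int =
  for x in c:
    result += x

doAssert total(@[1, 2, 3]) == 6            # seq
doAssert total([4, 5]) == 9                # array
doAssert total(Countdown(start: 3)) == 6   # custom type: 3 + 2 + 1
```

The complaint still stands for the procs that sequtils ships, which take and return seq; the sketch only shows that the language itself does not force that choice.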

Here is a nice talk about something that can be done in C++, and I think it should be approached in Nim, too: STL Concepts and Ranges

seq and string init with nil

The seq and string types work, in my experience, exactly like their C++ equivalents (std::vector, std::string), with the exception that they can be nil. I do not understand why they can be nil; I wish the documentation would explain more about the internals here, like the Go documentation does for slices. The default value of nil caused several segmentation faults in my code, even in macros, where they are harder to catch. I would like the compiler to force me to initialize variables of these types either explicitly to nil or to an explicit value. In the best case, I would like a compiler that doesn't allow me to set these types to nil at all.

seq doesn't expose capacity

In C++, a std::vector consists of three pointers, which give these methods their meaning: begin, end, capacity. In Nim it's not much different, except that I am not allowed to see the capacity. I did some ugly casting to be able to see the capacity eventually, but I wish that had not been necessary, and I wish I could get a capacity proc for the seq type.

semantic whitespace

At first I thought I would get used to the semantic whitespace, since there are a lot of Python programmers who seem to like it, and I agree that it is shorter in a lot of cases, since no line is required for the closing curly brace }. But there are some things that just don't work, namely when the result of some code block needs to be passed to a procedure. In Scala this works:


def foo(s : String) = s + s

val x = 7
val xx = foo(
  if( x % 2 == 0) {
    "even"
  } else {
    var s = "o"
    s += "dd"
    s
  }
)

Here is the broken version of it in Nim:

proc foo(s : string) : string = s & s

let x = 7
let xx = foo(
  if x mod 2 == 0:
    "even"
  else:
    var s = "o"
    s = s & "dd"    # column(5) Error: ')' expected
    s
)

algebraic types / pattern matching

I've used the Nim version of the tagged union, and no, it's nothing in comparison to the algebraic types of Rust or the programmable pattern matching of Scala. Even the type switch in Go seems nicer, because it prevents me, at compile time, from accessing members that are not accessible.
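
For readers who have not used it, this is roughly what the Nim tagged union (an object variant) looks like. A small sketch with invented names (ShapeKind, Shape, area); the point of the complaint is that nothing ties a case branch to the fields that are valid in it, so a wrong field access is only caught when the program runs:

```nim
type
  ShapeKind = enum
    skCircle, skRect
  Shape = object
    case kind: ShapeKind
    of skCircle:
      radius: float
    of skRect:
      w, h: float

proc area(s: Shape): float =
  # The case statement must cover every kind, but the compiler does not
  # narrow the type per branch: s.radius in the skRect branch would compile.
  case s.kind
  of skCircle: 3.0 * s.radius * s.radius   # 3.0 is a crude stand-in for pi
  of skRect: s.w * s.h

doAssert area(Shape(kind: skRect, w: 2.0, h: 3.0)) == 6.0
doAssert area(Shape(kind: skCircle, radius: 1.0)) == 3.0
```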

Just a simple example in Scala:

case class MyTypeA(a: Int, b: Int, c: Int)
case class MyTypeB(s1: String, s2: String)

def foo(a: Any): Unit = {
  a match {
    case MyTypeA(1, 2, 3) => println("123")          // allows exact matches
    case MyTypeA(a, b, c) => println(a + b + c)      // allows naming
    case MyTypeB(s1, s2) if s1 == s2 => println(s1)  // allows more complicated matches (a name cannot be bound twice, so use a guard)
    case MyTypeB(s1, s2) if s1.length == s2.length => println(s1 + s2) // allows arbitrary conditions
    case x: MyTypeB => println(x.s1 + x.s2)          // allows matching just on the type
  }
}

zielmicha [2016-03-10T00:00:25+01:00]

I share most of these opinions. Especially, "void" not being a real type is annoying, requiring a lot of special cases in generic code.


proc foo[T](v: T) = discard

foo() # doesn't work
foo[void]() # works

OderWat [2016-03-10T00:03:57+01:00]

Just to pick one out: Identifier equality ... Mostly because my brain just wants to see things as equal if they actually are!

Araq [2016-03-10T00:16:33+01:00]

Lots of points I can understand (I don't agree with most of them fwiw), but I cannot see where this example comes from:

proc foo( s : string ) : int = s.len
proc foo( a : float ) :  string = $a
proc foo( i : int   ) : void =
  echo "doing nothing in foo"

proc bar[T](arg : T) : void =
  echo arg.foo

bar("hallo")
bar(123.456)
bar(7) # doesn't work

Why should it work? The code makes no sense.

Especially, "void" not being a real type is annoying, requiring a lot of special cases in generic code.

unit would not even remove the parameter in the first place, so I don't see the point. Feel free to use type Unit = tuple[] instead.
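
A sketch of what that suggestion could look like for the example above, assuming tuple[] (the empty tuple) is accepted as a proc return type. The counter handled is added here only so the behaviour can be checked, and bar binds the result instead of echoing it:

```nim
type Unit = tuple[]

var handled = 0

proc foo(s: string): int = s.len
proc foo(a: float): string = $a
proc foo(i: int): Unit =
  # Nothing is assigned to result, so the default (empty) Unit value is returned.
  echo "doing nothing in foo"

proc bar[T](arg: T) =
  # Every overload now yields a value, so no special case for void is needed.
  let r = foo(arg)
  discard r
  inc handled

bar("hallo")
bar(123.456)
bar(7)   # compiles, because foo(int) returns Unit instead of void
doAssert handled == 3
```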

Araq [2016-03-10T00:21:26+01:00]

Mostly because my brain just wants to see things as equal if they actually are!

My brain wants to focus on other things than hunting foo_bar vs foobar or other such nonsense. Ever wondered why the typo "their" instead of "they're" is so widespread? Because human brains internalize the sound of words, not their spelling.

OderWat [2016-03-10T00:39:49+01:00]

You would need to implement phonetic-fuzzy matching to support your logic.

In fact I find it much easier to stick to "one version". Besides that, I don't read my code out loud, and I am pretty sure that I do not internalize words in code by their sound. I don't even know how to pronounce some of the words I use all day in my code. I make them up from short patterns of letters and numbers.

BTW: The brain is also the cause of most errors we want to even out in using computers. Because that is what computers do well. I don't want it to guess what I mean, I want it to obey my code to the point.

Araq [2016-03-10T01:58:35+01:00]

The brain is also the cause of most errors we want to even out in using computers. Because that is what computers do well.

I don't see any connection between "Cannot find .Bashrc" and "evening out errors". Heck, lots of languages don't have the crazy distinction between upper case letters and lower case letters at all. So these languages cannot "even out errors"? Bizarre.

I don't want it to guess what I mean, I want it to obey my code to the point.

Yeah, that doesn't make any sense. Most of the time it's you who has to obey to the names somebody else chose. And often enough the names are horseshit ("dot bash reference counter? wtf?")

OderWat [2016-03-10T02:37:19+01:00]

I can only say that I did not miss this feature in 34 years of programming. Also, I do not feel "dominated" at all by other people's naming conventions, as long as all of them follow "the rules" for a project or language. It is fine if the compiler "understands" all the other conventions, but I am not the compiler. For me, ".bashrc" is something different than ".Bashrc". That is no extra burden for my brain. As I said: This is how it works!

Krux02 [2016-03-10T03:09:26+01:00]

@Araq

Ok, the unit type example is quite bad and not really representative. Maybe I should just remove it or find a better example.

As for case sensitivity, I think we have very different opinions here. When you have something called foobar, refer to it as foo_bar, and get an error about it, you get annoyed because you want to focus on something else. I, on the other hand, see this as a reminder to use a consistent style everywhere. And I like it, because it improves readability. Maybe we could get the best of both worlds with some tool support that automatically updates usages of identifiers to the style of their declaration. With such a tool in place, tools like grep would work again with their old reliability.

# code like this
var fooBar = 0
foo_bar = fo_ObaR - 1
# could be automatically and safely converted to this
var fooBar = 0
fooBar = fooBar - 1

And for those of you who know German, some fun:

var BlumentoPferde = 0
BlumentopfErde += 1
var UrinStinkt = 2
UrInstinkt += 3
var DuSchlampe = 4
DuschLampe += 5

OderWat [2016-03-10T03:26:24+01:00]

First thing I see reading your code is the typo "UrInstrink" :)

P.S.: Helft den armen Vögeln! ("Help the poor birds!")

andrea [2016-03-10T10:26:02+01:00]

For me, the most annoying thing is

proc default to void, not auto

I see no good reason for this: it is inconsistent with how variable declaration works, and it requires a little more typing everywhere.

I think it would be possible to change the default to auto without breaking backward compatibility:

procs declared with return type auto would still be auto

procs declared with return type void would still be void

procs declared without a return used to be void - now they would become auto, which hopefully still infers void.

I think this could be worth doing: there should be few issues, and the language would be more consistent and less verbose as a result.
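
For reference, the inferred form has to be requested explicitly at the moment. A minimal sketch, assuming the auto return type behaves as described in the manual (logOnly is a made-up name):

```nim
proc square(x: float): auto = x * x   # return type inferred from the body

doAssert square(3.0) == 9.0
doAssert square(3.0) is float

proc logOnly(x: float) =              # no return type given: this is void
  discard x * x                       # a value cannot be dropped silently

logOnly(2.0)
```

Under the proposal above, the first declaration could drop the ": auto" part, while the second would keep its meaning as long as inference still arrives at void.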

Tarmean [2016-03-10T11:17:16+01:00]

I see the style insensitivity similarly to how I see it in PowerShell. It is useful for one-off stuff that is hacked together or for interfacing with code in other styles, but as soon as I want to reuse it, I want it to be consistent. Having some flag/pragma that forces a certain style module-wide might be nice, but something like gofmt would probably be a better solution.

auto as default return type would be nice. You can't throw away values anyway so I don't think current code would have to be changed. Only thing I could see would be accidentally changing the return type but that would be obvious very quickly because suddenly the compiler prompts you to discard. Seems like checking multiple times for the same mistake?

I personally am still somewhat annoyed that there is no way to figure var type parameters out when looking at the call. Maybe nimsuggest could fix that via syntax highlighting, together with different highlights for ref/ptr types and deep/shallow/custom assignments.

Jehan [2016-03-10T16:18:45+01:00]

move semantics/destructors/RAII

In my opinion, not needed if you have (1) garbage collection and (2) reasonably powerful higher-order functions or macros. Personally, I even think that RAII is broken, but that's another story.
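
One way to read this: scoped cleanup can be written as a template around try/finally instead of a destructor. A minimal sketch with a made-up resource (Handle, acquire, release and withHandle are all hypothetical names; the events log exists only to make the order of operations visible):

```nim
type Handle = object
  id: int

var events: seq[string] = @[]

proc acquire(num: int): Handle =
  events.add("acquire " & $num)
  Handle(id: num)

proc release(h: Handle) =
  events.add("release " & $h.id)

# The template owns the lifetime: the body runs between acquire and release,
# and the finally branch also runs if the body raises.
template withHandle(h: untyped, num: int, body: untyped) =
  let h = acquire(num)
  try:
    body
  finally:
    release(h)

withHandle(res, 1):
  events.add("use " & $res.id)

doAssert events == @["acquire 1", "use 1", "release 1"]
```

Unlike unique_ptr, this ties the resource to a lexical scope; handing ownership to another scope is exactly what the pattern does not express.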

proc default to void, not auto

I'm in the opposite camp, I find the way that it is in Scala annoying. Procedure declarations are part of the interface and should be spelled out in full, not inferred from implementation details. Defaulting to auto breaks information hiding big time. (The same also goes for other forms of type inference for procedure declarations.)

loops and iterators, algorithms for collections

This is where I like ML functors (i.e. modules that can be parameterized by other modules) better. D style ranges are a specific solution to a specific problem and lack generality.

seq and string init with nil

They are this way because unlike in C++, they're references underneath, not values.

I find that this isn't much of a problem in practice, because I always initialize variables explicitly. I would like it if variable initialization could be required so that the compiler checks it, though, and that's one of my bigger irritations with Nim. I.e. insofar as there's an issue, it's not limited to seqs and strings.

seq doesn't expose capacity

Agreed, though that should be fixable. In any event, that's at most a performance concern and one that can generally be avoided.

algebraic types / pattern matching

I agree that object variants can be unnecessarily tedious, though straight up ADTs also have some annoying limitations (Scala avoids most of them by unifying the concepts of ADTs and inheritance). Pattern matching is a bit oversold outside of textbook examples, IMO; it can be nice to have, but there are alternatives. Personally, better OOP support is further up my priority list.

coffeepot [2016-03-10T16:36:56+01:00]

Style insensitivity

Just a different viewpoint: To me the style insensitive thing is just another aspect of case insensitivity. At the end of the day, having GLFloat and GL_FLOAT be different things is not really a great way of working for me. FWIW I'm a fan of case insensitivity though for the same reason you're not a fan of them, funnily enough, namely:

because the style of using capital letters and lower case letters varies very much from programmer to programmer, and things can become hard to read for no good reason

With CS you have to use their style else it's an error, potentially causing mixing of styles and making it harder to read, whereas with CI it doesn't matter.

Really, conflict occurs with things like OpenGL simply because the source file in C++/C is case sensitive, not because it's necessarily a good idea to have similar variable names with an underscore making their meaning different. It makes more sense to me to specify why they're different; GLFloat and glFloatID or whatever. I realise CI/CS is a hot topic in the programming world though.

I wonder if c2nim could be updated to handle this better but I guess most of the time it's impossible for the machine to reason what the original dev intended.

RAII

If this can be done as a library I'd be interested in it, because I think it would help embedded systems reduce memory footprint whilst not having to go full manual management.

loops and iterators, algorithms for collections

+1 for generic collection algos. I feel like this might just be because the language has evolved over time but rewriting core STL stuff has not had a high priority.

seq capacity

Yeah this would be nice. A simple change surely.

seq/string not nil

I've rambled about this before, and I agree it's odd that strings in particular are initialised to nil. I find it difficult to understand why you'd ever want a nil string though - but I do wonder if there's a performance reason for this internally.

Seqs I'm in two minds about. Sometimes I'll have a seq that is nil if it's not used (e.g. in a variant branch), so I guess there you'd have to balance eager initialisation costs and != nil checks. However, defining a seq as not nil should get around this anyway, right?

I personally am still somewhat annoyed that there is no way to figure var type parameters out when looking at the call.

I feel like this is a tooling problem more than anything.

However, it does slow me down with type mismatches. Type mismatch errors:

Report the error position at the opening bracket, rather than the position of the parameter that causes the mismatch (unless this has been changed lately),

List the parameter types by their call site type rather than the 'root' type (so type aliases have to be looked up - I don't consider this a bad thing, but because you don't get an index to the offending param, it can add to the time needed to work out which param is wrong),

Don't show the var-ness of parameters.

Because of these together, it can sometimes get awkward to work out where the issue is. If type mismatches reported the index of the offending parameter and/or showed 'var' for var parameters, this would resolve a lot of potential issues for newcomers and head scratching when you get a type mismatch and the parameter lists look identical.

semantic whitespace

Is this an issue with semantic whitespace or just an issue with anonymous functions? That code wouldn't error if it wasn't inside brackets, right?

Krux02 [2016-03-11T00:08:10+01:00]

Is this an issue with semantic whitespace or just an issue with anonymous functions? That code wouldn't error if it wasn't inside brackets, right?

The problem is the semantic whitespace within the open parenthesis (. Nim expects a statement to be exactly one line, unless the line ends with an operator or there are still open parentheses. Then the next line is parsed as a continuation of the same statement. Therefore there can't be semantic whitespace within the parentheses, which is a clear limitation. Of course Nim has a solution to that: it allows creating a StmtListExpr in parentheses, but only with a special syntax that needs parentheses, too.

Jehan [2016-03-11T05:46:43+01:00]

There's a (hackish) workaround around the multiline issue:

import macros

macro multiline(e: untyped): untyped = e[6]

proc foo(s : string) : string = s & s

let x = 7
let xx = foo(multiline do:
  if x mod 2 == 0:
    "even"
  else:
    var s = "o"
    s = s & "dd"
    s
)

You can also use parenthesized expressions to work around this:

proc foo(s : string) : string = s & s

let x = 7
let xx = foo(
  if x mod 2 == 0:
    "even"
  else: (
    var s = "o";
    s = s & "dd";
    s
  )
)

And if you don't like the semicolons, use block: instead:

proc foo(s : string) : string = s & s

let x = 7
let xx = foo(
  if x mod 2 == 0:
    "even"
  else: (block:
    var s = "o"
    s = s & "dd"
    s
  )
)

Araq [2016-03-11T11:05:03+01:00]

There is no issue with indentation based parsing per se here. The issue is legacy. We support

let x = if bar: ...
    else: ...

And so the indentation rules are different for statements vs expressions. I think we can fix it if we require the 'else' to be always on the same column as the 'if'. Last time I brought up this issue people preferred the currently implemented parsing rules. But IMHO we should change it to further unify statements and expressions. I'm not sure it's completely do-able though.

jibal [2016-03-11T15:05:59+01:00]

("dot bash reference counter? wtf?")

FYI: According to the jargon file, rc stands for "runcom", from the CTSS system, circa 1962-63.

Krux02 [2016-03-13T00:31:08+01:00]

@Araq As I am relatively new to Nim, I have no problem at all with getting rid of any legacy stuff. Simplifying the parsing rules reduces the amount of allowed Nim code, and therefore also the complexity a human has to deal with to understand other people's code, so yes, I agree with those simplifications. But if you break compatibility, you should at least offer code migration tools.

Mirror of forum.nim-lang.org

2111 :: Nim language aspects, that I don't learn to like