nimforum mirror - Nim improvement process

Krux02 [2016-08-10T16:49:07+02:00]

First of all, this is a meta topic: it is not about Nim itself, but about the development process for Nim.

I really like this programming language and see a lot of potential in it. I would like to contribute to make it a better language, but I do not know how I can contribute most effectively. To make this clearer, let me give an example. I recently rediscovered the assignment operator (operator =) and destructors, but I think they are not ready to use.

Calling discard on a value returned from a proc creates a leak (no =destroy is called). Top-level variables (globals) do not get destroyed. The assignment operator is used for both first-time initialization and reassignment, but it is unanswered whether operator = should deinitialize (destroy) the left operand. And there is this compile error:


proc `=destroy`(obj: var MyType)
for a 'var' type a variable needs to be passed, but 'a' is immutable

Not everything I mentioned is a bug; some of it might be intentional. But I think each item is at least worth talking about.

If I write a proposal on how to solve these problems, I would like the best possible feedback on it before implementing it. On the other hand, I would like my proposal to be able to gain some official status, so that it does not get lost over time just because of some inactivity. I am thinking of a formal protocol that proposals should go through.

My current options to start a proposal are to create a thread here on this forum or to create an issue on GitHub. Recently I wrote an article here on the forum and got the offer that it could be published as a "guest blog article". So I like the idea of starting as a forum post and then transitioning into something of more weight. To make it short, I don't have a solution for this yet, but I would like to start the discussion here.

Araq [2016-08-10T18:07:59+02:00]

Destructors are inherently difficult to design and I'm well aware they are currently unusable. :-)

There are mostly 2 different ways to design them: C++ style, which assumes the compiler cannot do any lifetime analysis but then allows for certain assignment optimizations in the spec (and more recently requires them, iirc), and Rust style, which bases it on top of its superb lifetime analysis. The beginnings of a lifetime analysis for Nim are in compiler/writetracking.nim. This computes "deep immutability" and whether the memory coming from a parameter is escaping, without more source annotations. So ... first we need to get this into production-ready shape before tackling destructors...

Araq [2016-08-10T18:53:26+02:00]

Regarding the development process: Forum posts like yours are perfectly acceptable to start a discussion, but then the results should be extracted into a "Nim enhancement proposal" (NEP); these are for now kept on the GitHub wiki. (We only have NEP-1 so far, the Nim style guide.)

Krux02 [2016-08-10T19:10:53+02:00]

If I may ask, what is the goal of this escape analysis? I know that in Java and Go escape analysis is used to find objects that can be allocated on the stack instead of the heap, because the programmer does not have the power to decide whether something needs to be on the heap or on the stack.

As far as I know, the current version of Nim has explicit stack variables exactly like C++. Do you want to secretly allocate those variables on the heap in case their addresses escape (extending their lifetime like Go), or do you want to throw a compilation error like Rust?

Araq [2016-08-10T19:37:01+02:00]

Stack vs heap doesn't concern me all that much. Here is an example of code that is currently allowed but shouldn't:

var heap: cstring

proc cstringEscapes(c: cstring) =
  heap = c

var x = "Nim string"
cstringEscapes(x)

For convenience, conversions to cstring are implicit, but this is only safe if the cstring cannot escape the proc. The same reasoning can be extended to destructors: only if the file handle does not escape do we need to call the destructor. However, here linear typing helps so that the file handle stays "unique". There is an important distinction to be drawn between "might escape" (concerns for safety) and "does always escape" (so we cannot leak it).

Krux02 [2016-08-10T22:19:48+02:00]

Why do you want the code to be illegal? I can imagine use cases where something like that is exactly what you want. I do agree that this is a bit ugly. And how far do you want to go? Since cstring is a pointer type, should this be illegal, too?:

var heap: ptr int

proc intPtrEscapes(ip: ptr int) =
  heap = ip

var i = 123
intPtrEscapes(i.addr)

If you really want to prevent all dangerous usages, you can restrict cstring so that it is only allowed as a parameter type for procedures. Then you can be sure that when someone wants to keep the value of a cstring somewhere, he needs to call $ or something similar to copy the value.
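As a minimal sketch of the copying approach just described (keepCopy and kept are hypothetical names for illustration only, not from the thread): converting the cstring with $ produces an independent Nim string, so nothing keeps pointing into the caller's buffer after the call.

```nim
var kept: string              # hypothetical global that outlives the call

proc keepCopy(c: cstring) =
  # `$` on a cstring allocates a new Nim string and copies the bytes,
  # so `kept` does not alias the caller's buffer.
  kept = $c

var x = "Nim string"
keepCopy(x.cstring)           # the pointer is only borrowed for the duration of the call
x.setLen(0)                   # changing x afterwards does not affect the copy
echo kept
```

The cost is one allocation and copy per stored string, which is the price for never holding a raw pointer past the call.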

But isn't cstring the type you mostly need for writing wrappers? The code where you are often forced to do some "unsafe" operations in order to get it running?

Tarmean [2016-08-10T22:59:15+02:00]

Well, addr and ptr are explicitly unsafe. cstring isn't, and it will still be invalidated once x goes out of scope. Ideally it would require you to make a deep copy of the cstring somewhere or to explicitly shallow copy it, I guess?

Krux02 [2016-08-10T23:38:07+02:00]

That's exactly where I think the problem is. cstring should be explicitly an unsafe type, and users who use it should know about how it behaves. It's a type that is just for wrappers, where you have to deal with unmanaged pointers anyway.

And no matter how you treat the cstring, it will be an exception to every other type if you do not treat it just like a pointer. Otherwise it will be its own special reference type that needs its own special handling in escape analysis.

OderWat [2016-08-11T02:37:19+02:00]

Don't forget that there are other backends like JavaScript and PHP which use cstring as the "native string type". This needs to be decoupled from the "c-wrapper" meaning of cstring. I think there may be monsters...

Krux02 [2016-08-11T09:49:54+02:00]

First of all, there is a PHP backend‽ Second, isn't the usage of cstring for JavaScript wrappers an abuse of a type whose name already clearly says it's for C? For me cstring will continue to be nothing more than an alias type like this:

type cstring {.unchecked.} = ptr array[0, char]

And I would truly be surprised if it were anything other than exactly this. Generally, programmers don't like surprises. I would even go so far as to say that cstring should only be defined when Nim is on the C backend. But don't take my words on this too seriously, since I don't have experience with the JavaScript backend, nor did I ever truly dig into the JavaScript world.

cjxgm [2016-08-11T10:03:10+02:00]

Well, I think we discussed this before, but the "c" in cstring means "compatible", not the C language, and it's a magic type which may be different on different backends instead of an unchecked ptr to an array of char.

cstring* {.magic: Cstring.} ## built-in cstring (*compatible string*) type

OderWat [2016-08-11T12:17:54+02:00]

@Krux02 yes, there is an (experimental) PHP backend, which is actually even used as a fallback in production for Nim-driven Zend Modules. @Araq works for our company so we can experiment more freely :)

Besides that: cstring is a very important part of interfacing with JavaScript too. Just look at the types in nim-screeps

Krux02 [2016-08-11T12:31:43+02:00]

According to the manual:

The cstring type represents a pointer to a zero-terminated char array compatible to the type char* in Ansi C. Its primary purpose lies in easy interfacing with C

There is no mention of any usage of that type in JavaScript.

Maybe something like this would make it a bit more transparent than a magic pragma:

when cbackend:
  type cstring {.unchecked.} = ptr array[0, char]
else:
  type cstring = ref otherstringtype

OderWat [2016-08-11T13:13:48+02:00]

@Krux02 looks nice to me ... why don't you try it? I did that in the past too for some ideas of mine which were supposed to be easy. But then I found out that stuff is seldom as simple as it seems :(

Araq [2016-08-11T14:53:03+02:00]

Well, that became off-topic rather quickly...

cstring used to mean "C string", now it means "compatible string", a weird backronym, and the manual needs an update.

Whether my example should produce an error or a warning or nothing is beside the point. The real question is: What do you want destructors for? File-like stuff? Getting rid of the GC eventually? Reference counting? Optimizing the RC ops introduced by the existing GC? Unique and borrowed pointers? How much of this lifetime tracking is done at runtime and how much is done at compile-time? Is the analysis control flow dependent?

Krux02 [2016-08-11T17:32:41+02:00]

I think I have the answer to my original question already. I will just write a proposal about what I would like to change here in the forum. It then has full potential to be moved into an official document, whatever form that might take. I really like the interactive part here, and the forum is not so crowded that a thread would disappear too quickly.

To answer your questions, I would really like to use destructors for anything that would otherwise need a defer, meaning closing files, closing connections, closing dynamically loaded libraries, and freeing memory. And if I use them for that, they really need to be called reliably at the end of the scope, not like Java finalizers. I already don't use garbage collection, and the fact that this is possible today is one reason why I like Nim. I would use the ref type only for things where the ownership is not only shared, but the lifetime of each owner is independent of the other, for example when I need to share an object between threads. But that rarely happens; in my current code, not at all. Unique pointers from C++ are nice, but they need move semantics in the language, which do not exist in Nim, at least not explicitly.
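The scope-bound cleanup asked for here can be sketched with Nim's existing defer statement. This only illustrates the desired behaviour (release runs reliably when the scope exits); it does not show how destructors are or would be implemented. Resource, acquire, release, and events are hypothetical names introduced for the example.

```nim
# Hypothetical resource type, standing in for a file or connection.
type Resource = object
  name: string

var events: seq[string] = @[]   # records the order of operations

proc acquire(name: string): Resource =
  events.add("open " & name)
  Resource(name: name)

proc release(r: Resource) =
  events.add("close " & r.name)

proc useResource() =
  let r = acquire("config")
  defer: release(r)             # runs when useResource exits, also on an exception
  events.add("use " & r.name)

useResource()
echo events
```

A working destructor would make the defer line unnecessary: the cleanup would be attached to the type once instead of being repeated at every use site.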

Mirror of forum.nim-lang.org

2456 :: Nim improvement process