nimforum mirror - Testing `newruntime` `owned ref` as to cyclic data...

GordonBGood (orginal) [2019-08-26T12:30:57+02:00] view original

As you may have noticed in many of my recent posts, I have been a convert to the purported advantages of the newruntime over both Garbage Collection and common or even automatic ref reference counting as a means of not needing to do manual memory management of heap memory allocation/deallocation while being more compatible with threading than the current Garbage Collector, with the main two advantages over common ref reference counting as follows:

Performance advantages in that the actual reference counting is only done as a check that there are no "dangling" ref's left when the owned ref is destroyed, as these were said to be very costly in performance for threading when they must be atomic.

The newruntime was said to prevent memory leaks with memory not able to be reclaimed due to mutual references by cyclic data structures not allowing the reference count to drop to zero.

However, we find the following in the current implementation:

Reference counting is not able to be turned off as the source code says that would be an "unsafe" option. Now, this decision could revert back to the original specification when the newruntime has been fully implemented and tested, but meanwhile this implementation will be just as slow as common reference counting, especially for the atomic counting necessary for threading.

2. According to the following testing, at least cyclic data causes a crash so it is detected when reference counting is on as per the following code:

# test cyclic owned...

# compile with "--newruntime"

type
  ListObj = object
    head: int
    tail: owned (ref ListObj)
  List = owned (ref ListObj) # without parenthesis, "Error: region needs to be an object type"
proc `=destroy`(x: var ListObj) =
  echo "destroying ListObj" # to know when its being destroyed
  x.head.`=destroy`

template own[T](x: T): owned T = # complement to `unown`
  var rslt: owned T
  cast[ptr T](rslt.unsafeAddr)[] = x
  rslt

proc main() =
  proc makeones(): List = # a cyclic list of ones...
    result = List(head: 1); result.tail = own(unown result)
  let ones = makeones()
  echo "head of the list type and value: ", ones.typeof, " ", ones[]
  echo "first tail of the list type and pointer value: ", ones.tail.typeof, " ", cast[int](ones.tail)
  var onesv = unown ones
  stdout.write "The first 10 values of the cyclic list are:  "
  for _ in 1 .. 10:
    stdout.write onesv.head, " "
    onesv = unown onesv.tail; if onesv == nil: break
  "".echo

main()

with the output of the above program on Wandbox as follows:


head of the list type and value: List (head: 1, tail: ...)
first tail of the list type and pointer value: owned ref ListObj 140515572523088
[FATAL] dangling references exist
The first 10 values of the cyclic list are:  1 1 1 1 1 1 1 1 1 1
destroying ListObj
1

However, just because a crash is caused by cyclic data when reference counting checks are on, it isn't actually handling cyclic data as the current Garbage Collector does, meaning the newruntime will break current code that uses cyclic data.

Also, if the code is modified to destroy-chase the tail(s), the code goes into a recursive race with resulting stack overflow, which isn't a desirable outcome either. In order to make this work so as not to break any possible current code that uses cyclic data (not really a good idea), the work would need to be extended to do a "deepDestroy" as suggested in the Bacon/Dingle article as an extension in order to handle cyclic data, where first the non-owned ref's are cleared, then the owned ref's, and finally the whole structure can be destroyed. This can't be done with just object destructors as ref's and owned ref's are created by compiler magic and we can't make custom destructors for them.

If this example and my thinking are correct, we are doing a lot of work on something that doesn't seem to achieve the goals, unless we can extend some ideas to include this case?

cdunn2001 (orginal) [2019-08-27T20:30:01+02:00] view original

I'm not sure I see your point. Yes, existing cyclic datastructures would require significant modification in this new regime. But is your code even legal? Aren't you violating the single-ownership model?

Araq (orginal) [2019-08-28T08:23:24+02:00] view original

This can't be done with just object destructors as ref's and owned ref's are created by compiler magic and we can't make custom destructors for them.

How to deal with that problem is part of the spec, see .nodestroy.

If this example and my thinking are correct, we are doing a lot of work on something that doesn't seem to achieve the goals, unless we can extend some ideas to include this case?

I hope, it will achieve the goals that I claimed, I never claimed to care about this list-business, it's a dead-end for performance ever since we got hardware caches in the 90ies.

Reference counting is not able to be turned off as the source code says that would be an "unsafe" option.

Yes, you keep making this claim. It doesn't suddenly become true though. A dangling ref detector can always be turned off, just like I can decide to not run my code through Valgrind.

template own[T](x: T): owned T = # complement to `unown`
  var rslt: owned T
  cast[ptr T](rslt.unsafeAddr)[] = x
  rslt

Well at this point you're on your own and way beyond any proposal for the new runtime. You cannot get ownership back via casting.

GordonBGood (orginal) [2019-08-28T10:38:15+02:00] view original

@Araq: you know I am a convert to the newruntime, I'm just trying to take a step back to look at it from the point of view of those who are reluctant to accept its benefits:

see .nodestroy.

I saw it but also noted that it is deemed not to be ideal; it also only solves one part of the problem of preventing =destroy races when used, but doesn't allow one to apply custom destructors to ref's and owned ref's.

I never claimed to care about this list-business,

I know you don't, and I agree with you that lists have neither good performance nor memory storage efficiency when misused as many functional programmers and especially Haskell programmers tend to do (as in a list of char is a string :-) ), but lists can be useful in expressing some concepts concisely as in outside the code that needs to have high performance. I hark back to it just because I think some will criticize Nim if it doesn't have reasonable support.

A dangling ref detector can always be turned off

I only meant that the dangling reference detector can't be turned off in the current implementation and mentioned that I hoped that means would be provided to turn it of when not necessary (after debugging) in the final version.

You cannot get ownership back via casting.

I wasn't really trying to get ownership back; I used that casting just in order to get a cyclic structure, as I couldn't think of another way to generate one within the newruntime and I wanted to test whether the newruntime really does work or not with cyclic data of which a cyclic list is the simplest example.

I would really like to see how one legally generates a cyclic anything in `newruntime`, as if they can't be generated legally (and especially if the compiler could detect illegal use such as this, but perhaps not as casting is unsafe), then cyclic data isn't a concern.

This would be similar to Elm, where cyclic data is absolutely forbidden at any level where it would interfere with destruction, but being a purely immutable language and without pointers at all, Elm has the advantage that cyclic data can easily be detected at compile time and as cyclic "infinite" data structures are already detected and rejected, that only leaves generating a cyclic reference by deferred execution, which Elm also detects and rejects.

In contrast, Nim has the problem that a mutable reference that wasn't cyclic can become cyclic because the reference is modified to refer to something further up the tree (which is what I did with the strange casting for the tail of the list), in which case the problem I was trying to demonstrate comes up.

I feel about cyclic data about the same way as you feel about lists, and if you could find some reliable way for the compiler to detect and reject all cyclic data (other than perhaps at the global level as Elm allows), I would greet it with a sense of relief, as we would never have to consider it again other than to help library writers who have used it to replace it with something better!

GordonBGood (orginal) [2019-08-28T10:58:06+02:00] view original

@cdunn2001:

I'm not sure I see your point.

The point I was trying to make is that "reluctant adopters" might see the current implementation with reference counting of "dangling" references that can't be turned off costing execution speed, and needing to completely re-write code that might make high use of cyclic data a reason for rejecting the whole newruntime concept.

For myself, I don't think that cyclic data is a good implementation at any time, as it can cause memory leaks even for Garbage Collection, with which it works better but not perfectly. I would happily see cyclic data rejected completely (other than for unsafe code, of course, where we can't deny it).

But is your code even legal?

Probably not legal and @Araq confirms that; I used unsafe casting just because I couldn't think of another way to generate cyclic data.

Aren't you violating the single-ownership model?

In this example, probably not, as casting only ended up converting an owned ref to the same owned ref and was only done to circumvent the compiler detecting the assignment as a copy, which would be forbidden. I wanted both the list node and the contained "tail" reference to itself be owned, which doesn't circumvent single ownership as long as ownership isn't passed across to another outside entity. For instance, there is nothing wrong with a data structure that is owned from containing any number of nested owned and "dangling" things; the problems come about when an outside longer lifetime reference prevents the dangling ones from being cleared, the owned ones will be destroyed when the outer owner is destroyed.

Araq (orginal) [2019-08-28T12:23:20+02:00] view original

I feel about cyclic data about the same way as you feel about lists, and if you could find some reliable way for the compiler to detect and reject all cyclic data (other than perhaps at the global level as Elm allows), I would greet it with a sense of relief, as we would never have to consider it again other than to help library writers who have used it to replace it with something better!

Well B/D does mention that cycles are really rare but can come up. In Nim the situation is worse than that because we really want to support swap well. However, the cycles that cause trouble in practice for Swift etc. are those resulting from closures pointing back to some "upper layer" and afaict these cycles are prevented. Statically.

GordonBGood (orginal) [2019-08-29T07:32:34+02:00] view original

@Araq:

B/D does mention that cycles are really rare but can come up.

I'd love to see concrete examples of such cycles to see if the end can be achieved by other means...

the cycles that cause trouble in practice for Swift etc. are those resulting from closures pointing back to some "upper layer" and afaict these cycles are prevented. Statically.

Yes, that's how the Elm compiler (version 0.19) rejects cyclic data statically. Without mutation or pointers, they can always be detected as the only other way to cause them is through closures.

However, our closures can contain `var`'s/`ref`'s/`ptr`'s whose contents can be mutated to point back to "some upper layer" after creation, and I don't know that those can be detected. I suppose the same can be done in Swift/etc. and isn't to worry about as pointer manipulations are "unsafe behaviour"?

Having said that, I'm still musing and tinkering whether some variation of these ideas wouldn't work out better for Nim, as adding owned to everything is a pretty disruptive change.

Now that is interesting, as I understood you were completely committed to seeing how `owned` works out for `newruntime`.

Let me just throw an idea out for your consideration; it may be completely wrong, but then I have been before :-) , as follows:

Suppose that the "everything can be `owned`" principle that I raised in issue #12021 which the compiler already supports but you said shouldn't as "the compiler isn't ready for it yet" were to be accepted as the way the newruntime would work.

Further, suppose that the rule of "everything is created as owned" were enforced by the compiler, which "everything" would include all primitives and literals as well as all the other things you already consider as being created owned.

Now, we wouldn't have to worry about adding `owned` everywhere as that would be implicit.

The compiler already detects when we are trying to copy an `owned` and errors out, so why not have it inject an `unown` instead and proceed rather than forcing us to manually inject it.

If this were done, the main problem would be that the programmer couldn't easily tell by code inspection whether a given binding is `owned` or not as it may be difficult to keep track of whether a given use is "last read of" or not, but the compiler always knows through its data flow analysis as the whole B/D implementation for owned depends on it.

Thus, if the programmer tries to do something they shouldn't, the compiler can still give them an appropriate error message, or, if not, the "dangling" detection will catch it at run time if turned on (for debugging or otherwise).

The above is starting to stink of Rust's ideas of ownership, but at least it doesn't have a onerous syntax of lifetimes and reference notations to deal with (I hope; it's worse than the newruntime as already proposed if it does).

The one problem I can immediately see is how to handle the cases (reasonably rare, except for multi-threading) when multiple ownership needs to be handled, and for that I can't see any other way than a deepCopy or a reference counted reference, which are Rust's two solutions to the same problem (Clone traits that can include use of RC/ARC). I don't think we ever want to see system.deepCopy come back again, but maybe we just do something like override system.shallowCopy to actually do a deep copy for these cases, as perhaps shallow Copy isn't usually used, or maybe we have a new system.clone that normally does a shallow copy but can be overridden to do either this deep copy or a reference counted wrapper copy chosen by the programmer as suitable for the particular need?

It seems to me that such a system would offer the advantages of B/D of optional better performance than reference counting or Garbage Collection (especially when dangling checks can be turned off) and at least some detection of cyclic data if not the actual handling of it while not adding too much extra syntax burden and while also supporting an easy workaround when single ownership isn't appropriate.

But, OTOH, I may be completely wrong and such as system isn't possible in Nim... :-)

Mirror of forum.nim-lang.org

5134 :: Testing `newruntime` `owned ref` as to cyclic data...