Also, there's the .noSideEffect. pragma which, judging from its definition in the manual, means:
Other effects are not specified in the manual; memory allocations via new() and exceptions are apparently allowed.
There's no tracking of at least:
I assume that allocations are covered by .new. already even when they happen via alloc().
Overall, I like the power it gives; it's the most powerful effect framework I've seen outside of functional programming, but it feels messy and incoherent.
Tracking of reads will be done for globals, but apart from that it wouldn't provide much over simply assuming that everything reachable from the arguments is read. The main reason is that I fear the compiler's memory footprint would go up significantly with more precise tracking.
Syscalls are essentially all marked with a more concrete "tag effect", so I'm not sure what the advantage of tracking them differently would be.
Deallocations cannot be dealt with by the effect system that Nimrod implements. However, tracking of locks is also planned, which means deadlocks can be detected at compile time. To detect data races you need another form of type parametrization, which I'd like to avoid, so it's unclear whether it will be implemented. IMHO it's not incoherent at all. :-)
> "exceptions and memory allocations are allowed"
Only with new(), not with alloc(). Actually, even dealloc(alloc(1)) fails to compile.
I suggest putting the list in the docs.
I've seen the mention of compile time evaluation. But to some extent this could also be used to avoid double computation: if nothing that a function reads has changed since the previous call, it only needs to be called once. Exceptions add some mess, because the compiler has to handle both the case where they get thrown and the case where they don't, but that's still faster than calling twice. Read tracking could improve the precision of this detection, though I have no idea how much good precision would help.
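A hand-written sketch in C of that single-call idea (square and square_cached are illustrative names, not anything the compiler provides; the point is that a compiler with read tracking could emit the caching wrapper itself):

```c
#include <assert.h>

/* Hand-written sketch of the single-call optimization: square stands in
   for any side-effect-free routine, and call_count exists only to observe
   how often it actually runs. */

static int call_count = 0;

static int square(int x) {           /* pure: reads only its argument */
    call_count++;
    return x * x;
}

/* A compiler with read tracking could emit this wrapper itself: while
   nothing the routine reads has changed, reuse the previous result. */
int square_cached(int x) {
    static int have = 0, last_x = 0, last_v = 0;
    if (have && last_x == x)
        return last_v;               /* inputs unchanged: call elided */
    last_x = x;
    last_v = square(x);
    have = 1;
    return last_v;
}
```

Calling square_cached(7) twice runs square only once; exceptions would complicate this, since the wrapper would also have to replay a cached failure.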
Syscalls? I've looked at thread creation: no tag. Now I've looked slightly deeper: IO has tags, and so do timers. So there is indeed some support for effects of syscalls, but it's incomplete.
As to incoherencies:
Araq: any routine with constant arguments which has no side-effects is a candidate for implicit compile time evaluation (this last rule is about to change, it will become a requirement rather than a possible optimization)
Unless I'm misunderstanding something here, this strikes me as being potentially dangerous, since even a side-effect free function can take an indefinite amount of time to evaluate. Or do you mean constant folding/propagation?
While I don't share the concerns about incoherence, I do wonder if it's not simply too complicated and whether most programmers will therefore avoid it. In general, I'm not sure about the cost/benefit ratio, but will wait to see all of it.
Finally, focusing on deadlocks over race conditions also strikes me as an odd trade-off. You can prevent deadlocks without language support if necessary; the same does not hold for data races.
Jehan > Unless I'm misunderstanding something here, this strikes me as being potentially dangerous, since even a side-effect free function can take an indefinite amount of time to evaluate. Or do you mean constant folding/propagation?
Shrug, in my opinion that's an academic problem. The useful endless loops all have side effects and so won't be run at compile time. For the other loops, we can try to run them at compile time and make the compiler bail out after a number of iterations. Endless loops without side effects are bugs; it's better to detect them at compile time.
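The bail-out idea can be sketched as a fuel budget on the evaluator; here is a minimal illustration in C (bounded_eval and the Collatz workload are made up for the example, not the Nimrod compiler's actual mechanism):

```c
#include <assert.h>

/* Illustrative sketch of bounded compile-time evaluation: the evaluator
   gets a fuel budget and bails out instead of looping forever. The Collatz
   walk stands in for an arbitrary side-effect-free loop. */

/* Returns 1 and stores the step count if the loop finished within the fuel
   budget, or 0 if the evaluator had to bail out. */
int bounded_eval(long n, long fuel, long *steps) {
    long count = 0;
    while (n != 1) {
        if (--fuel < 0)
            return 0;                /* budget exhausted: bail out */
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        count++;
    }
    *steps = count;
    return 1;
}
```

A compiler taking this route can then fall back to evaluating the call at runtime whenever the budget runs out, so a slow (or endless) pure loop costs compile time but never hangs the build.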
Finally, focusing on deadlocks over race conditions also strikes me as an odd trade-off.
Well deadlocks are simple to handle with the effect machinery, races are harder. My focus has nothing to do with it. I agree races are the much greater problem in practice.
You can prevent deadlocks without language support if necessary
What mechanism in particular do you have in mind here?
Mścigniew: > "exceptions and memory allocations are allowed" Only with new(), not with alloc(). Actually even dealloc(alloc(1)) fails to compile.
Yes because 'new' can be evaluated at compile-time but 'alloc' cannot.
I've seen the mention of compile time evaluation. But to some extent it could be used to avoid double computation. If nothing that a function reads has changed since the previous call, it can be called once.
Yeah and I have planned an optimizer that makes use of that fact. ;-)
Syscalls? I've looked at thread creation. No tag. Now I looked slightly deeper, IO has tags, so do timers. So there is indeed some support for effects of syscalls, but it's incomplete.
Well we also have FReadEnv/FWriteEnv for tagging of environment variables and FReadDb/FWriteDb for databases etc. Thread creation misses a tag, that's true. What else did we miss?
As to incoherencies: ...
So please make a concrete proposal of how to improve the spec.
Araq: > Endless loops without side effects are bugs, it's better to detect them at compile-time.
They may be attacks on compiler infrastructure. Though a large new() would work like that too...
Araq: > What else did we miss?
Oh well, that would require going through the whole library. I'm not into that task now.
Araq: > So please make a concrete proposal of how to improve the spec.
Tags that take parameters? They seem generic enough to cover all cases, though I don't like that the parameters would sometimes be types and sometimes variables and closures over them.
And it would be good to have .noSideEffect. either removed or defined in terms of the effect system. I prefer the former, because a .noSideEffect. proc may still produce effects; the name doesn't work well.
Araq: > Yes because 'new' can be evaluated at compile-time but 'alloc' cannot.
Is it a current limitation or is there a strong reason to never be able to do it?
Mścigniew: > Tags that take parameters? Seem to be generic enough to cover all cases, though I don't like how sometimes parameters would be types, sometimes variables and closures over them.
Well, parameterized effect systems are exactly what I seek to avoid, as these really are too complex AFAICT.
Araq: The useful endless loops all have side effects and so won't be run at compile-time.
I wasn't thinking so much of endless loops, but of stuff that takes a significant amount of time (especially when run in an interpreted VM).
Araq: What mechanism in particular do you have in mind here?
It depends on how fancy you want to get. The basic idea I was thinking of was the classical and simple mechanism of imposing a partial or total order upon locks and requiring that they be acquired in a way that respects that order. In practice, that will probably already handle 99% of all use cases. But you can also do more complicated things, such as a contract-based approach, if you need it.
My point is that you can do a lot of things with respect to deadlocks (which is mostly just about the interdependence of locking operations) at the library level. Race detection is the more difficult problem to handle, because it generally requires compiler support.
Edit: To be clear, I don't want to discourage you – I find it exciting that someone wants to tackle these problems in a novel way – my questions are the result of genuine curiosity. The overlap of programming languages and concurrency in particular is the core area of my research, so I'm always on the lookout for new ideas.
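A minimal sketch in C of the lock-ordering mechanism described above (ordered_mutex_t, ol_lock, and the fixed levels are illustrative names, not a real pthreads API): each lock carries a level, a thread-local variable records the highest level currently held, and an out-of-order acquisition is refused up front instead of deadlocking later.

```c
#include <assert.h>
#include <pthread.h>

/* Illustrative sketch, not a real pthreads API: each lock carries a fixed
   level, and a thread may only acquire locks in strictly increasing level
   order. Violations are refused immediately instead of deadlocking. */

typedef struct {
    pthread_mutex_t m;
    int level;
} ordered_mutex_t;

/* Highest level currently held by this thread (GCC/Clang __thread TLS). */
static __thread int held_level = 0;

/* Returns 0 on success, -1 if acquiring would violate the lock order. */
int ol_lock(ordered_mutex_t *om) {
    if (om->level <= held_level)
        return -1;                   /* out-of-order acquisition: refuse */
    pthread_mutex_lock(&om->m);
    held_level = om->level;
    return 0;
}

/* The caller passes back the level it held before the matching ol_lock. */
void ol_unlock(ordered_mutex_t *om, int restore_level) {
    held_level = restore_level;
    pthread_mutex_unlock(&om->m);
}
```

If all account locks share one level, the nested lock in the bank example below would be refused immediately, which is exactly the kind of early error this scheme is meant to produce.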
Jehan: I wasn't thinking so much of endless loops, but of stuff that takes a significant amount of time (especially when run in an interpreted VM).
Well with the vm2 branch Nimrod runs code faster at compile-time than Python at runtime. :P But as I said, the compiler can always bail out.
Jehan: It depends on how fancy you want to get. The basic idea I was thinking of was the classical and simple mechanism of imposing a partial or total order upon locks and require that they be acquired in a way that respects that order.
Alright, that's part of what the effect system will provide. It's pretty restrictive though as it doesn't allow for aliasing of locks. There is however another simple mechanism that handles the very important use cases where you need aliasing. But I'll finish my blog post about it soon, so no spoilers here. ;-)
I respectfully disagree with you slightly. ;-) The traditional "bank transaction" example is prone to a rather subtle deadlock:
# transfer all the money from account b to a:
lock a:
  lock b:
    let v = b.value
    b.value = 0
    a.value += v
This almost always works, unless two threads access the same accounts at the same time in such a way that for thread X, a aliases location L1 and b aliases L2, while for thread Y, a aliases L2 and b aliases L1. And even worse, this dangerous form of nested locking is actively encouraged in languages like Java and C# with their "every object can be used as a lock" feature!
I think we can safely agree that Java and C#'s basic shared memory model is pretty much broken (there's plenty of good stuff in C#'s TPL/PLINQ and Java's Fork/Join framework, but they are orthogonal). I tend to refer people to Per Brinch Hansen's write-up as to why that's the case.
The bank transaction example is generally a non-issue. If you have to design a precedence system for bank accounts, the safe and natural choice is to give all locks associated with accounts the same precedence (precedences generally mirror layers in an abstraction/composition hierarchy). This means that in the above example, lock b would immediately result in an error, and you'd rewrite it as:
lock a, b:
  let v = b.value
  b.value = 0
  a.value += v
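One way such a multi-lock statement could avoid the aliasing deadlock is sketched below in C, under the assumption that it acquires the pair in a canonical (address) order; lock_both/unlock_both are illustrative names, not what Nimrod actually generates.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Illustrative sketch: acquire two mutexes in a canonical (address) order,
   so two threads can never hold the pair in opposite orders, even when
   their arguments alias in opposite ways. Assumes a != b. */

void lock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    if ((uintptr_t)a < (uintptr_t)b) {
        pthread_mutex_lock(a);
        pthread_mutex_lock(b);
    } else {
        pthread_mutex_lock(b);
        pthread_mutex_lock(a);
    }
}

void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```

With this trick, thread X calling lock_both(L1, L2) and thread Y calling lock_both(L2, L1) still acquire the two locks in the same underlying order, so the cyclic wait cannot arise.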
My standard example where simple systems break down is a webserver with plugins, where it can be difficult to construct a simple precedence system (especially one where you can be assured that testing will expose bugs).
Another tricky problem is the nested monitor problem, where you have an inner and an outer lock, and a condition variable associated with the inner one. In C:
Thread 1:
  pthread_mutex_lock(outer_mutex);
  pthread_mutex_lock(inner_mutex);
  pthread_cond_wait(inner_cond, inner_mutex);
  pthread_mutex_unlock(inner_mutex);
  pthread_mutex_unlock(outer_mutex);
Thread 2:
  pthread_mutex_lock(outer_mutex);
  pthread_mutex_lock(inner_mutex);
  pthread_cond_signal(inner_cond);
  pthread_mutex_unlock(inner_mutex);
  pthread_mutex_unlock(outer_mutex);
While the inner mutex is implicitly unlocked by pthread_cond_wait() for its duration, the outer mutex is never unlocked, so the second thread can never send the signal and both will block forever.
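For illustration, one conventional way out of the nested monitor problem is to drop the inner lock and associate the condition variable with the single mutex both threads need, so that pthread_cond_wait() releases everything the signaling thread must acquire. A runnable sketch (run_demo is an illustrative driver, not part of the original example):

```c
#include <pthread.h>
#include <stddef.h>

/* Sketch of a fix for the nested monitor problem: one mutex, and the
   condition variable is associated with that mutex, so waiting releases
   everything the signaler needs. */

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
static int ready = 0;

static void *waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&mutex);
    while (!ready)                        /* guards against spurious wakeups */
        pthread_cond_wait(&cond, &mutex); /* atomically releases the mutex */
    pthread_mutex_unlock(&mutex);
    return NULL;
}

static void *signaler(void *arg) {
    (void)arg;
    pthread_mutex_lock(&mutex);           /* no outer mutex to get stuck on */
    ready = 1;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Returns 0 when both threads finish, i.e. no deadlock occurred. */
int run_demo(void) {
    pthread_t t1, t2;
    if (pthread_create(&t1, NULL, waiter, NULL)) return -1;
    if (pthread_create(&t2, NULL, signaler, NULL)) return -1;
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

The interesting part is that a static checker would have to understand the implicit unlock inside pthread_cond_wait() to see why the original two-mutex version deadlocks while this one doesn't.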
At the high end of the complexity scale, there's hairy stuff such as per-node locking for balanced trees, which is extremely difficult to handle both statically and dynamically (though capability-based systems can handle it in principle) [1].
For the sake of completeness, I note that a completely different approach is to accept that there will be errors in deployed programs; such a system could afford to detect deadlocks at runtime (by analyzing the wait-for graph) and recover from deadlocks that occur (using micro-reboots or a similar technique).
[1] Red-black trees are also the standard example where STM works extremely well; of course, STM has other problems if you seek a fully general model.
Jehan: The bank transaction example is generally a non-issue. If you have to design a precedence system for bank accounts, the safe and natural choice is to give all locks associated with accounts the same precedence (precedences generally mirror layers in an abstraction/composition hierarchy)
Alright, that's a very good point. So you don't use the precedence in the implementation of the multi-lock statement.
At the high hand of the complexity scale, there's hairy stuff such as doing per-node locking for balanced trees, which are extremely difficult to handle both statically and dynamically (though capability-based systems can handle them in principle)
Thanks for this reference. I wasn't aware of this paper. (I do know Brinch Hansen's write-up.) This looks vastly complex though, and in particular things like "Extending our core language with method calls is relatively straightforward. For reasons of space, we only sketch this extension." are not encouraging. ;-)
Araq: This looks vastly complex though and in particular things like "Extending our core language with method calls is relatively straightforward. For reasons of space, we only sketch this extension." are not encouraging.
Yeah. :) As always, there's a tradeoff between how much more expressiveness and how many fewer software defects you get for the effort invested; diminishing returns are a concern. Unfortunately, academics get evaluated based upon their publications, not the code they write, so there is limited incentive to maximize implementation effort. (On the other hand, if you want to get into a top-tier CS conference, actual, working code is a huge advantage, and a solid piece of software, once you're over the hump of actually having written it, can be a fairly constant source of new publications; it's just that the initial time investment can be huge.)
Araq said: So let's make {.noSideEffect.} mean {.tags: [], writes: [no global or thread local variable], reads: [no global or thread local variable], raises: [infer].}
This looks like a "no ambient authority" property from capability security. Perhaps a good name would be "parametricReadWrite" or "parametricEffectsOnly", since reads/writes via parameters are the only allowed effects (well, and exceptions).
Some more thoughts: debugEcho() hacks around the effect system, and there are more operations like it, for example debugLog(). On some systems, using stderr for communication between the program and the programmer is not feasible. It would be useful for the effect system to distinguish between two kinds of effects. I don't know if there's established naming for them, so I'll use my own:
Direct Effects: the reason why you call a proc. For example, open() opens a file.
Side Effects: implementation details that just happen. Their presence is independent of the correctness of the function's implementation. They are often desired, but need not be tracked by the programmer or the compiler. For example, calls to logging functions, putting an object in some kind of cache, or creating a weak reference.
If the compiler skipped Side Effects in its analysis, this would enable better separation of APIs from implementations, more concise effect declarations, and more optimisation opportunities for the compiler. Though it would pose a danger of misuse.
You can hack around the effect system yourself by cast'ing the routine you want to invoke to some type lacking the effect annotations. This needs to be documented, of course, but no further features are necessary.
Your idea smells a lot like Java's checked vs. unchecked exceptions, which failed in practice, so I'm skeptical.
If the compiler skipped Side Effects in the analysis, this would enable better separation of APIs from implementations, more concise effect declarations and more optimisation opportunities for the compiler. Though it would pose a danger of misuse.
So the compiler should be allowed to optimize away code that performs logging but is dead otherwise? Sounds dangerous even to me. ;-)
Yeah, but as long as it's a hack, it doesn't make things simpler. It makes them more complex and more unexpected.
I don't know Java, will look into it at a later date.
I disagree that optimizing out procedures that do logging would be overly dangerous. And for quite a few code bases it would be a major win.