Here is what I have so far:
http://nimrod-lang.org/blog/concurrency.html
http://nimrod-lang.org/blog/concurrency2.html
Please comment but don't reddit yet!
"You can simulate shared memory via message passing (which is what modern hardware does, to some extent) and you can use message passing to simulate shared memory."
Might want to look at that sentence again.
How do you assign to it over and over?
let it = b
while it != nil:
  if it.word == word:
    inc it.counter
    return
  it = it.next
Orion: Might want to look at that sentence again.
I don't see the problem with the sentence?
Orion: How do you assign to it over and over?
That should probably be:
var it = b
...
Some more feedback:
The example actually illustrates what I think are the major issues:
Furthermore, there's the following:
(Note that most of these are issues with the implementation and/or style; I think the basic approach is mostly sound, though I'd still like to give it a closer look once I have more time.)
[1] As an aside, I'm not a big fan of naming types based on computer science jargon. Types such as Unit and Future make perfect sense in the context of the work that introduced them, but are likely to be confusing to a newcomer who isn't immersed in the background.
*(1) No support for shared ref (yet). You'll notice that you'll never be able to free the contents of the hash table, as it contains data from 256 different worker threads, all mixed up.*
No. Every thread uses allocShared. It's all in the same heap.
I note also that allocShared() and allocShared0() currently have an int argument rather than a type argument, so it may be that you have changes planned that I don't know yet.
Indeed. allocShared gets a safer interface. It now returns a typed shared ptr.
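For what it's worth, here is a sketch of what such a typed interface could look like. The generic signature and the name allocSharedT are my guesses, not the actual planned API; allocShared0 is the existing untyped primitive:

```nim
# Hypothetical typed wrapper over the raw shared allocator.
proc allocSharedT[T](): ptr T =
  # allocShared0 returns zeroed memory, so the object starts in a
  # well-defined state.
  cast[ptr T](allocShared0(sizeof(T)))

# Usage sketch:
type Counter = object
  word: array[0..31, char]
  counter: int

let c = allocSharedT[Counter]()  # typed pointer into the shared heap
c.counter = 1
deallocShared(c)                 # still freed manually
```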
(4) As I've mentioned before, there seems to be currently no support for read/write locks. Having multiple concurrent readers is critical for a number of performance-sensitive applications. A simple example is a parallel matrix multiplication operation, where you want to be able to obtain read locks rather than write locks on at least one of the input matrices.
Yeah, well, I don't have enough experience with read/write locks to integrate them properly. Of course, wrapping the POSIX stuff works, and then you need the lockfree environment that I mention to keep the type system happy. But feel free to improve the system!
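For reference, wrapping the POSIX primitives is straightforward; a minimal sketch (the pthread signatures follow pthread(3), but the template names and the overall shape are made up, not a proposed stdlib API):

```nim
type PthreadRwlock {.importc: "pthread_rwlock_t",
                     header: "<pthread.h>".} = object

proc pthread_rwlock_rdlock(lock: ptr PthreadRwlock): cint
  {.importc, header: "<pthread.h>".}
proc pthread_rwlock_wrlock(lock: ptr PthreadRwlock): cint
  {.importc, header: "<pthread.h>".}
proc pthread_rwlock_unlock(lock: ptr PthreadRwlock): cint
  {.importc, header: "<pthread.h>".}

template withReadLock(lock: var PthreadRwlock, body: stmt) =
  # Multiple readers may hold the lock concurrently.
  discard pthread_rwlock_rdlock(addr lock)
  try:
    body
  finally:
    discard pthread_rwlock_unlock(addr lock)

template withWriteLock(lock: var PthreadRwlock, body: stmt) =
  # A writer waits until all readers have released the lock.
  discard pthread_rwlock_wrlock(addr lock)
  try:
    body
  finally:
    discard pthread_rwlock_unlock(addr lock)
```

The hard part, as discussed below, is not the wrapping but teaching the type system that read-locked data must not be mutated.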
(5) It looks as though spawn() and sync() are intended to work at the OS thread level (unless I'm misunderstanding you, but this is what the reference to createThread() seems to imply). Creating an OS thread is potentially a pretty heavyweight operation, and Cilk and other languages that use this construct have their own scheduler with lightweight threads for that reason (I think the Cilk reference manual mentions that creating one of their threads is only moderately more expensive than a procedure call).
No; it indeed follows Cilk's approach: spawn hands the task to an underlying thread pool rather than creating an OS thread.
(6) It is unclear how spawn() and sync() deal with thread-local variables. There does not seem to be a place to provide hooks to have them initialized/cleaned up, and Nimrod does not allow them to be initialized in their declaration. (Actually, that's a bit of an issue with the current way Nimrod handles them, too.) While you can add such code in the main function, thread-local variables are often a hidden implementation detail of a module and the caller may not know how to initialize them.
That's the price of "light-weight" tasks, and why there are both spawn and createThread. If your design maps thread-local storage to task-local storage, you need more control and thus createThread.
(7) I'm not clear what Future[T] [1] being read destructively means when somebody actually tries to read it twice. Will it throw an exception, return nil or something else, etc.? I'm also not sure why it's necessary to introduce an operator for this rather than a function.
An operator is cooler. ;-) Whether it returns default(T) or raises an exception depends on what the implementation suggests. IMO it needs to be as fast as possible. However, read could also transmit any exception that the spawned proc raised to the thread that performs the read.
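For illustration, a minimal sketch of the default(T) variant of destructive reads (an entirely made-up implementation; the real Future[T] and its read operator may well behave differently, e.g. raise instead):

```nim
type Future[T] = object
  ready: bool
  value: T

proc `^`[T](f: var Future[T]): T =
  ## Destructive read: the first read yields the value; any later
  ## read yields default(T) (alternatively, it could raise).
  if f.ready:
    f.ready = false
    result = f.value
    f.value = default(T)   # drop our copy so it can be collected
  # else: result is already default(T)
```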
Araq: No. Every thread uses allocShared. It's all in the same heap.
Not the strings that you store (in the line x.word = word). When a string gets assigned, the deep copy implicitly does a shared allocation (at least looking at the code that's currently being generated).
word is declared as an array of char for this reason, but the code indeed uses == and =. Will correct it. And the type system catches the bug you mention: string can't be embedded in a shared ptr.
Araq: word is declared as an array of char for this reason, but the code indeed uses == and =.
Ah, I had overlooked it in the object definition (the inc procedure takes a string argument).
Araq: And the type system catches the bug you mention: string can't be embedded in a shared ptr.
This will severely limit its usefulness, then (I'm assuming the same holds for seq). In your example, you're essentially forced to either limit the maximum word length that you can store or to duplicate functionality that you already have elsewhere.
The underlying problem is that there's no shared ref (yet), which I would still argue is the more important use case (you're simply losing too much core functionality of Nimrod when you're forced to manually manage memory).
Araq: Yeah well, I have not enough experience with read/write locks to integrate them properly.
In principle, they do not work much differently from mutexes, except that you have to specify whether you want a read lock or a write lock (your deadlock avoidance approach should carry over, more or less). At the language level, you must also be able to guarantee that any data for which you only acquire a read lock does not actually get modified for the duration of the read lock.
The last example in the first part is confusing, since entry 9 is not protected by lock 0 and there is no indication where this mapping comes from. Before the example, a sentence like this could be added: "This system requires the programmer to know which lock corresponds to which variables. In the following case we will distribute the array of ints evenly so that eight consecutive values are handled by the same lock."
In the second article spawn is a teaser with "passes a thread proc to some underlying thread pool implementation and causes it to be run concurrently". spawn, group and sync imply that there is some form of serialisation. That seems to map well to GCD; do you plan on implementing Nimrod's thread pools on top of such systems (be it GCD or other)?
gradha: > That seems to map well to GCD; do you plan on implementing Nimrod's thread pools on top of such systems (be it GCD or other)?
I don't plan any GCD integration, but it seems like it's possible in theory so perhaps later we can add that.
"You can simulate shared memory via message passing (which is what modern hardware does, to some extent) and you can use message passing to simulate shared memory."
It says you can simulate shared memory with message passing, AND that you can use message passing to simulate shared memory: the same thing twice.
Araq: That essentially means we have not lock levels but read lock levels and write lock levels, right? That also means we need guarded read ptr and guarded write ptr afaict.
Neither seems to be necessary. A total or partial ordering that makes mutex acquisition deadlock-free will also make read/write lock acquisition deadlock-free (been there, done that). Read/write locks have extra potential for deadlock only because two threads that simultaneously want to upgrade a read lock to a write lock will deadlock, and any irreflexive order prevents that. There is no need for separate guarded ptr types, either: all you need to ensure is that a value locked as readonly in a lock statement is not modified within that lock statement (i.e., does not have any of its fields assigned to and does not get passed as a var argument to a procedure). Example (with made-up syntax):
type L = ...
var a, b: L
proc reset(x: var L) = ...
...

lock a.rwlock as readonly, b.rwlock as readwrite:
  b.data = a.data  # correct
  a.data = b.data  # compile-time error
  reset(b)         # correct
  reset(a)         # compile-time error
Edit: Nevermind, I think I misinterpreted the {.guard.} pragma in part 2 to replace the protection mechanism in part 1 rather than to complement it.
Looking at the {.guard.} pragma, I'd recommend an improvement to it.
Right now, if you want to protect an entire object by it (which is a rather common case), you have to tag each and every field. That's just not very extensible. It would be nice to be able to do something like:
type Lockable = guarded ptr object {.inheritable, guard: accessLock.}
  accessLock: Lock[0]

template exclusive(obj: Lockable, body: stmt): stmt =
  lock obj.accessLock:
    body
to guard an entire object in case you don't want/need to micromanage access to individual fields.
Jehan: It would be nice to be able to do something like: object {.inheritable, guard: accessLock.}
Yeah ok. And your proposal that includes reader/writer locks looks good too.
Jehan: The underlying problem is that there's no shared ref (yet), which I would still argue is the more important use case (you're simply losing too much core functionality of Nimrod when you're forced to manually manage memory).
Ok, here is an idea to deal with this problem: We introduce shared ref, shared string and shared seq but implement them as a pair (pointer to allocator, ptr) and require manual memory management for them: An assignment of the form sharedRef = someRef then translates into: GC_ref(someRef); sharedRef = (<thisAllocator>, someRef). Since this runs in the thread that owns the allocator this requires no synchronization.
However, the memory needs to be freed explicitly with GC_unref. Since this might be invoked by some other thread, it would require synchronization. Instead we put these unref requests into a (protected) lockfree queue that belongs to the allocator. When the GC that belongs to this allocator runs again, the queued entries are processed (the GC decrefs them).
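To make the idea concrete, a rough sketch of the (allocator, pointer) pair scheme described above. Allocator, LockfreeQueue, thisAllocator and toShared are made-up names for illustration only:

```nim
type
  Allocator = object
    pendingUnrefs: LockfreeQueue[pointer]  # drained by the owning GC

  SharedRef[T] = object
    allocator: ptr Allocator  # the heap/thread that owns the data
    data: ref T

proc toShared[T](someRef: ref T): SharedRef[T] =
  # Runs in the thread that owns the allocator, so no
  # synchronization is needed for the GC_ref.
  GC_ref(someRef)
  SharedRef[T](allocator: thisAllocator(), data: someRef)

proc release[T](x: SharedRef[T]) =
  # May be called from any thread: instead of unref'ing directly,
  # enqueue the pointer; the owning GC decrefs it on its next run.
  x.allocator.pendingUnrefs.enqueue(cast[pointer](x.data))
```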