Hi all,
I have translated into Nim a simple shared queue based on this blog post. Please share feedback on how you would improve it memory/performance-wise.
Thanks!
Could you give an example of concurrent enqueue and concurrent dequeue?
I think a shared queue is (almost) no different from a simple queue if it's only used from a single thread.
Sure, I have updated the code with the following example:
proc producer(q: ptr SharedQueue[int]) =
  q[].enqueue(10)
  q[].enqueue(20)
  q[].enqueue(30)
  q[].enqueue(20)

proc consumer(q: ptr SharedQueue[int]) =
  echo q[].dequeue()
  echo q[].dequeue()
  echo q[].dequeue()
  echo q[].dequeue()
  echo q[].dequeue()
  echo q[].dequeue()

let q = cast[ptr SharedQueue[int]](allocShared0(sizeof(SharedQueue[int])))
var t1, t2: Thread[ptr SharedQueue[int]]
initSharedQueue(q[])
createThread(t1, producer, q)
createThread(t2, consumer, q)
joinThreads(t1, t2)
Thanks. Did you consider using a doubly-linked list as the queue?
Also, that post mentions "Mutexes are relatively heavy", but then he uses lock & unlock for dequeue.
Also, I think even checking whether the queue is empty needs to be atomic or locked to avoid data races.
@Araq:
Looks ok, but it suffers from the usual problem that allocations/deallocations are ignored, and these too can block (in pretty much every allocator that is used in production; note that Nim's is particularly bad since it uses a single global lock in allocShared...).
Dang it! What would you suggest instead? I saw that nodes in sharedlist.nim hold an array instead of a single value, but then it uses a single global lock even for iterating (that's the reason I wrote this in the first place).
But I don't usually write my own concurrent data structures, hence the feedback request :-)
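One common way around the allocator on the hot path (my own sketch, not something from the blog post; all names are hypothetical, and ABA/reclamation issues are ignored for brevity) is to pre-allocate a pool of nodes once and recycle them through a lock-free free list, so enqueue/dequeue never touch allocShared:

```nim
import std/atomics

type
  Node = object
    value: int
    next: Atomic[ptr Node]

  NodePool = object
    free: Atomic[ptr Node]  # Treiber-stack free list of recycled nodes

proc prealloc(pool: var NodePool; n: int) =
  ## One allocShared0 call up front. Intended to run before any
  ## threads start, so relaxed stores suffice while building the list.
  let buf = cast[ptr UncheckedArray[Node]](allocShared0(n * sizeof(Node)))
  for i in 0 ..< n:
    buf[i].next.store(pool.free.load(moRelaxed), moRelaxed)
    pool.free.store(addr buf[i], moRelaxed)

proc acquireNode(pool: var NodePool): ptr Node =
  ## Pop a node from the free list without touching the allocator.
  var head = pool.free.load(moAcquire)
  while head != nil:
    let next = head.next.load(moRelaxed)
    # On failure, compareExchange reloads `head` with the current value.
    if pool.free.compareExchange(head, next, moAcquireRelease):
      return head
  return nil  # pool exhausted

proc releaseNode(pool: var NodePool; node: ptr Node) =
  ## Push a dequeued node back instead of deallocShared-ing it.
  var head = pool.free.load(moRelaxed)
  node.next.store(head, moRelaxed)
  while not pool.free.compareExchange(head, node, moRelease):
    node.next.store(head, moRelaxed)
```

The trade-off is a fixed capacity (acquireNode returns nil when the pool runs dry), but the enqueue/dequeue fast path then never blocks on the global allocator lock Araq mentions.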
@mashingan:
Thanks. Did you consider using a doubly-linked list as the queue?
I don't know; in which way would a doubly-linked list make a better concurrent queue? I thought that having to update both ends of a node (prev, next) would make it slower.
Also, that post mentions "Mutexes are relatively heavy", but then he uses lock & unlock for dequeue.
True, but he did benchmark that. What would you suggest instead?
Also, I think even checking whether the queue is empty needs to be atomic or locked to avoid data races.
What do you mean? dequeueNode() uses a lock to return the result.
I don't know; in which way would a doubly-linked list make a better concurrent queue? I thought that having to update both ends of a node (prev, next) would make it slower.
My bad: since you use CAS for the operation, it indeed only has to update one end. I saw you put a while loop in both enqueue and dequeue, so I thought a doubly-linked list would be faster, O(1) by the way.
True, but he did benchmark that. What would you suggest instead?
A transaction, since you're already using atomics. If you insist on avoiding locks, make the dequeue a transaction, although I don't know which is more costly: a mutex lock-and-unlock or a failed transaction. The original poster said "Mutexes are relatively heavy", but I don't know relative to what.
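For comparison, the usual lock-free alternative to both a mutex and a transaction is a CAS retry loop on the head pointer. A minimal sketch (the type and proc names are my own assumptions, not the thread author's code, and safe memory reclamation is deliberately left out):

```nim
import std/[atomics, options]

type
  Node[T] = object
    value: T
    next: Atomic[ptr Node[T]]

  LockFreeQueue[T] = object
    head: Atomic[ptr Node[T]]  # always points at a dummy node

proc dequeue[T](q: var LockFreeQueue[T]): Option[T] =
  ## Retry loop instead of a mutex: a failed compareExchange just means
  ## another consumer won the race, so reload head and try again.
  while true:
    var head = q.head.load(moAcquire)
    let next = head.next.load(moAcquire)
    if next == nil:
      return none(T)  # queue observed empty
    if q.head.compareExchange(head, next, moAcquireRelease):
      # The old dummy node should be recycled rather than freed here,
      # to avoid use-after-free in concurrent readers.
      return some(next.value)
```

This is the same pattern as in the quoted blog post's enqueue: the cost of contention shows up as retried CAS iterations rather than as blocking on a lock, which is also roughly what a failed transaction would cost you.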
What do you mean? dequeueNode() uses a lock to return the result.
Again, I just realized you're using Option instead of the usual proc that checks whether the queue is empty or not.
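For readers following along: the point of the Option-returning dequeue is that the emptiness check and the removal happen under a single lock acquisition, so there is no window for another thread to slip in between a separate isEmpty call and the pop. A minimal mutex-based sketch of that pattern (my own illustration, not the thread author's actual code):

```nim
import std/[locks, options]

type
  SimpleSharedQueue[T] = object
    lock: Lock
    items: seq[T]

proc init[T](q: var SimpleSharedQueue[T]) =
  initLock(q.lock)

proc enqueue[T](q: var SimpleSharedQueue[T]; x: T) =
  withLock q.lock:
    q.items.add x

proc dequeue[T](q: var SimpleSharedQueue[T]): Option[T] =
  ## Emptiness check and removal happen under one lock acquisition,
  ## so callers can never race between an isEmpty test and the pop.
  withLock q.lock:
    if q.items.len == 0:
      return none(T)
    result = some(q.items[0])
    q.items.delete(0)
```

The caller then pattern-matches on the result (isSome/isNone) instead of calling a separate, racy isEmpty proc.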