I watched this video about concurrency and Python: https://www.youtube.com/watch?v=MCs5OvhV9S4
It's a fun little video. His code from the video is here: https://github.com/dabeaz/concurrencylive
I wanted to see how Nim would handle it. Here was my first stab using asyncio (all gist files at: https://gist.github.com/jots/158e6e1934e7acc9dd40): https://goo.gl/HNpQ6O
Then I ran perf2 against it: https://goo.gl/tp4dI2
It does about 28k-30k requests per second.
Then, from another shell, I run nc localhost 25000 and enter something like 45 to make it do some work, and perf2 drops down to 0 per second (just like Python!).
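For reference, this is the same blocking pattern the video demonstrates. A minimal sketch of my own (not the gist's code) showing why one large fib request starves an asyncio-style event loop:

```python
import asyncio

def fib(n):
    # Naive recursive Fibonacci: pure CPU work with no await points.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

async def handle(reader, writer):
    data = await reader.readline()
    n = int(data)
    # This call runs on the event loop's single thread, so a large n
    # (like 45) stalls every other connection until it returns.
    result = fib(n)
    writer.write(f"{result}\n".encode())
    await writer.drain()
```

While fib(45) is computing, the loop cannot service any other reader, which is exactly the drop to 0 req/s.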
Then I found: http://goran.krampe.se/2014/10/25/nim-socketserver/ and modified it to keep listening on the socket and serve back fib results: https://goo.gl/FCagyA
It does between 25k and 35k per second.
Here's the weird thing though: nc localhost 25000, enter 45,
and the requests per second jump up to over 60k!
What is going on there? Do I have a bug (probably), or does that really somehow magically speed it up?
And how do I make the async version work?
I really like the multi threaded version that Göran came up with. I wonder if anyone has thought of grafting jester on top of it?
That's a really nice test.
The fact that the spawning version's req/s goes up when you manually nc into it is indeed strange. No idea why that might be, and I don't have time to investigate it right now. I would be interested to hear others' thoughts.
As for the async version, the reason it drops down to 0 req/s is that you're blocking the CPU with your fib calculation. The async version does not use multiple threads, so it cannot calculate fib in parallel.
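In Python terms, the usual fix is to push the CPU-bound call off the event loop thread with an executor; the same idea applies in Nim via a thread pool. This is a sketch under that assumption, not the code from the gist:

```python
import asyncio

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

async def compute(n):
    loop = asyncio.get_running_loop()
    # None = the default ThreadPoolExecutor; a ProcessPoolExecutor would
    # additionally sidestep the GIL for real CPU parallelism.
    return await loop.run_in_executor(None, fib, n)
```

With this, other coroutines keep being scheduled while fib(45) churns in a worker.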
joe: What is going on there? Do I have a bug (probably), or does that really somehow magically speed it up?
What you see is kind of the opposite of what is mentioned in the video: it's Linux's scheduling.
Running ./threadfibserver and ./perf2 on a multicore system will put the two processes on different cores. Since these processes are talking to each other all the time, this creates some overhead. If they ran on the same core, the communication would be faster.
If you are running on Linux, you can enforce that by using taskset 01 ./threadfibserver and taskset 01 ./perf2. That will pin both processes to the same core. On my machine, that makes it a lot faster.
When you run an extra heavy process (like calculating fib(44)), Linux will schedule that process onto its own core. With a bit of luck, this also schedules the two other processes onto the same core as a side effect. That has the same result: they communicate a lot faster.
That's probably what you are seeing. You would probably see the same thing with any other heavy process.
This kind of proves the point of the video :-)
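The taskset pinning described above can also be done from inside a program; on Linux, Python exposes it as os.sched_setaffinity. A sketch assuming a Linux host with CPU 0 available:

```python
import os

# Pin the calling process (pid 0 = self) to CPU 0 -- the programmatic
# equivalent of launching it with `taskset 01 ...`. Linux-only API.
os.sched_setaffinity(0, {0})
print(os.sched_getaffinity(0))  # -> {0}
```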
Ah yes, it was perf2/threadfibserver being forced onto the same core that sped it up. Mystery solved! Thanks wiffel! :-)
So it seems to me that @gokr's thread server might be a really nice core for something like jester, using async inside each thread when it needs to wait on multiple IO events to complete. That way, if a user's request is hungry for CPU for some reason, it won't affect the rest of the requests, and you can also write more normal synchronous code inside each thread (db requests in serial, etc.).
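That thread-per-request-with-async-inside idea can be mocked up in Python (purely illustrative; handle_request and its doubling "work" are invented names, not jester or @gokr's server): each worker thread runs its own event loop, so a CPU-hungry request only ties up its own thread:

```python
import asyncio
import threading

def handle_request(n):
    # Each worker thread gets a private event loop via asyncio.run, so
    # blocking or CPU-heavy work here cannot stall other requests.
    async def work():
        await asyncio.sleep(0)  # stand-in for awaiting several IO events
        return n * 2
    return asyncio.run(work())

results = []
threads = [threading.Thread(target=lambda i=i: results.append(handle_request(i)))
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```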
@joe: A small remark.
I'm not sure, but I think the use of spawn in your second version of the code is not correct. It launches an endless (or very long) loop that can potentially block other requests. I tested it on a 1-core machine and it did indeed block.
As far as I understand it, the semantics of spawn are: run concurrently if possible, or just run sequentially otherwise. So the code should still be valid if spawn is removed.
If my understanding is correct, launching a long-running loop with spawn (and expecting the main thread to continue running) seems incompatible with that.
I’m not very sure about this, so maybe somebody more knowledgeable on this topic could comment on this?
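The concern can be mimicked in Python with a fixed-size pool standing in for Nim's spawn threadpool (an analogy only, not Nim's actual implementation): once a long-running task occupies the only worker, later tasks queue behind it:

```python
import time
from concurrent.futures import ThreadPoolExecutor

order = []

def slow():
    time.sleep(0.2)        # stands in for an endless/long serving loop
    order.append("slow")

def quick():
    order.append("quick")  # stands in for a short request

# A pool with a single worker: "quick" cannot start until "slow" finishes.
pool = ThreadPoolExecutor(max_workers=1)
f1 = pool.submit(slow)
f2 = pool.submit(quick)
f1.result()
f2.result()
```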
I created a version of your code using createThread instead. It seems to work OK on a single-core machine: https://gist.github.com/wiffel/98aa977c1ac4f98c6e03
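The difference with createThread is that the long-running loop gets a dedicated OS thread instead of borrowing a pool worker, so the main thread carries on. The Python analogue is threading.Thread (illustrative only):

```python
import threading
import time

events = []

def long_loop():
    # A dedicated thread, like Nim's createThread: it does not consume
    # a worker from any shared pool.
    time.sleep(0.1)
    events.append("worker done")

t = threading.Thread(target=long_loop, daemon=True)
t.start()
events.append("main keeps running")  # main thread is not blocked
t.join()
```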
I'm still tinkering with threadpool and spawn and with the guarantees we can make. It's obvious that parallel should mean "optional parallelism here", but a standalone spawn shouldn't block.