Just showing off some work I did this weekend porting some internal code to the nesper library. It's an RPC library taken from nim-json-rpc but with async support cut out -- shout out to the Status-IM folks for the handy rpc macros! It supports either JSON or MessagePack as the serialization format over TCP sockets, meaning you can re-use all of your standard json-rpc libraries and knowledge in the embedded world. :-)
Normally JSON on an embedded device is tedious, and the libraries tend to be relatively large (in embedded terms), so many prefer Firmata (based on MIDI) or similar. Nim makes it a cakewalk, however:
import nesper/servers/rpc/rpcsocket_json
# import nesper/servers/rpc/rpcsocket_mpack

# Setup RPC Server #
proc run_rpc_server*() =
  var rt = createRpcRouter(MaxRpcReceiveBuffer)

  rpc(rt, "hello") do(input: string) -> string:
    result = "Hello " & input

  rpc(rt, "add") do(a: int, b: int) -> int:
    result = a + b

  rpc(rt, "addAll") do(vals: seq[int]) -> int:
    result = 0
    for x in vals:
      result += x

  echo "starting rpc server on port 5555"
  logi(TAG,"starting rpc server buffer size: %s", $(rt.buffer))
  startRpcSocketServer(Port(5555), router=rt)
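For anyone wondering what a client would send to the server above, here is a rough sketch that builds JSON-RPC 2.0 request bodies with Nim's std/json. The `makeRequest` helper is hypothetical (not part of nesper), and I'm assuming the router takes positional `params` as a JSON array the way nim-json-rpc does. How the bytes are framed on the TCP socket is not shown here.

```nim
import std/json

# Hypothetical helper (not part of nesper): builds a JSON-RPC 2.0 request string.
proc makeRequest(id: int, meth: string, params: JsonNode): string =
  var req = newJObject()
  req["jsonrpc"] = %"2.0"
  req["id"] = %id
  req["method"] = %meth
  req["params"] = params   # assumed: positional arguments in a JSON array
  result = $req

# "hello" takes one string; "addAll" takes one seq[int], hence the nested array.
let helloReq = makeRequest(1, "hello", %*["world"])
let addAllReq = makeRequest(2, "addAll", %*[[1, 2, 3]])

echo helloReq
echo addAllReq
```

Any off-the-shelf json-rpc client on the desktop side should be able to produce the same payloads.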
I did some profiling on the overhead. It takes about 12ms using MessagePack and 18ms using JSON when passing an array of 300 ints. Given the ping RTT on my network to an ESP32 Feather/Huzzah is about 4ms (broadcast), that means the time to service the RPC request is about 8-14ms. Not bad! Note it's compiled using ESP-IDF debug and Nim release settings.
The RPC doesn't use async since I'd like to use ARC instead of ORC to avoid that overhead. Instead it uses select on the LwIP socket library. I'd like to explore implementing an async library using the FreeRTOS event loop, but that's beyond me at the moment.
Does anyone have ideas on how to make an async library for Nim using the esp's event loop? Could it be done using ARC only?
--opt:size -d:strip -d:noSignalHandler
There's also a program named sstrip that you can try.
> The RPC doesn't use async since I'd like to use ARC instead of ORC to avoid that overhead.
Have you measured the overhead? What is it like?
> Does anyone have ideas on how to make an async library for Nim using the esp's event loop? Could it be done using ARC only?
I'm quite sure it can be done, other purely refcounted systems have async too (Rust, C++, Swift). But how the callback types are set up needs to be redesigned then. That was so much work for us that we gave up and added a cycle collector to ARC. I hope with strategic acyclic and cursor annotations the overhead of the cycle collector can be pushed into the "noise" level.
> Have you measured the overhead? What is it like?
Sort of, but not with the RPC setup. It doesn't seem too bad with asynchttp, but it'd need a proper HTTP load tester instead of just curl.
> I'm quite sure it can be done, other purely refcounted systems have async too (Rust, C++, Swift). But how the callback types are set up needs to be redesigned then. That was so much work for us that we gave up and added a cycle collector to ARC.
That makes complete sense. It looks like it'd be a lot of work to redesign. Rust wasn't able to use async in embedded (with #![no_std]) for a long time either, though more because they didn't have their standard library.
> I hope with strategic acyclic and cursor annotations the overhead of the cycle collector can be pushed to the "noise" level.
That'd be excellent, and seems plausible. From basic testing, the cycle collector overhead doesn't seem too bad, but for my embedded use cases I'm worried about any pause occurring at the wrong time.
For the high-speed ADC devices I use, even a few µs is enough to drop readings since DMA doesn't work with them. Using ARC it's pretty easy to reason about delay times... though as I write this I'm rethinking the ORC pause times. My tight ADC loops will avoid any heap allocations/deallocations anyway, so if ORC only runs on a decrement then I'd be good!
Is it correct that the ORC cycle collector will only run on GC ref/decref operations?
Currently I'm writing my code to use the FreeRTOS xQueue primitives for sharing data between cores and keeping my RPC calls purely synchronous, which fits my use case, but it'd be fun to have a full async RPC system.
Also, it's good to note that ARC works perfectly on the ESP32 multi-core heap! I had to do a GC_ref to keep ARC from deleting the data after passing it on the FreeRTOS data queue.
@Yardanico / @juancarlospaco -- I'll try some of those flags
> Does anyone have ideas on how to make an async library for Nim using the esp's event loop? Could it be done using ARC only?
First step would be implementing support for it in the selectors module. Mostly just need an implementation for these functions, but in practice you can get away without implementing all of them: https://github.com/nim-lang/Nim/blob/devel/lib/pure/selectors.nim#L46-L230.
Once you add support there things should just work. Of course, you'll still need to figure out if you can wrangle ARC for your use case. I am playing with embedded devices nowadays myself, so this might be the motivation I need to figure out a solution here. But to be honest, for my use cases wrapping the standard ESP libraries is enough (HTTP server is all I need).
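To give a feel for the surface a port would have to cover, here is a minimal sketch of the selectors API as it behaves on a desktop POSIX build (epoll/kqueue), using a one-shot timer and a user-triggered event. It is not ESP32 code; an LwIP/FreeRTOS backend would need to supply equivalents of `registerTimer`, `registerEvent`, and `select`. The string labels and the bounded retry loop are just for illustration.

```nim
import std/selectors

# Each registered handle carries a string label as its user data.
let sel = newSelector[string]()

# One-shot timer that fires after roughly 10 ms.
discard sel.registerTimer(10, oneshot = true, data = "timer")

# A user event that application code can trigger by hand,
# e.g. from a driver callback (that use is an assumption here).
let ev = newSelectEvent()
sel.registerEvent(ev, "user-event")
ev.trigger()

var seen: seq[string]
for attempt in 0 ..< 20:            # bounded so the sketch can't hang forever
  for key in sel.select(100):       # wait up to 100 ms for ready handles
    seen.add sel.getData(key.fd)    # look up the label for the ready handle
  if seen.len >= 2:
    break

echo seen
sel.close()
```

Nim's async dispatcher is built on top of calls like these, which is why a selectors backend is the natural first step.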
> First step would be implementing support for it in the selectors module. Mostly just need an implementation for these functions, but in practice you can get away without implementing all of them: https://github.com/nim-lang/Nim/blob/devel/lib/pure/selectors.nim#L46-L230.
Oh nice, selectors already work for networking events. I'll look into adding support for timers! I'm not familiar with Nim's async or with adding custom user-driven events, which is what I'd need to async-wrap events on the ESP32.
> Once you add support there things should just work. Of course, you'll still need to figure out if you can wrangle ARC for your use case. I am playing with embedded devices nowadays myself, so this might be the motivation I need to figure out a solution here. But to be honest, for my use cases wrapping the standard ESP libraries is enough (HTTP server is all I need).
Well, asynchttp does work with ORC pretty well out-of-the-box! For most cases, ORC is probably fine as @Araq notes. I'd recommend asynchttp over the ESP http library. It's much nicer, and there's an example in simplewifi. I haven't looked into TLS yet, though. I'm hoping there are some Nim libraries for that, as the C-based ones are beasts.
@Araq I ran a simple HTTP load test on my ESP32 Huzzah/Feather. Here's the command:
echo 'GET http://192.168.1.2:8181/' | vegeta attack -format=http -connections 1 -keepalive -max-body -1 -rate 20 | vegeta encode > results.json
Results:
Requests [total, rate, throughput] 386, 20.05, 20.04
Duration [total, attack, wait] 19.262s, 19.25s, 12.358ms
Latencies [min, mean, 50, 90, 95, 99, max] 10.332ms, 15.513ms, 14.694ms, 19.289ms, 20.773ms, 29.024ms, 98.589ms
It's not a perfect comparison, but it's only a few ms slower than the RPC calls running with ARC and no async, which is exciting. Though if the requests come in too quickly it seems to crash after a bit. Likely an ESP/FreeRTOS lock issue with their select API.
> Is it correct that the ORC cycle collector will only run on GC ref/decref operations?
Correct. And only on decrefs that are not decrefs to zero.
But you can also disable it via GC_disableMarkAndSweep and run GC_fullcollect when you know you can afford it. These operations are now poorly named, there is no "mark and sweep" backup GC and a "full collection" isn't a full collection either... I have no idea if we should simply introduce new APIs instead of emulating the old ones.
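A minimal sketch of that pattern, assuming an ORC build (e.g. `--gc:orc`): the cycle collector is switched off around a section that must not pause, then run explicitly afterwards. `Node`, `makeCycle`, and `timeCriticalSection` are made-up names for illustration; only the three `GC_*` calls are the real API.

```nim
type
  Node = ref object
    next: Node                 # lets us build a reference cycle

proc makeCycle() =
  # Two nodes pointing at each other: plain refcounting cannot free these,
  # so they are left for the cycle collector.
  let a = Node()
  let b = Node(next: a)
  a.next = b

proc timeCriticalSection(): int =
  # Stand-in for a tight loop (e.g. ADC sampling) that must not be paused.
  for i in 0 ..< 1000:
    result += i

GC_disableMarkAndSweep()       # under ORC: stop automatic cycle collection
makeCycle()                    # garbage cycles may pile up in the meantime
let total = timeCriticalSection()
GC_enableMarkAndSweep()        # allow automatic cycle collection again
GC_fullCollect()               # under ORC: run the cycle collector now

echo total
```

The trade-off is that cyclic garbage accumulates while collection is disabled, so the explicit collect should land somewhere a pause is acceptable.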
> I have no idea if we should simply introduce new APIs instead of emulating the old ones.
I think it's a good idea; memory management is evolving with 1.4, so now is a good time to introduce GC API changes. IMHO, it would clarify things, because calling GC_disableMarkAndSweep on ARC / ORC is simply confusing.