Totally new to Nim. I've been going through the forums and I think I understand how to achieve what I want, but I'm looking for verification and some clarification.
I want to load a significant amount of data into memory, and then run ad-hoc queries against it (via an http interface). Loading a copy of the data per thread isn't realistic.
On startup, I can load all the data from the main thread. When a request comes in, I can safely and effectively launch X worker threads, feeding each one a partition of the data to operate on. None of the workers write to this shared global data, but they each have their own result structure. Once all the workers are done, the thread handling the requests takes each worker result and merges them together.
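The shape I have in mind, as a minimal sketch with `std/threadpool` FlowVars (the word-count payload is just an invented stand-in for my real queries):

```nim
import std/[tables, threadpool]

# hypothetical worker: counts words in its partition and returns its
# own result table; nothing shared is written
proc countWords(part: seq[string]): CountTable[string] =
  for w in part:
    result.inc(w)

# the request-handling thread fans out one task per partition, then
# merges the per-worker tables into a single result
proc handle(parts: seq[seq[string]]): CountTable[string] =
  var pending: seq[FlowVar[CountTable[string]]]
  for part in parts:
    pending.add(spawn countWords(part))
  for fv in pending:
    result.merge(^fv)   # blocks until that worker is done

let merged = handle(@[@["a", "b"], @["b", "c"]])
doAssert merged["b"] == 2
```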
If I wanted the workers to share a single result (say, a thread-safe dictionary), that isn't particularly well supported in Nim today, right?
Also, I assume it's OK for the main thread to update the shared global data, so long as it applies some type of locking: either globally, by stopping the web server from handling requests while it updates, or with more granularity, where the workers would need to read-lock their partition. I haven't looked at the web component at all, but I assume the main thread could launch the web server in a separate thread, then run an infinite loop with a sleep, checking every minute whether the data needs updating.
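For the coarse-grained variant, roughly this is what I picture (a minimal sketch with `std/locks`; the names are invented, and note the stdlib `Lock` is exclusive, not a read/write lock):

```nim
import std/locks

var dataLock: Lock
initLock(dataLock)

var sharedData = @[1, 2, 3]   # hypothetical global loaded at startup

proc refresh(newData: seq[int]) =
  # the updater holds the lock for the duration of the swap; workers
  # would take the same lock (withLock) before reading their partition
  withLock dataLock:
    sharedData = newData

refresh(@[4, 5, 6])
doAssert sharedData == @[4, 5, 6]
```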
Am I on the right track?
Nim today has two kinds of heaps. By default, you use per-thread, garbage-collected heaps. There is also a shared heap, which is not garbage collected and requires manual allocation.
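For example, shared-heap memory comes from `allocShared`/`allocShared0` and must be freed by hand (a sketch with an invented `Record` type):

```nim
type Record = object
  id: int
  value: float

# allocShared0 hands out zeroed memory from the shared heap; it is
# invisible to every thread's GC, so it must be freed manually
let p = cast[ptr Record](allocShared0(sizeof(Record)))
p.id = 42
p.value = 3.14
doAssert p.id == 42
# ... any thread may read/write through p ...
deallocShared(p)
```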
What you can do is:
In order to update the shared data safely, you will need to use some locking, or perform the operations in an order such that the shared data structure is always valid (if that is possible at all), for example by constructing a parallel structure and updating a single pointer at the end (or doing the same piecewise).
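The "update a single pointer at the end" idea can be sketched with `std/atomics` (the `Snapshot` type and its fields are invented; mind the caveat about freeing the old snapshot):

```nim
import std/atomics

type Snapshot = object
  version: int
  total: int

# readers load `current` and work off that snapshot; the updater builds
# a fresh one off to the side and publishes it with one atomic swap
var current: Atomic[ptr Snapshot]

proc publish(version, total: int) =
  let fresh = cast[ptr Snapshot](allocShared0(sizeof(Snapshot)))
  fresh.version = version
  fresh.total = total
  let old = current.exchange(fresh)
  if old != nil:
    # CAUTION: freeing immediately is only safe if no reader can still
    # hold `old`; real code needs a grace period or reference counting
    deallocShared(old)

proc readTotal(): int =
  let snap = current.load()
  if snap != nil: snap.total else: 0

publish(1, 10)
doAssert readTotal() == 10
```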
First, thanks a lot for helping me out.
I'm sorry to sound stubbornly stuck on my approach, but I'm curious why you'd recommend that I manually allocate memory for the global data. At least according to this post from Araq, as long as the GC'd data is created by the main thread, I can share it. That's an old post, though, so maybe it's no longer true?
I came up with this preliminary code:
import std/[math, threadpool]

proc run(threadCount: int) =
  let chunkSize = int(ceil(data.len / threadCount))
  for i in 0 ..< threadCount:
    let start = i * chunkSize
    let stop = if i == threadCount - 1: data.len else: start + chunkSize
    var slice = data[start ..< stop]
    spawn process(slice)
  sync()
But I realize that this passes a deep copy of slice. If slice is large, this is a significant performance and memory hit.
Instead, I now call
spawn process(addr(slice))
and get the data back via:
proc process(p: ptr seq[Data]) {.thread.} =
  var data = cast[ptr seq[Data]](p)[]
  ...
This seems to be working, and it seems to be quite efficient. Of course, if I plan on having my master thread update data, I'll need to add some locks. That's fine.
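One variant I'm also considering, which sidesteps taking `addr` of the loop-local slice entirely (that pointer could outlive the iteration it was made in), is to pass a pointer to the whole seq plus an index range, so each worker reads its window in place. A sketch (the `Data` fields and the summing payload are invented):

```nim
import std/threadpool

type Data = object
  value: int

proc process(p: ptr seq[Data]; start, stop: int): int =
  # read this worker's window through the pointer; nothing is
  # deep-copied per task
  for i in start ..< stop:
    result += p[][i].value

proc run(data: var seq[Data]; threadCount: int): int =
  let chunkSize = (data.len + threadCount - 1) div threadCount
  var pending: seq[FlowVar[int]]
  for i in 0 ..< threadCount:
    let start = i * chunkSize
    let stop = min(start + chunkSize, data.len)
    pending.add(spawn process(addr data, start, stop))
  for fv in pending:
    result += ^fv

var data = newSeq[Data](10)
for i in 0 ..< data.len:
  data[i] = Data(value: i + 1)
doAssert run(data, 3) == 55   # 1 + 2 + ... + 10
```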
In general, is this reasonable? More specifically, it seems like the advice around cast is don't use it unless you know what you're doing. So I don't know what I'm doing, but I'd like to learn. What are the dangers/pitfalls? This feels "safe" to me because I know that my global GC'd data is going to outlive the call to process. Is there something else to fear?
I have to say I am really not sure myself, but I would guess that the line
var data = cast[ptr seq[Data]](p)[]
performs a copy of the data in the thread-local heap. But I might be wrong.
Actually, I think it creates a copy of the pointer (which is mostly harmless / useless). I believe I should just be doing:
var data = p[]
I'm still not sure about the general approach, but thanks, it seems OK so far. Maybe I'll get lucky and someone else will chime in.
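One caveat on that last line: Nim seqs are value types, so `var data = p[]` still copies the whole seq into the thread-local heap; only the `cast` was redundant. To read the shared data in place, index through the pointer instead (sketch with a hypothetical `Data` type):

```nim
type Data = object
  value: int

proc total(p: ptr seq[Data]): int =
  # p[][i] dereferences and reads element i in place; no seq copy
  for i in 0 ..< p[].len:
    result += p[][i].value

var data = @[Data(value: 1), Data(value: 2), Data(value: 3)]
doAssert total(addr data) == 6
```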