The example below fails to compile, complaining that it's not gcsafe, and the let can't be changed into a const.
How should this be handled? Is there a directive to tell the Nim compiler to put a let into globally shared memory?
import mimetypes, threadpool
let mime = new_mimetypes()
proc parse_format(): string =
  mime.get_ext("text/html", "unknown")
var cresp = spawn parse_format()
echo ^cresp
import mimetypes, threadpool
proc parse_format(mime: any): string =
  mime.get_ext("text/html", "unknown")
var cresp = spawn parse_format(new_mimetypes())
echo ^cresp
Passing it as a parameter won't work; this is a simplified example. The actual usage is in a multi-threaded web server, where some helper methods detect the MIME type.
Using a gcsafe block forces it to compile, and it works. But I'm not sure how safe that is; maybe it would crash randomly or stop working with the next Nim version.
I also tried to use locks, which didn't work either.
import mimetypes, threadpool, locks
var mime_lock: Lock
init_lock mime_lock
let mime {.guard: mime_lock.} = new_mimetypes()
proc parse_format(): string {.gcsafe.} =
  with_lock mime_lock:
    mime.get_ext("text/html", "unknown")
var cresp = spawn parse_format()
echo ^cresp
Error: 'parse_format' is not GC-safe as it accesses 'mime' which is a global using GC'ed memory
Hmm, I'm not sure how to use locks then?
A second thought about forcibly using a `gcsafe` block: it should actually be safe to use in this case.
There are going to be no race conditions, as the data is immutable. And this memory can't be freed, as the variable will never be garbage collected, and it lives in the main thread, which is never going to be terminated.
So, it should be safe?
The general rule is to avoid sharing refs across threads. Nim's main GC algorithms (ARC, ORC) do not provide the guarantee that reference-counting is thread-safe.
And more specifically, even protecting access to the root of an object graph through a lock is generally insufficient, because even if your object graph is immutable, the GC can mutate the refcount under the hood in a non-thread-safe way. All access to all objects in the graph would need to be protected through a lock or some other consistency mechanism.
In your case, if new_mimetypes() is an expensive operation, you could store the result in thread-local storage to cache the result locally and avoid sharing across threads.
Something like this:
import mimetypes, threadpool, options
var localMimeTypes {.threadvar.}: Option[MimeDB]
proc getMimeTypes(): MimeDB =
  if localMimeTypes.isNone:
    localMimeTypes = some(new_mimetypes())
  result = localMimeTypes.get
proc parse_format(): string =
  getMimeTypes().get_ext("text/html", "unknown")
var cresp = spawn parse_format()
echo ^cresp
P.S. There's also a way to use {.threadvar.} and keep a separate copy for each thread. But that defeats the whole point of having a multi-threaded server that could optimise memory by sharing some common data between threads.
Yes, you're right that it's sub-optimal, and I don't believe there's a way to do this safely with refs today in the general sense, without having to deal with locking/consistency mechanisms and taking into consideration specific access patterns.
If I understand correctly, this may become possible through the use of view types. https://nim-lang.org/docs/manual_experimental.html#view-types
In the meantime, there are lots of ways to share things across threads (sharedtables, the smartptrs module from fusion, etc.). There's just no great solution for refs, AFAIK.
> there are lots of ways to share things across threads
I was wondering, would it currently be possible in Nim to do something like a database query? Make one thread responsible for storing a large data object. And, while not sharing the data directly, allow other threads to query it (submit a query function and get back a small chunk of non-ref data)?
Something like the pseudocode below:
import mimetypes, threadpool, strformat
# Data thread, storing huge data object -----------------------
proc run_data_thread() {.thread.} =
  # Large data object, available only to this thread
  let mime = new_mimetypes()
  # Listening for query requests from other threads, and responding with only
  # a small portion of the data, the result of the query.
  on_query((query_fn, arg) => reply(query_fn(mime, arg)))
var data_thread: Thread[void]
createThread[void](data_thread, run_data_thread)
# Some other thread -------------------------------------------
proc run_some_thread() {.thread.} =
  # Querying the "Data thread" for some information
  let format = data_thread.query((mime, arg) => mime.get_ext(arg), "text/html")
  echo format
var some_thread: Thread[void]
createThread[void](some_thread, run_some_thread)
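For what it's worth, a rough runnable approximation of the pseudocode above can be built on the built-in `Channel` type. This is only a sketch under some assumptions: it sends the MIME type string as the "query" rather than a function, `queryExt` and the empty-string stop signal are names and conventions made up for the illustration, and a single pair of global channels only supports one querying thread at a time (a real server would probably need a reply channel per caller).

```nim
# Compile with: nim c --threads:on query_thread.nim
import std/mimetypes

var
  requests: Channel[string]  # MIME type to look up; "" asks the data thread to stop
  replies: Channel[string]   # file extension sent back to the caller

proc runDataThread() {.thread.} =
  # The MimeDB is created and used only inside this thread; it is never shared.
  let mime = newMimetypes()
  while true:
    let mimetype = requests.recv()   # blocks until a request arrives
    if mimetype.len == 0:
      break
    replies.send(mime.getExt(mimetype, "unknown"))

proc queryExt(mimetype: string): string =
  # Hypothetical helper: send one request and block until the answer arrives.
  requests.send(mimetype)
  result = replies.recv()

requests.open()
replies.open()

var dataThread: Thread[void]
createThread(dataThread, runDataThread)

let htmlExt = queryExt("text/html")
echo htmlExt

requests.send("")        # stop signal for the data thread
joinThread(dataThread)
requests.close()
replies.close()
```

Only strings cross the thread boundary here, so the data object itself stays private to the data thread.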
I just realised there's also the ptr. Could it be used?
import mimetypes, threadpool, strformat
var mime: ptr MimeDB
mime = create(MimeDB)
mime[] = new_mimetypes()
proc parse_format(): string {.gcsafe.} =
  mime[].get_ext("text/html", "unknown")
var cresp = spawn parse_format()
echo ^cresp
@alexeypetrushin Unfortunately, accessing any graph containing ref objects from multiple threads is generally unsafe. This is true even if the access starts through a pointer; the pointer doesn't make it any safer, even if the graph is immutable. It would work if it were just a plain object structure (no refs), but it isn't foolproof if it involves refs.
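To illustrate the "plain object structure (no refs)" case, here is a minimal sketch. The `Stats` type, the `worker` proc and the counts are all invented for the example; the point is only that the object contains no ref, string or seq fields, lives in shared memory allocated with `createShared`, and is mutated under a lock.

```nim
# Compile with: nim c --threads:on shared_plain.nim
import std/locks

type
  Stats = object   # hypothetical plain object: no ref, string or seq fields
    hits: int

var
  stats: ptr Stats                   # points into the shared heap, not GC'ed memory
  statsLock: Lock
  workers: array[4, Thread[int]]

proc worker(n: int) {.thread.} =
  for _ in 1 .. n:
    withLock statsLock:
      inc stats.hits                 # mutation is guarded by the lock

stats = createShared(Stats)          # zero-initialised shared allocation
initLock(statsLock)
for t in workers.mitems:
  createThread(t, worker, 1000)
joinThreads(workers)

let total = stats.hits               # read after all workers have finished
echo total

deinitLock(statsLock)
deallocShared(stats)
```

Because nothing in `Stats` is reference counted, there is no hidden refcount traffic for the threads to race on; the lock is only needed because the workers mutate the field.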
AFAIK, the use of refs across threads can work [*] only under the following conditions:
[*] I'm using "can work" in a narrow sense here because it's still generally unsafe because maintaining all the conditions above is challenging (understatement). The compiler currently doesn't help provide these guarantees.
The last condition is particularly difficult to enforce because it's easy to create external references into the graph without realizing. ("But Maaaaa, I'm not mutating anything!")
# this "works"
let foo = someGraph.somePath.someRef
# this will get you into trouble
var foo = someGraph.somePath.someRef
# this will also get you in a world of trouble
myLocalObject.fooRef = someGraph.somePath.someRef
Notice how the code above never mutates the graph explicitly! But the graph is still mutated under the hood by the GC to maintain reference counts.
We're "lucky" that some of Nim's popular types (e.g., strings, seqs) have value semantics instead of being refs ... otherwise, we'd see memory corruption errors pop up a lot more frequently.
To give an example closer to home, you might just be accessing a JsonNode (which is a ref, btw) and putting the node into another data structure, e.g. jsonResponse.body = sharedJsonNode looks totally innocuous. Now your code contains a potential race under multi-threading, and may someday lead to memory corruption.
PS: I don't want to spread FUD, so if anybody knows about this better than I do, please correct me. I'm not an authority on this by any means, this is just my understanding based on reading about ARC/ORC, destructors, comments on Github issues, etc.