nimforum mirror - Multi-threading and data sharing

pge [2019-10-10T10:41:38+02:00]

Hi, I discovered Nim two months ago, and it's certainly exciting: the combination of efficiency and readability is extremely promising! But... I translated a (chess engine) program from Java to Nim, hoping to get a better executable in terms of memory consumption and speed.

Up to now, my single-threaded version works correctly and with excellent results. The next step is the multi-threading design... Going through the Nim documentation (threads, spawn, locks, threadpools) leaves me perplexed.

Question: Is it possible to multi-thread the engine with the following requirements?

Data structure

Nodes and Links as structured objects. A Node contains seq[Links] - and other fields. A Link contains 2 Node references - and other fields.

All nodes are stored in a large Table var (~10e6 ... 10e8 entries)

Some threads should create new Nodes and Links, and put the Nodes in the Table (>10e5 per second)

Other specialized threads should update the 'other fields' of the Nodes.

One thread should eliminate unneeded Nodes and Links.

Exploring the Nim threads doc ("Each thread has its own (garbage collected) heap and sharing of memory is restricted to global variables."), I am rather discouraged. Is there a Nim way to achieve these goals?

Thanks for any help, and congratulations on Nim's achievements anyway!

Fil

Araq [2019-10-10T14:26:49+02:00]

One way is to use --gc:boehm and {.gcsafe.}: <block here> to make the compiler shut up. Alternatively, use the new hooks (=sink, =, =destroy) or a library on top of them to manage the memory as you see fit. Unfortunately I'm not aware of such libraries, but 1.0 finally enables them, so I expect them to arrive soon. :-)
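A minimal sketch of the first suggestion, assuming a build along the lines of `nim c --threads:on --gc:boehm example.nim`. The names `nodeMap`, `nodeLock` and `grow` are hypothetical, and the lock is an addition of this sketch: the `{.gcsafe.}:` block only silences the compiler check, it does not make `Table` safe for concurrent writes. Recent Nim versions prefer `{.cast(gcsafe).}:` over this block form.

```nim
import std/[tables, locks]

var
  nodeMap = initTable[int64, int]()  # hypothetical stand-in for the node table
  nodeLock: Lock                     # assumption: guards every access to nodeMap

proc grow(base: int64) {.thread.} =
  # Each worker inserts 1000 distinct keys starting at `base`.
  for i in 0 ..< 1000:
    {.gcsafe.}:                      # tell the compiler we take responsibility
      withLock nodeLock:             # serialize writers; Table is not thread-safe
        nodeMap[base + int64(i)] = 0

proc main() =
  initLock(nodeLock)
  var workers: array[4, Thread[int64]]
  for i in 0 ..< workers.len:
    createThread(workers[i], grow, int64(i) * 1000)
  joinThreads(workers)
  echo nodeMap.len
  deinitLock(nodeLock)

main()
```

With a single lock around the table, every insert is serialized, so this shows the mechanics of sharing rather than a way to get a speed-up.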

shashlick [2019-10-10T14:53:24+02:00]

I put together shared to build shared strings and seqs. It isn't very efficient since there are many copies but is only 1000x slower. Can definitely do with optimizations but like Araq mentioned, --gc:boehm is an easier way to do this.

cumulonimbus [2019-10-10T16:42:23+02:00]

gc:boehm / .gcsafe. would make the compiler shut up about accessing another thread's data, but one would still get the spawn interference warnings if detected, right?

Araq [2019-10-10T17:30:02+02:00]

I think so, yes.

pge [2019-10-11T18:28:50+02:00]

Thank you for the valuable suggestions. I will explore those solutions and let you know how it goes. However, for the friendliness of Nim in the future, I suggest having something more accessible to ordinary programmers, like SharedTable, SharedSet ... (intended to be accessible from parallel threads).

mratsim [2019-10-12T01:00:55+02:00]

Don't update the same data structure concurrently: that will create a huge contention bottleneck due to the heavy synchronization, and your program might become slower than single-threaded due to cache thrashing.

Assuming your algorithm is tree-like (I'm familiar with go bot but not chess bot) ideally you have a tree-datastructure and:

Either you launch a thread on separate branches and keep the synchronization restricted to 2 threads (sub-branches) a.k.a. branch-parallelism

Or you duplicate your data structure per-thread to avoid paying synchronisation cost a.k.a. tree-parallelism
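As a rough illustration of the second option (one private structure per thread, no locking), here is a minimal sketch assuming a build with `--threads:on`. The names `explore`, `counts` and `runAll` are hypothetical, and the per-thread table is a toy stand-in for a real search tree.

```nim
import std/tables

const numWorkers = 4

# Plain ints contain no GC'ed memory, so this global can be shared
# between threads without tripping the GC-safety check.
var counts: array[numWorkers, int]

proc explore(id: int) {.thread.} =
  # The table lives on this thread's own heap; no other thread touches it.
  var local = initTable[int64, int]()
  for i in 0 ..< 1000:
    local[int64(id) * 1000 + int64(i)] = i
  counts[id] = local.len             # each worker writes only its own slot

proc runAll(): int =
  var workers: array[numWorkers, Thread[int]]
  for i in 0 ..< numWorkers:
    createThread(workers[i], explore, i)
  joinThreads(workers)               # results are read only after all workers finish
  for c in counts:
    result += c

echo runAll()
```

The only cross-thread traffic is one integer per worker, written once at the end, which is the point of duplicating the structure.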

If you want to know how big the synchronization cost can be, my Fibonacci benchmark of the GCC implementation of OpenMP versus LLVM's shows a factor 100..000x ( https://github.com/mratsim/weave/tree/master/benchmarks/fibonacci ).

Today you can use my experimental weave code, which gives you async-like semantics and very efficient multithreading. For usage see:

https://github.com/mratsim/weave/blob/master/e04_channel_based_work_stealing/async_internal.nim#L138-L180

https://github.com/mratsim/weave/blob/master/e04_channel_based_work_stealing/async_for_internal.nim#L129-L151

It's experimental code; nonetheless it works and is backed by 3 years of PhD research. I suggest you add the library as a git submodule.

In the future I hope to make it a proper high-level library, see Project Picasso RFC.

Alternatively you can use Nim threadpools but they suffer from the same issue as GCC OpenMP:

it uses a single global queue that enqueues/dequeues all tasks

consequently it cannot do load balancing (work-stealing)

it chokes if tasks are small (say 1ms/task) and the queue datastructure becomes the contention point.

pge [2019-10-13T10:19:23+02:00]

Given:

The goals defined in the first post of this topic

the answers received (thank you !)

Nim limitations "Each thread has its own (garbage collected) heap and sharing of memory is restricted to global variables."

My struggling but unsuccessful attempts to have multiple threads access a common global data structure. For example,
```
proc growFunc0(ch: int64) {.gcsafe.} = echo "started " & $nodeMap.len
```
leads to "'growFunc0' is not GC-safe as it accesses 'nodeMap' which is a global using GC'ed memory" with any compiler option.

The limited Nim documentation on this topic

The far-from-straightforward solutions suggested

... it seems that the reasonable action would be to switch back from Nim to C++ (and possibly come back to Nim when multi-threading evolves)

... unless someone has a working example of similar goals achieved with reasonably readable code.

Stefan_Salewski [2019-10-13T11:07:55+02:00]

... it seems that the reasonable action would be to switch back from Nim to C++

I think in your initial post you were talking about Java?

Currently multi-threading and parallel processing may not be optimal in Nim, but I think it will improve soon, maybe with newruntime or a new GC with better threading support.

But my feeling is that, in general, rewriting Java or C++ code in Nim just to improve performance may not make much sense. If you can live with C++ and have already learned it well, or can tolerate the high memory consumption and startup time of Java, I don't see a very pressing reason to switch to Nim.

Of course this is only my personal view.

dom96 [2019-10-13T14:57:58+02:00]

I don't see a very pressing reason to switch to Nim.

There are many reasons:

Nim is safer than C++

Java is a horrible language to write software in

Also, OP might just wish to learn Nim. That’s a very good reason to rewrite an existing project, in fact, it might be the best way to learn.

pge [2019-10-13T22:43:27+02:00]

I am extremely interested in Nim for various reasons, including performance and safety. But for conceptualization (fast translation of concepts into software), none of the 15 languages I have had to use can compete with Java. Stefan_Salewski, I appreciate your fair opinion. I'll be back to Nim some day, hoping for true multi-threading support.

mratsim [2019-10-22T09:10:10+02:00]

If there is one thing Java is good at, it's multithreading with shared references; it's basically the only language with a GC and multithreading support.

However, for a high-performance, compute-heavy engine, it will be much slower and more memory hungry than both Nim and C++ due to pointer indirections/locality of reference issues.

From a performance perspective, a single data structure hammered by multiple threads is a bad idea, whichever language you choose. That is an architecture issue, not a programming language issue. Your engine will not scale to 16 cores whether you use C++, Java or Nim.

Araq [2019-10-22T13:09:17+02:00]

My first reply suggested to embrace --gc:boehm. Why doesn't it work for you?

Mirror of forum.nim-lang.org

5321 :: Multi-threading and data sharing