nimforum mirror - Malebolgia holds its own vs goroutines (on max cpu cores)

lucian [2026-02-12T06:27:34+01:00]

Pardon the clickbait, but I'm glad to see the "concurrent" word counting in Nim https://github.com/LucaWolf/wordFreq-nim/tree/concurrent being slightly faster than its Go counterpart https://github.com/LucaWolf/wordFreq/tree/concurrent, for the imposed core count. Still, I would not go about abusing them threads like Go allows. Interestingly, slower than the single threaded version.

P.S. I used this as an exercise to learn about Malebolgia and threading channels, hence the opportunity for comparison -- I'm not chasing speed here.

Araq [2026-02-12T06:37:47+01:00]

> Interestingly, slower than the single threaded version.

Alright, but that makes it a toy then. ;-) Process the files via memfiles and split up the work into same-sized chunks. Ignore lines: a newline is then just another way to write a space.
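
For readers who want the chunking idea spelled out, here is a minimal sketch in Go (the language the thread compares against), not code from either repository. It assumes the whole file is already available as one byte buffer, which is what a memory-mapped file would give you; the names isSpace, chunkBounds, countChunk and CountWords are made up for illustration. Each cut point is nudged forward to the next separator so that no word is split between two workers, and newlines are treated like any other whitespace.

```go
package main

import (
	"fmt"
	"sync"
)

// isSpace treats a newline like any other separator, so lines are never parsed.
func isSpace(b byte) bool {
	return b == ' ' || b == '\n' || b == '\t' || b == '\r'
}

// chunkBounds cuts data into n near-equal pieces, then moves every cut
// forward to the next separator so that no word straddles two chunks.
func chunkBounds(data []byte, n int) []int {
	bounds := []int{0}
	for i := 1; i < n; i++ {
		cut := i * len(data) / n
		for cut < len(data) && !isSpace(data[cut]) {
			cut++
		}
		bounds = append(bounds, cut)
	}
	return append(bounds, len(data))
}

// countChunk counts whitespace-separated words in a single chunk.
func countChunk(chunk []byte) map[string]int {
	counts := make(map[string]int)
	start := -1 // index where the current word began, -1 if between words
	for i, b := range chunk {
		if isSpace(b) {
			if start >= 0 {
				counts[string(chunk[start:i])]++
				start = -1
			}
		} else if start < 0 {
			start = i
		}
	}
	if start >= 0 {
		counts[string(chunk[start:])]++
	}
	return counts
}

// CountWords counts each chunk in its own goroutine and merges the results.
func CountWords(data []byte, workers int) map[string]int {
	bounds := chunkBounds(data, workers)
	partial := make([]map[string]int, len(bounds)-1)
	var wg sync.WaitGroup
	for i := range partial {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			partial[i] = countChunk(data[bounds[i]:bounds[i+1]])
		}(i)
	}
	wg.Wait()
	total := make(map[string]int)
	for _, p := range partial {
		for w, c := range p {
			total[w] += c
		}
	}
	return total
}

func main() {
	text := []byte("the quick brown fox\njumps over the lazy dog\nthe end")
	counts := CountWords(text, 4)
	fmt.Println(counts["the"], counts["fox"], len(counts))
}
```

Each worker fills a private map and the merge happens once at the end, so no locking is needed while counting. Whether this beats a single-threaded pass depends on file size and on how expensive the merge is.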

cblake [2026-02-12T11:43:59+01:00]

Araq's ideas are almost exactly https://github.com/c-blake/adix/blob/master/tests/wf.nim, if concrete code helps @lucian or anyone else (under 120 lines, even with 2 definitions of "word"). It also scales up with core count pretty perfectly, though I haven't tested on any crazy 192-core monster.

mratsim [2026-02-12T14:17:17+01:00]

That's not an interesting benchmark for threadpools.

I have a compilation at https://github.com/mratsim/weave/tree/master/benchmarks. Unfortunately it is missing the most interesting and most stressful one, UTS (Unbalanced Tree Search): https://github.com/bsc-pm/bots/tree/master/omp-tasks/uts

Summary

Name	Parallelism	Notable for stressing	Origin
Black & Scholes Option Pricing (Finance)	Data parallelism		PARSEC (Princeton Application Repository for Shared-Memory Computers)
BPC (Bouncing producer-Consumer)	Task Parallelism	Load Balancing (Extreme)	Dinan et al / Tasking 2.0 (A. Prell Thesis)
DFS (Depth-First Search)	Task Parallelism	Scheduler Overhead	Staccato
Fibonacci	Task Parallelism	Scheduler Overhead (Extreme)	Cilk
Heat diffusion (Stencil / Jacobi-iteration - Cache-Oblivious)	Task Parallelism		Cilk
Matrix Multiplication (Cache-Oblivious)	Task Parallelism		Cilk
Matrix Multiplication (GEMM, BLAS)	Nested Data Parallelism	Compute, Memory, SIMD Vectorization, reference bench for super-computers	BLAS, Linpack
Matrix Transposition	Nested Data Parallelism	Nested loop	Laser
Nqueens	Task Parallelism	Speculative/Conditional parallelism	Cilk
SPC (Single Task Producer)	Task Parallelism	Load Balancing	Tasking 2.0 (A. Prell Thesis)
Histogram	Parallel Map-Reduce	Contention	Stack Overflow
LogSumExp (needed for Softmax cross-entropy in machine learning)	Parallel Map-Reduce	Huge matrices and expensive functions	Machine Learning
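
To make the "Scheduler Overhead (Extreme)" entry for Fibonacci concrete, here is a rough sketch in Go, with goroutines standing in for a threadpool's spawn. It is not the benchmark code from the Weave repository, and the function names fibSerial and fibTasks are invented for illustration. Each task performs a single addition, so nearly all of the run time goes into creating, scheduling and joining tasks rather than into useful work.

```go
package main

import (
	"fmt"
	"sync"
)

// fibSerial is the plain recursive baseline with no tasks at all.
func fibSerial(n int) int {
	if n < 2 {
		return n
	}
	return fibSerial(n-1) + fibSerial(n-2)
}

// fibTasks spawns one task for every fib(n-1) call. The useful work per
// task is a single addition, so scheduling cost dominates the run time.
func fibTasks(n int) int {
	if n < 2 {
		return n
	}
	var x int
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		x = fibTasks(n - 1)
	}()
	y := fibTasks(n - 2) // the current task continues with the other branch
	wg.Wait()            // join before reading x
	return x + y
}

func main() {
	fmt.Println(fibSerial(20), fibTasks(20))
}
```

Timing the two functions against each other for growing n gives a feel for how much a runtime pays per spawned task, which is the property this benchmark is meant to expose.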

Mirror of forum.nim-lang.org

13714 :: Malebolgia holds its own vs goroutines (on max cpu cores)