nimforum mirror - Problem using "spawn"

jzakiya (original) [2017-10-19T18:34:37+02:00] view original

OK, I've racked my brain enough and need help. Using 0.17.2 on Linux, I have this proc below.


proc segcount(row, Kn: int): int =
  var cnt = 0
  for k in 0..<Kn: cnt += seg[row + k].int
  result = cnt

It works when I use it as below.


proc segsieve(Kn: int) =             # for Kn resgroups in segment
  ...
  ...
  
  var cnt = 0                        # count for the segment primes '1' bytes
  for i in 0..<rescnt:               # count Kn resgroups|bytes each restrack
      cnt += segcount(i*KB, Kn)
  primecnt += cnt.uint               # update primecnt for the segment

But when I try to parallelize this using spawn I get this error.


proc segsieve(Kn: int) =             # for Kn resgroups in segment
  ...
  ...
  
  var cnt = 0                        # count for the segment primes '1' bytes
  for i in 0..<rescnt:               # count Kn resgroups|bytes each restrack
      cnt += spawn segcount(i*KB, Kn)  # <-- error points to start of '+='
  sync()
  primecnt += cnt.uint               # update primecnt for the segment

-------------------------------------------------------------
[jzakiya@localhost nim]$ nim c --cc:gcc --threads:on --d:release ssozp5x1c1par.nim
Hint: used config file '/home/jzakiya/nim-0.17.2/config/nim.cfg' [Conf]
Hint: system [Processing]
Hint: ssozp5x1c1par [Processing]
Hint: math [Processing]
Hint: strutils [Processing]
Hint: parseutils [Processing]
Hint: algorithm [Processing]
Hint: typetraits [Processing]
Hint: threadpool [Processing]
Hint: cpuinfo [Processing]
Hint: os [Processing]
Hint: times [Processing]
Hint: posix [Processing]
Hint: ospaths [Processing]
Hint: linux [Processing]
Hint: cpuload [Processing]
ssozp5x1c1par.nim(154, 16) Error: type mismatch: got (uint64, FlowVar[system.int])
but expected one of:
proc `+=`[T: SomeOrdinal | uint | uint64](x: var T; y: T)
proc `+=`[T: float | float32 | float64](x: var T; y: T)

[jzakiya@localhost nim]$

Stefan_Salewski (original) [2017-10-19T19:42:16+02:00] view original

Do you use the parallel statement at all as described in the manual? Or only a plain spawn? See

https://nim-lang.org/docs/manual.html#parallel-spawn

You may also need a FlowVar.

I did test parallel once for calculation of a convex hull, see

https://forum.nim-lang.org/t/483/2

Was not really fast at that time, but I think that is fixed already.

That example is even included in

Nim/tests/parallel/tconvexhull.nim

jzakiya (original) [2017-10-19T20:14:26+02:00] view original

I've done it both with and without parallel, as shown below, but get the same compiler output.


  parallel:
    var cnt = 0                      # count for the segment primes '1' bytes
    for i in 0..<rescnt:             # count Kn resgroups|bytes each restrack
      cnt += spawn segcount(i*KB, Kn)
  sync()
  primecnt += cnt.uint               # update primecnt for the segment

I'm also using spawn earlier in segsieve which works with no problems, and actually does operate in parallel, which I can verify by looking at the program's operation using htop.

Below is the total segsieve code.


# This routine performs the prime sieve for a restrack of Kn resgroups|bytes.
# 'nextp' resgroup vals for restrack 'r' mark prime multiples on it in 'seg'
# and are updated for each prime for the next segment.
proc residue_sieve(row: int, seg_rti: int, Kn: int)=
  for j, prime in primes:            # for each prime r1..sqrt(N)
    if nextp[row+j] < Kn.uint:       # if 1st mult resgroup is within 'seg'
      var k = nextp[row+j].int       # starting from this resgroup in 'seg'
      while k < Kn:                  # for each primenth byte to end of 'seg'
        seg[seg_rti + k] = 0         # mark byte in segment as nonprime
        k += prime                   # compute next prime multiple resgroup
      nextp[row+j] = uint(k - Kn)    # save 1st resgroup in next eligible seg
    else: nextp[row+j] -= Kn.uint    # do if 1st mult resgroup not within seg

# Count the primes on each row of Kn resgroups|bytes in 'seg' memory.
proc segcount(row, Kn: int): int =          # for this row in 'seg' of Kn bytes
  var cnt = 0
  for k in 0..<Kn: cnt += seg[row + k].int  # add primes '1' (and nonprimes '0')
  result = cnt                              # return count of primes for 'row'

# This routine performs the total prime sieve for Kn resgroups|bytes by
# processing each residue track individually (in parallel). Then the
# segment primes count is computed and added to global var 'primecnt'.
proc segsieve(Kn: int) =             # for Kn resgroups in segment
  for b in 0..<seg.len: seg[b] = 1   # initialize seg bytes to all prime '1'
  parallel:
    for r in 0..<rescnt:             # for each residue track number 'r'
      let row  = r * pcnt            # set the 'nextp' table row address
      let seg_rti = r * KB           # set the segment mem row address
      spawn residue_sieve(row, seg_rti, Kn) # mark the prime multiples along it
  sync()
  #parallel:
  var cnt = 0                        # count for the nonprimes, the '1' bytes
  for i in 0..<rescnt:               # count Kn resgroups along each restrack
      #cnt += segcount(i*KB, Kn)
      cnt += spawn segcount(i*KB, Kn)
  sync()
  primecnt += cnt.uint

I'm trying to get segcount to operate in parallel too, which should make the program even faster. When I get this working I'll write this all up and update my paper to show the new parallel algorithm architecture and the Nim implementation.

Stefan_Salewski (original) [2017-10-19T20:42:45+02:00] view original

I think we will not be able to compile your code, as it does not look like a complete program.

Do you really expect that

cnt += spawn segcount(i*KB, Kn)

may work? I have no idea how it could.

Maybe what you intend is something like

parallel:
    var cnt: array[rescnt, int]
    for i in 0..<rescnt:
        cnt[i] = spawn segcount(i*KB, Kn)
    sync()
    for i in 0..<rescnt:
      primecnt += cnt[i].uint

Such a shape would make some sense to me, but I have not used Nim's parallel in the last two years, so I would have to consult the manual.

jzakiya (original) [2017-10-20T09:09:29+02:00] view original

After reading the Nim in Action book I got it to compile by placing a ^ before spawn, but it makes the program slower. The problem has to do with segcount returning a FlowVar[T] mismatch. And when I use parallel: it won't compile, and shows even more errors. Doing more research.


  var cnt = 0                        # count for the primes, the '1' bytes
  for i in 0..<rescnt:               # count Kn resgroups|bytes each restrack
      cnt += ^segcount(i*KB, Kn)
  primecnt += cnt.uint               # update primecnt for the segment

Stefan_Salewski (original) [2017-10-20T09:31:18+02:00] view original

Now in your code there is no spawn at all!

For parallel processing, you have to ensure that there are no conflicts when parallel tasks access your data; otherwise the compiler may copy the data beforehand, which can make it slow. Good use of the CPU cache is also important: many parallel processes will give no speed increase when data is always fetched from slow RAM instead of cache.

jlp765 (original) [2017-10-20T11:49:39+02:00] view original

Try


 for i in 0..rescnt-1:
        cnt[i] = spawn segcount(i*KB, Kn)

I think the issue is with ..<

jzakiya (original) [2017-10-20T16:18:33+02:00] view original

In the previous snippet I forgot the spawn. The code below compiles, but is slower.


  var cnt = 0                        # count for the primes, the '1' bytes
  for i in 0..<rescnt:               # count Kn resgroups|bytes each restrack
      cnt += ^spawn segcount(i*KB, Kn)  # <-- the '^' gets it to compile, but threads wait to finish
  sync()
  primecnt += cnt.uint               # update primecnt for the segment

Changing 0..<rescnt to 0..rescnt-1 gives the same error, shown below. Even when I use a while loop it throws the same error.


ssozp5x1c1par.nim(156, 11) Error: type mismatch: got (uint, FlowVar[system.uint])
but expected one of:
proc `+=`[T: SomeOrdinal | uint | uint64](x: var T; y: T)
proc `+=`[T: float | float32 | float64](x: var T; y: T)

The problem seems to be that when segcount returns its output there is a type mismatch with cnt (?).

jzakiya (original) [2017-10-20T16:29:21+02:00] view original

When I use parallel I get this compiler error:


  parallel:
    var cnt = 0'u                      # count for the nonprimes, the '1' bytes
    for i in 0..rescnt-1:              # count Kn resgroups along each restrack
      cnt += spawn segcount(i*KB, Kn)  # <-- points to start of '('
  sync()
  primecnt += cnt
-------------------------------------------------------------
ssozp5x1c1par.nim(155, 28) Error: 'spawn' must not be discarded

Stefan_Salewski (original) [2017-10-20T16:38:44+02:00] view original

Why do you refuse to try


cnt[i]

as jlp765 suggests?

Do you have an idea how plain


cnt +=

should work? All the parallel calculated results should accumulate in this single variable. Then you may need something to control the access to it.

jzakiya (original) [2017-10-20T18:11:15+02:00] view original

The error messages keep saying the issue is a mismatch with FlowVar[T]. In Chapter 6 of Nim in Action here is what it says they are.

FlowVar[T] can be thought of as a container similar to the Future[T] type, which you used in chapter 3. At first, the container has nothing inside it. When the spawned procedure is executed in a separate thread, it returns a value sometime in the future. When that happens, the returned value is put into the FlowVar container.
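To make the quoted description concrete, here is a minimal sketch (not from the thread) of the two types involved. `square` is a hypothetical worker invented for illustration; the sketch assumes the standard `threadpool` module and compilation with `--threads:on`. `spawn` hands back a `FlowVar[int]` right away, and `^` blocks until the value has arrived, which would explain why `cnt += spawn ...` fails to type-check while `cnt += ^spawn ...` compiles.

```nim
# Compile with: nim c --threads:on flowvar_demo.nim
import threadpool

proc square(x: int): int =
  # Hypothetical worker, standing in for any proc that returns a value.
  x * x

when isMainModule:
  let fv: FlowVar[int] = spawn square(7)  # returns at once; the result is not ready yet
  let value: int = ^fv                    # blocks until the worker has stored its result
  echo value                              # prints 49
```

Because `^` waits for that one task, applying it directly to each `spawn` inside a loop finishes every task before the next one starts.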

Here is updated segcount


proc segcount(row, Kn: int): uint =
  var cnt = 0'u
  for k in 0..<Kn: cnt += seg[row + k].uint
  result = cnt

So segcount is returning a uint value. This works perfectly well as below with no type mismatch.


  var cnt = 0'u
  for i in 0..<rescnt:
      cnt += segcount(i*KB, Kn)
  primecnt += cnt

But using spawn causes a type mismatch: cnt += spawn segcount(i*KB, Kn)

No matter what kind of container you use for cnt, a type mismatch error appears.

jlp765 (original) [2017-10-20T23:51:58+02:00] view original

try


var cnt: array[rescnt, int]
parallel:
    for i in 0..rescnt-1:
        cnt[i] = spawn segcount(i*KB, Kn)
sync()
for i in 0..rescnt-1:
   primecnt += cnt[i].uint

jzakiya (original) [2017-10-21T03:54:36+02:00] view original

OK, I had to clean it up a little to make it work, but here is the code that gets it to compile.


  var cnt: array[rescnt, uint]
  parallel:
    for i in 0..rescnt-1:
      cnt[i] = spawn segcount(i*KB, Kn)
  sync()
  for i in 0..rescnt-1:
    primecnt += cnt[i].uint

So was the issue that the single value of cnt was getting clobbered by each thread's return value, causing the type mismatch?

Thanks for getting it to compile, but if you can explain why this works I'd appreciate it even more.
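One way to read the fix, offered as a sketch under assumptions rather than a definitive answer: each task needs its own result slot, and the summing should happen only after all tasks have been started. The same shape can be written with explicit FlowVars instead of `parallel`. Here `rowCount` and `totalCount` are hypothetical stand-ins (not the thread's `segcount`), and `rowCount` deliberately touches no global data so the example stays self-contained.

```nim
# Compile with: nim c --threads:on slots_demo.nim
import threadpool

proc rowCount(row, n: int): int =
  # Hypothetical stand-in for segcount: counts the odd numbers
  # in the range row*n ..< row*n + n.
  for k in 0 ..< n:
    if (row * n + k) mod 2 == 1:
      inc result

proc totalCount(rows, n: int): int =
  # One FlowVar slot per task, so no two tasks share a destination.
  var pending = newSeq[FlowVar[int]](rows)
  for i in 0 ..< rows:
    pending[i] = spawn rowCount(i, n)  # start every task; nothing blocks here
  for fv in pending:
    result += ^fv                      # read each result after all tasks were started

when isMainModule:
  echo totalCount(4, 10)               # 4 rows with 5 odd numbers each: prints 20
```

The earlier `cnt += ^spawn ...` loop compiled but waited on every task before starting the next, so the work ran one task at a time; separating the spawn loop from the read loop lets the tasks overlap.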

jzakiya (original) [2017-10-21T06:01:30+02:00] view original

The cnt += operation in parallel is ripe for a reduction-like option in Nim, similar to the one in OpenMP.

https://stackoverflow.com/questions/13290245/reduction-with-openmp#13290673
