nimforum mirror - setupForeignThreadGc() equivalent for Boehm GC?

drkameleon [2019-11-25T20:17:28+01:00]

I finally found out what was wrong with my code. The thing is, it works far faster with Boehm (compared to the default GC, I mean), but when trying the command with --gc:boehm it complains that the command does not exist.

Any ideas?

drkameleon [2019-11-26T13:25:35+01:00]

So... any idea?

Stefan_Salewski [2019-11-26T17:16:07+01:00]

> So... any idea?

Your posts are a bit hard to understand, at least for me, and I have no idea what project you are working on or what you intend. What do you mean by

> when trying the command with --gc:boehm it complains

You can compile your program with something like nim c --gc:boehm myprog.nim to use the Boehm GC, but I think you know that and have already done it successfully, because you wrote that it works faster with Boehm?

As for setupForeignThreadGc(), it seems to use the GC your program is compiled with; I don't think we can use multiple GCs in a single program. See

https://forum.nim-lang.org/t/3965

drkameleon [2019-11-26T17:56:15+01:00]

It's an interpreter: https://github.com/arturo-lang/arturo

Basically, so far I have had no GC whatsoever, and I'm trying to find out how to make the whole project work WITH a GC (and then how to tune it).

Araq [2019-11-26T21:58:31+01:00]

This code is one reason why every GC hates your code:

ValueRef {.union.} = ref object
        s: string
        a: Array
        d: Context
        f: Function
        when not defined(mini):
            bi: Int

.union is for C interop only; you put a Nim string inside, kaboom. And that's probably just the tip of the iceberg: you also use .computedGoto where it is currently simply ignored, and that's even a good thing, as you misapply computedGoto...
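
A minimal sketch of the usual Nim alternative to a union, assuming the goal is one value type that can hold several kinds of payload: an object variant, where a `kind` field tells the compiler and the GC which branch is live. The type and field names below are hypothetical stand-ins, not the ones from the Arturo sources.

```nim
# Hypothetical, simplified stand-in for an interpreter value type.
type
  ValueKind = enum
    vkString, vkInt, vkArray

  Value = ref object
    # The discriminator tells the GC which field is active, so a Nim
    # string or seq stored in a branch is traced correctly.
    case kind: ValueKind
    of vkString: s: string
    of vkInt: i: int
    of vkArray: a: seq[Value]

proc describe(v: Value): string =
  ## Returns a short description based on the active branch.
  case v.kind
  of vkString: "string: " & v.s
  of vkInt: "int: " & $v.i
  of vkArray: "array of " & $v.a.len & " values"

when isMainModule:
  let v = Value(kind: vkArray,
                a: @[Value(kind: vkInt, i: 42), Value(kind: vkString, s: "hi")])
  echo describe(v)
  echo describe(v.a[0])
```

The minimal test program later in this thread already uses the same construct for its Param type.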

drkameleon [2019-11-27T09:43:28+01:00]

Well, I guess it is only the tip of the iceberg. However, I have even set up super-minimal test environments and I'm still struggling with the GC. Here's an example:

import algorithm, system/ansi_c, base64, bitops, hashes, httpClient, json, macros, math, md5, oids, os
import osproc, parsecsv, parseutils, random, re, segfaults, sequtils, sets, std/editdistance
import std/sha1, streams, strformat, strutils, sugar, unicode, tables, terminal
import times, uri

type
    #[----------------------------------------
        C interface
      ----------------------------------------]#
    
    yy_buffer_state {.importc.} = ref object
        yy_input_file       : File
        yy_ch_buf           : cstring
        yy_buf_pos          : cstring
        yy_buf_size         : clong
        yy_n_chars          : cint
        yy_is_our_buffer    : cint
        yy_is_interactive   : cint
        yy_at_bol           : cint
        yy_fill_buffer      : cint
        yy_buffer_status    : cint

# Parser C Interface

proc yyparse(): cint {.importc.}

proc yy_scan_string(str: cstring): yy_buffer_state {.importc.}
proc yy_switch_to_buffer(buff: yy_buffer_state) {.importc.}
proc yy_delete_buffer(buff: yy_buffer_state) {.importc.}

var yyfilename {.importc.}: cstring
var yyin {.importc.}: File
var yylineno {.importc.}: cint

type
    ParamKind = enum
        RegParam, NumParam
    Param = object
        case kind: ParamKind:
            of RegParam: reg: int
            of NumParam: num: int
    
    Statement = ref object
        cmd: int
        params: seq[Param]
    
    StatementList = ref object
        list: seq[Statement]

var
    MainProgram {.exportc.} : StatementList
    A,B,C,D,E,F,G,H: int
    Regs : seq[int] = newSeq[int](8)
    Stack : seq[int]

template benchmark*(benchmarkName: string, code: untyped) =
    block:
        let t0 = epochTime()
        code
        let elapsed = epochTime() - t0
        let elapsedStr = elapsed.formatFloat(format = ffDecimal, precision = 3)
        echo "CPU Time [", benchmarkName, "] ", elapsedStr, "s"

proc newStm(cmd: int): Statement {.exportc.} =
    Statement(cmd: cmd, params: @[])

proc newStmReg(cmd: int, r0: int): Statement {.exportc.} =
    Statement(cmd: cmd, params: @[Param(kind:RegParam, reg: r0)])

proc newStmNum(cmd: int, n0: int): Statement {.exportc.} =
    Statement(cmd: cmd, params: @[Param(kind:NumParam, num: n0)])

proc newStmRegReg(cmd: int, r0: int, r1: int): Statement {.exportc.} =
    Statement(cmd: cmd, params: @[Param(kind:RegParam, reg: r0),Param(kind:RegParam, reg: r1)])

proc newStmRegNum(cmd: int, r0: int, n1: int): Statement {.exportc.} =
    Statement(cmd: cmd, params: @[Param(kind:RegParam, reg: r0),Param(kind:NumParam, num: n1)])

proc newStmRegRegNum(cmd: int, r0: int, r1: int, n2: int): Statement {.exportc.} =
    Statement(cmd: cmd, params: @[Param(kind:RegParam, reg: r0),Param(kind:RegParam, reg: r1),Param(kind:NumParam, num: n2)])

proc newStmRegNumNum(cmd: int, r0: int, n1: int, n2: int): Statement {.exportc.} =
    Statement(cmd: cmd, params: @[Param(kind:RegParam, reg: r0),Param(kind:NumParam, num: n1),Param(kind:NumParam, num: n2)])

proc newStmListWithStm(stm: Statement): StatementList {.exportc.} =
    StatementList(list: @[stm])

proc addStmToList(lst: StatementList, stm: Statement) {.exportc.} =
    GC_ref(lst)
    lst.list.add(stm)

proc setStatements(stms: StatementList) {.exportc.} =
    MainProgram = stms

template R0():untyped {.dirty.} =
    Regs[stm.params[0].reg]

template R1():untyped {.dirty.} =
    Regs[stm.params[1].reg]

template R2():untyped {.dirty.} =
    Regs[stm.params[2].reg]

template N0():untyped {.dirty.} =
    stm.params[0].num

template N1():untyped {.dirty.} =
    stm.params[1].num

template N2():untyped {.dirty.} =
    stm.params[2].num

template IS_REG(x: int):bool {.dirty.} =
    stm.params[x].kind == RegParam

template IS_NUM(x: int):bool {.dirty.} =
    stm.params[x].kind == NumParam

when isMainModule:
    let scriptPath = commandLineParams()[0]
    
    yylineno = 0
    yyfilename = scriptPath
    
    #Reg = initTable[string,int]()
    
    discard open(yyin, scriptPath)
    MainProgram = StatementList(list: @[])
    #setupForeignThreadGc()
    benchmark "parse:":
        discard yyparse()
        #tearDownForeignThreadGc()
    
    benchmark "execute:":
        var line: int = 0
        
        while line<MainProgram.list.len:
            var stm = MainProgram.list[line]
            {.computedGoTo.}
            case stm.cmd
                of 0:
                    if IS_REG(1): R0 = R1
                    else: R0 = N1
                of 1: inc(R0)
                of 2:
                    if IS_REG(1) and IS_REG(2): R0 = R1 + R2
                    else:
                        if IS_REG(1): R0 = R1 + N2
                        else:
                            if IS_REG(2): R0 = N1 + R2
                            else: R0 = N1 + N2
                of 3:
                    if IS_REG(1) and IS_REG(2): R0 = R1 - R2
                    else:
                        if IS_REG(1): R0 = R1 - N2
                        else:
                            if IS_REG(2): R0 = N1 - R2
                            else: R0 = N1 - N2
                of 4:
                    if IS_REG(1) and IS_REG(2): R0 = R1 * R2
                    else:
                        if IS_REG(1): R0 = R1 * N2
                        else:
                            if IS_REG(2): R0 = N1 * R2
                            else: R0 = N1 * N2
                of 5:
                    if IS_REG(1) and IS_REG(2): R0 = R1 div R2
                    else:
                        if IS_REG(1): R0 = R1 div N2
                        else:
                            if IS_REG(2): R0 = N1 div R2
                            else: R0 = N1 div N2
                of 6:
                    if IS_REG(1) and IS_REG(2): R0 = R1 mod R2
                    else:
                        if IS_REG(1): R0 = R1 mod N2
                        else:
                            if IS_REG(2): R0 = N1 mod R2
                            else: R0 = N1 mod N2
                of 7:
                    if IS_REG(0):
                        echo R0
                    else:
                        echo N0
                of 8:
                        if IS_REG(1):
                            if R0==R1: line = N2-1; continue
                        else:
                            if R0==N1: line = N2-1; continue
                of 9:
                        if IS_REG(1):
                            if R0!=R1: line = N2-1; continue
                        else:
                            if R0!=N1: line = N2-1; continue
                of 10:
                        if IS_REG(1):
                            if R0>R1: line = N2-1; continue
                        else:
                            if R0>N1: line = N2-1; continue
                of 11:
                        if IS_REG(1):
                            if R0<R1: line = N2-1; continue
                        else:
                            if R0<N1: line = N2-1; continue
                of 12:
                        if IS_REG(1):
                            if R0>=R1: line = N2-1; continue
                        else:
                            if R0>=N1: line = N2-1; continue
                of 13:
                        if IS_REG(1):
                            if R0<=R1: line = N2-1; continue
                        else:
                            if R0<=N1: line = N2-1; continue
                of 14:
                    line = N0-1
                    continue
                of 15:
                        Stack.add(R0)
                of 16:
                        R0 = Stack.pop()
                else: discard
            inc(line)
        echo "StatementCount: ", MainProgram.list.len

The only tricky part about the above is that it interacts with Flex/Bison. When the source code was in D, all I had to do was the equivalent of GC_ref for the 2 objects I'm GC_ref-ing here.

Now, in Nim, the example works fine with the default GC (although I fail to see how it frees any memory, to be honest), but when switching to Boehm with a large sample input (900k+ lines), 90% of it seems to have disappeared, although at times it works.

Araq [2019-11-27T12:37:10+01:00]

Now we are getting there:

type
    #[----------------------------------------
        C interface
      ----------------------------------------]#
    
    yy_buffer_state {.importc.} = ref object
        yy_input_file       : File
        yy_ch_buf           : cstring
        yy_buf_pos          : cstring
        yy_buf_size         : clong
        yy_n_chars          : cint
        yy_is_our_buffer    : cint
        yy_is_interactive   : cint
        yy_at_bol           : cint
        yy_fill_buffer      : cint
        yy_buffer_status    : cint

# Parser C Interface

proc yyparse(): cint {.importc.}

proc yy_scan_string(str: cstring): yy_buffer_state {.importc.}

is wrong too, as yy_buffer_state is not a ref, it's a ptr!
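
A hedged sketch of what the ptr-based declarations might look like, assuming the buffer is treated as an opaque handle whose memory belongs to the Flex-generated C code. The Nim-side names YyBufferStateObj, YyBufferState and scanText are hypothetical; the imported symbols are the ones already declared in the code above, and they only link when the generated lexer is part of the build.

```nim
# Sketch only: the importc procs below resolve against the Flex-generated
# lexer, so calling them requires that lexer to be linked in.
type
  YyBufferStateObj = object                 # fields intentionally omitted (opaque)
  YyBufferState = ptr YyBufferStateObj      # untraced pointer: the GC never touches it

proc yy_scan_string(str: cstring): YyBufferState {.importc.}
proc yy_delete_buffer(buff: YyBufferState) {.importc.}

proc scanText(text: string) =
  ## Hypothetical helper: hands `text` to the lexer and frees the buffer
  ## explicitly, since the memory is managed by C and not by Nim's GC.
  let buf = yy_scan_string(text.cstring)
  if buf != nil:
    # ... run the parser here ...
    yy_delete_buffer(buf)
```

With a ref, the GC would treat the pointer as memory it allocated itself; with a ptr, it is left alone and has to be released through the C API.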

drkameleon [2019-11-27T13:05:37+01:00]

Unfortunately, I fixed that too but without much success.

Araq [2019-11-27T13:14:28+01:00]

Eventually the fixes add up and then it will work. ;-)

drkameleon [2019-11-27T13:27:47+01:00]

Well, speaking of... I guess it's my lucky day. Also, the motto that says that "only when you ask a question, do you fully understand the issue" seems rather confirmed.

So... I'm very, very close to having it all working as before (= before attempting to turn the GC on). There were missing bits here and there (quite difficult to list them all here since they're too project-specific), but I think I will manage to fine-tune it.

For now, I'm not extremely happy with the timings (I run a set of benchmarks to see how it compares with interpreters for other languages, and incorporating the GC slowed it down a bit in some of the cases), but regarding memory management, I'm more than happy!

Thanks ;-)

drkameleon [2019-11-27T13:35:34+01:00]

For the sake of comparison, here are the results (the tests are micro-benchmarks for isolated parts I'm trying to check, but very intensive ones for that matter).

--gc:none

https://gist.github.com/drkameleon/099c1d373367877fc3bc8c48067ae09f

--gc:refc

https://gist.github.com/drkameleon/caa1a3d5087dcabfff8a8158d18e9011

mratsim [2019-11-28T10:15:19+01:00]

I don't know where your bottlenecks are, but the if/else statements inside the opcode dispatch look incredibly suspicious for performance. Then again, I haven't written a register VM yet, so take it with a grain of salt.

Maybe you should have specialized switches:

- fully in registers
- one operand in a register and one not, for associative operations
- fallback

And a dispatcher that sends to those.

I'm pretty sure your interpreter is struggling with branch mispredictions here.
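
One way to read the suggestion above, as a minimal sketch rather than a drop-in change: pick a specialized opcode once, when the instruction is built, so the hot loop only switches on the opcode and never inspects operand kinds. All names here (OpCode, Instr, makeAdd, run) are hypothetical, and only addition is covered.

```nim
type
  OpCode = enum
    opAddRegReg, opAddRegNum, opAddNumNum, opPrint, opHalt

  Instr = object
    op: OpCode
    dst, a, b: int    # a/b are register indices or literals, depending on op

proc makeAdd(dst: int, aIsReg: bool, a: int, bIsReg: bool, b: int): Instr =
  ## Chooses the specialized opcode once, when the instruction is built.
  if aIsReg and bIsReg:
    Instr(op: opAddRegReg, dst: dst, a: a, b: b)
  elif aIsReg:
    Instr(op: opAddRegNum, dst: dst, a: a, b: b)
  elif bIsReg:
    # Addition is commutative, so swap the operands and reuse the reg+num form.
    Instr(op: opAddRegNum, dst: dst, a: b, b: a)
  else:
    Instr(op: opAddNumNum, dst: dst, a: a, b: b)

proc run(code: seq[Instr]): seq[int] =
  ## Executes `code` and returns the values emitted by opPrint.
  var regs = newSeq[int](8)
  var pc = 0
  while pc < code.len:
    let ins = code[pc]
    case ins.op               # no operand-kind checks inside the loop
    of opAddRegReg: regs[ins.dst] = regs[ins.a] + regs[ins.b]
    of opAddRegNum: regs[ins.dst] = regs[ins.a] + ins.b
    of opAddNumNum: regs[ins.dst] = ins.a + ins.b
    of opPrint: result.add regs[ins.dst]
    of opHalt: break
    inc pc

when isMainModule:
  let prog = @[
    makeAdd(0, false, 2, false, 3),   # r0 = 2 + 3
    makeAdd(1, false, 10, true, 0),   # r1 = 10 + r0
    makeAdd(2, true, 0, true, 1),     # r2 = r0 + r1
    Instr(op: opPrint, dst: 2),
    Instr(op: opHalt)]
  echo run(prog)
```

Whether this actually helps would have to be measured; the sketch only moves the operand-kind branches out of the dispatch loop.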

drkameleon [2019-11-28T10:46:25+01:00]

The VM code above is just an experiment.

For now, the interpreter is a plain tree-walking interpreter.

It sure has its issues and I'm trying to find my way around, but things are getting better and better and the results so far are more than satisfying :)

Mirror of forum.nim-lang.org

5591 :: setupForeignThreadGc() equivalent for Boehm GC?