I'm working with an external library that requires passing a context variable to all library calls. As I'm writing a DSL to ease using this library, I'd rather hide this context variable from the user of the DSL. Is there a way to achieve something like closure variable capture for public procs?
For instance, the following DSL code

math:
  a = (b + c) * d
would be translated to something like:
template math*(body: untyped) =
  block:
    let ctx {.inject.} = initLib()
    `body`

proc `+`*(a, b: Mat): Mat =
  result = libAdd(ctx, a, b)

proc `*`*(a, b: Mat): Mat =
  result = libMult(ctx, a, b)
where the ctx variable is captured in procs + and *, resulting in
block:
  let ctx {.inject.} = initLib()
  a = libMult(ctx, libAdd(ctx, b, c), d)
if the two procs were templates.
I've been using templates to achieve this result, but:
1. templates are buggy when dealing with generics syntax;
2. templates can give unexpected results when arguments with side effects are evaluated multiple times;
3. although templates are expanded at compile time, syntax errors in the body stay hidden until the template is actually used;
4. the type of a template is only known when it is instantiated;
5. debugging templates is more difficult than debugging standard code; etc.
See the wiki for the risks of using templates.
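As a quick illustration of point 2, here is a minimal sketch (not from my actual code) of the double-evaluation pitfall:

template double(x: int): int = x + x

var calls = 0
proc next(): int =
  inc calls
  result = calls

echo double(next())  # expands to next() + next(): prints 3, not 2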
Ideally, because the DSL code can be used in multiple procs or even in recursive code, I need to capture the latest ctx in the math scope. If I could declare inner closure procs as public, that would do the job... but it can't be done.
Is it possible to do what I want to do?
Nim's stdlib has a with macro that seems to be what you need:
https://nim-lang.org/docs/with.html#with.m%2Ctyped%2Cvarargs%5Buntyped%5D
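with injects the given variable as the first argument of every call in its body. A minimal sketch:

import std/with

var s = "start"
with s:
  add("foo")  # rewritten to add(s, "foo")
  add("bar")
echo s        # startfoobar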
It's possible to avoid templates for the operators if you define them in the math body:
template math*(body: untyped) =
  block:
    proc `+`*(a, b: Mat): Mat =
      result = libAdd(ctx, a, b)
    proc `*`*(a, b: Mat): Mat =
      result = libMult(ctx, a, b)
    let ctx {.inject.} = initLib()
    `body`
In Arraymancer I chose to just pass ctx as a template/macro parameter to network:
https://github.com/mratsim/Arraymancer#handwritten-digit-recognition-with-convolutions
network ctx, DemoNet:
  layers:
    x:          Input([1, 28, 28])
    cv1:        Conv2D(x.out_shape, 20, 5, 5)
    mp1:        MaxPool2D(cv1.out_shape, (2,2), (0,0), (2,2))
    cv2:        Conv2D(mp1.out_shape, 50, 5, 5)
    mp2:        MaxPool2D(cv2.out_shape, (2,2), (0,0), (2,2))
    fl:         Flatten(mp2.out_shape)
    hidden:     Linear(fl.out_shape, 500)
    classifier: Linear(500, 10)
  forward x:
    x.cv1.relu.mp1.cv2.relu.mp2.fl.hidden.relu.classifier
The main difference is that the input x also carries the context ctx in its data structure, so that I don't have to write ctx.add(a, b) every time. https://github.com/mratsim/Arraymancer/blob/4ae9b811/src/arraymancer/autograd/autograd_common.nim#L44-L70
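A minimal sketch of that pattern (Ctx, MatData and libAdd are stand-ins for the real library types and calls):

type
  Ctx = ref object   # stand-in for the library context
  MatData = object   # stand-in for the library's matrix payload
  Mat = object
    ctx: Ctx         # every value remembers the context it was created with
    data: MatData

proc libAdd(ctx: Ctx; a, b: MatData): MatData = discard  # fake binding

proc `+`(a, b: Mat): Mat =
  # the operator retrieves the context from its operands
  Mat(ctx: a.ctx, data: libAdd(a.ctx, a.data, b.data))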
That said, I'm not that happy with the setup, so I plan to write a full-blown math DSL using a compiler approach. A simple expression for matrix multiplication would look like:
proc matmul(A, B: Function): Function =
  ## Generator of the A * B matrix multiplication function
  var i, j, k: Domain

  # The "what"
  # Definition of the result function
  C[i, j] = A[i, k] * B[k, j]

  # The "how"
  # Optional tips for high-performance computing depending on GPU or CPU
  when defined(cuda):
    # Split into chunks of 256 iterations and launch a CUDA thread for each
    C.unroll(i, 256)
     .parallel()
    ...
  else:
    # Iterate on blocks of 96 j, vectorize them using assembly and parallelize over i
    C.tile(j, 96)
     .vectorize
     .parallel(i)
    ...

  # Return
  return C # Matrix multiplication

# `generate` concretizes this definition (the what) and schedule (the how)
generate foobar:
  proc foobar(a: Tensor[float32], b, c: Tensor[float32]): Tensor[float32]
Something with just the "what" is already implemented in Einsum:
https://github.com/mratsim/Arraymancer/blob/4ae9b81/src/arraymancer/tensor/einsum.nim#L512-L525
# implicit Einstein summation
let c = einsum(a, b):
  a[i,j] * b[j,k]

# explicit Einstein summation. Note that the identifier `d` in the statement
# is arbitrary and need not match what will be assigned to.
let d = einsum(a, b):
  d[i,k] = a[i,j] * b[j,k]
Alternatively you might want to change the way your DSL is structured; I have a couple of experiments for computation graph DSLs here: https://github.com/mratsim/compute-graph-optim
For example, using a tagless final approach, you define operations and interpreters for those operations, and the "eval" interpreter can carry your context: https://github.com/mratsim/compute-graph-optim/blob/master/e05_typed_tagless_final.nim
type
  Expr[Repr] = concept x, type T
    lit(T) is Repr[T]
    `+`(Repr[T], Repr[T]) is Repr[T]

  Id[T] = object
    val: T

  Print[T] = object
    str: string

func lit[T](n: T, Repr: type[Id]): Id[T] =
  Id[T](val: n)

func `+`[T](a, b: Id[T]): Id[T] =
  Id[T](val: a.val + b.val)

func lit[T](n: T, Repr: type[Print]): Print[T] =
  Print[T](str: $n)

func `+`[T](a, b: Print[T]): Print[T] =
  Print[T](str: "(" & a.str & " + " & b.str & ")")

func foo(Repr: type): Repr =
  result = lit(1, Repr) + lit(2, Repr) + lit(3, Repr)

echo foo(Id).val    # <----- Use a context here if needed
echo foo(Print).str
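To sketch how an interpreter could carry your context (this one is not in the linked file; Ctx and libAdd are hypothetical stand-ins), evaluation can be deferred until a context is supplied:

type
  Ctx = ref object           # hypothetical library context
  LibEval[T] = object
    run: proc (ctx: Ctx): T  # evaluation is deferred until a ctx is supplied

proc libAdd(ctx: Ctx; a, b: int): int = a + b  # fake library binding

func lit(n: int, Repr: type[LibEval]): LibEval[int] =
  LibEval[int](run: proc (ctx: Ctx): int = n)

func `+`(a, b: LibEval[int]): LibEval[int] =
  LibEval[int](run: proc (ctx: Ctx): int =
    libAdd(ctx, a.run(ctx), b.run(ctx)))

let e = lit(1, LibEval) + lit(2, LibEval)
echo e.run(Ctx())  # the context is supplied once, at evaluation time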
Other techniques I explored include object algebras, attribute grammars, the visitor pattern, catamorphisms, functional lenses, transducers, and the compiler approach I took (and you are leaning towards): shallow embeddings (user-defined functions) composed from deep embeddings (core math functions backed by optimized implementations; your library, in your case).
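A minimal sketch of that last idea (the Ast type and ops are illustrative, not a real API): core ops are deep-embedded as AST nodes, user functions are shallow-embedded as plain Nim procs, and the evaluator is the single place where a library context would be threaded.

type
  AstKind = enum akLit, akAdd, akMul
  Ast = ref object                 # deep embedding: core ops as an AST
    case kind: AstKind
    of akLit: val: float
    of akAdd, akMul: lhs, rhs: Ast

# deep embedding: core math ops build AST nodes
proc lit(v: float): Ast = Ast(kind: akLit, val: v)
proc `+`(a, b: Ast): Ast = Ast(kind: akAdd, lhs: a, rhs: b)
proc `*`(a, b: Ast): Ast = Ast(kind: akMul, lhs: a, rhs: b)

# shallow embedding: user-defined functions are ordinary procs composing the AST
proc axpy(a, x, y: Ast): Ast = a * x + y

# the evaluator would be the single place to pass a library context
proc eval(e: Ast): float =
  case e.kind
  of akLit: e.val
  of akAdd: eval(e.lhs) + eval(e.rhs)
  of akMul: eval(e.lhs) * eval(e.rhs)

echo eval(axpy(lit(2), lit(3), lit(4)))  # 10.0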
From what I understand, I will have to go the macro route and manage the full DSL syntax. I have to think about it before jumping in, as my DSL can be mixed with Nim code and only the calls to the library need to get the context variable (I used a math syntax in my example, but the language is more general).
As H. L. Mencken said, "For every complex problem there is an answer that is clear, simple, and wrong." And using templates to define a large DSL is such an answer...
Thanks for the hints and experience feedback.
For the record, another simple but wrong solution:
It's possible to avoid templates for the operators if you define them in the math body:
template math*(body: untyped) =
  block:
    proc `+`*(a, b: Mat): Mat =
      result = libAdd(ctx, a, b)
    proc `*`*(a, b: Mat): Mat =
      result = libMult(ctx, a, b)
    let ctx {.inject.} = initLib()
    `body`
results in Error: 'export' is only allowed at top level
Inspired by the with macro, I'm thinking of writing a macro to inject the ctx into all libXXX calls while walking the AST. But in order for the procs to compile, I would have to write them like:
# Fake libXXX API to allow compilation
proc lib2Mult(a, b: Mat): Mat = discard
proc lib2Add(a, b: Mat): Mat = discard

# Real libXXX API
{.pragma: libXXX, importc, dynlib: libName, cdecl.}
proc libMult(ctx: Ctx; a, b: Mat): Mat {.libXXX.}
proc libAdd(ctx: Ctx; a, b: Mat): Mat {.libXXX.}

proc `*`*(a, b: Mat): Mat =
  result = lib2Mult(a, b)

proc `+`*(a, b: Mat): Mat =
  result = lib2Add(a, b)
The macro would replace the fake lib2XXX calls with libXXX calls and inject the ctx variable.
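A minimal sketch of such a rewriting pass (assuming the lib2XXX naming convention above and a ctx visible where the rewritten calls end up):

import std/[macros, strutils]

# Recursively walk the AST: for every call whose callee starts with "lib2",
# rename it to "lib..." and prepend `ctx` as the first argument.
proc rewrite(node: NimNode): NimNode =
  if node.kind in {nnkCall, nnkCommand} and node[0].kind == nnkIdent:
    let name = $node[0]
    if name.startsWith("lib2"):
      node[0] = ident("lib" & name[4 .. ^1])
      node.insert(1, ident"ctx")
  for i in 0 ..< node.len:
    node[i] = rewrite(node[i])
  result = node

macro withCtx(body: untyped): untyped =
  result = rewrite(body)

Applied to a proc body or a math block, this would turn lib2Mult(a, b) into libMult(ctx, a, b).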
Some thoughts before starting to code: document well that the convention is to name the parameter ctx, and then have:
{.pragma: libXXX, importc, dynlib: libName, cdecl.}
proc libMult(ctx: Ctx; a, b: Mat): Mat {.libXXX.}
proc libAdd(ctx: Ctx; a, b: Mat): Mat {.libXXX.}

template `*`*(a, b: Mat): Mat = libMult(ctx, a, b)
template `+`*(a, b: Mat): Mat = libAdd(ctx, a, b)
template initMat*(): Mat = libCreateMat(ctx)
User code:
proc code(ctx: Ctx) =
  var x = initMat()
  var y = initMat()
  echo x + y
There are many other ways though.
@spip For the record, another simple but wrong solution
Error: 'export' is only allowed at top level
This one is easy to make right. The + and * operators are defined and used only inside the block, so you do not need to export them. Simply remove the export marker from them and you are done.
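Something like this (a sketch; ctx is also moved before the procs so they can refer to it):

template math*(body: untyped) =
  block:
    let ctx {.inject.} = initLib()
    proc `+`(a, b: Mat): Mat =  # no export marker: local to the block
      result = libAdd(ctx, a, b)
    proc `*`(a, b: Mat): Mat =
      result = libMult(ctx, a, b)
    `body`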