nimforum mirror - Emit C++ code into .h (instead of .cpp) and reuse it across NIM modules

drywolf [2018-09-23T01:58:57+02:00]

Hi, I have been trying out Nim over the last couple of weeks and I am really impressed with what can be done with it, the meta-programming specifically. My main interest has been wrapping some existing C++ code bases, and I have come up with some macros that help me along the way (patching vtables etc.).

I have been using the {.emit.} pragma quite extensively in macros, and since I want to reuse those macros in a couple of other .nim files/modules, I put these utility macros into their own .nim file. (Please note that this is not the actual code. The real code is not concerned with sizeof() at all, so please ignore that the following code is not very useful in itself. I just tried to produce the smallest possible repro case of what my code actually does.)

cpp_tools.nim

import macros

# The C++ code of this macro should best be emitted once into a .h file
# and then just #included into every NIM generated .cpp file that imports the "cpp_tools.nim" module
macro sizeof_init*(): typed =
    result = nnkPragma.newTree(
        nnkExprColonExpr.newTree(
            newIdentNode("emit"),
            newLit("""
#include <cstdint>

template<typename T>
class CppSizer
{
public:
    static uint32_t getSize()
    {
        return sizeof(T);
    }
};
"""
            )
        )
    )

# This macro generates a specialization of the above C++ template for a given NIM type
# and then also injects a NIM wrapper proc to call the C++ "getSize()" method from NIM
macro register*(base_type: typed): typed =
    result = nnkStmtList.newTree()
    let bt: string = $base_type.symbol
    
    result.add(nnkPragma.newTree(
        nnkExprColonExpr.newTree(
            newIdentNode("emit"),
            newLit("""class """ & bt & """_sizer : public CppSizer<""" & bt & """> {};""")
        )
    ))
    
    # generate NIM proc wrapping the C++ static method
    var super_proc = nnkProcDef.newTree(
        nnkPostfix.newTree( # 0 -> method name
            newIdentNode("*"),
            newIdentNode("sizeof_" & bt)
        ),
        newEmptyNode(), # 1 -> term-rewriting pattern (unused)
        newEmptyNode(), # 2 -> generic parameters (none)
        nnkFormalParams.newTree( # 3 -> parameters
            newIdentNode("cuint")
        ),
        nnkPragma.newTree( # 4 -> pragmas
            nnkExprColonExpr.newTree(
                newIdentNode("importcpp"),
                newLit(bt & "_sizer::getSize(@)")
            )
        ),
        newEmptyNode(), # 5 -> reserved slot
        newEmptyNode() # 6 -> body (empty, the proc is imported from C++)
    )
    
    result.add(super_proc)

main.nim


import cpp_tools

# emits C++ "CppSizer" class into the main.cpp file
# Would be better if it would happen automatically just by importing "cpp_tools"
# and the generated C++ code was only generated once, put in a .h header file and
# just included in all .nim files (i.e. resulting .cpp files) that use the "cpp_tools" module.
sizeof_init()

# Code like the following is declared in multiple different .nim files
# across the project for many different NIM types and uses the generated sizeof_*** procs
type Person* {.pure, final, exportc.} = object
    firstname: string
    lastname: string

register(Person)

echo "C++ ", sizeof_Person()
echo "NIM ", Person.sizeof

In the code above I annotated how I would ideally like the Nim macros and the generated C++ code to be used. Is there some way, with existing Nim macros, pragmas or other language features, to achieve what I want?

Thanks for any hints

mratsim [2018-09-23T09:08:39+02:00]

I've struggled with this with Cuda C++ templates, for example this: https://github.com/mratsim/Arraymancer/blob/master/src/tensor/private/incl_kernels_cuda.nim.

My solution was to use include instead of import for all "includes/headers" C++ generated code like so: https://github.com/mratsim/Arraymancer/blob/0605c7fcd34e216623a891f4963d2a4ef98882b1/src/tensor/init_copy_cuda.nim#L19-L21

gemath [2018-09-23T12:10:35+02:00]

We could separate the template definition into a header file and use it from cpp_tools.nim.

cpp_tools.h:

#include <cstdint>

template<typename T>
class CppSizer
{
public:
    static uint32_t getSize()
    {
        return sizeof(T);
    }
};

cpp_tools.nim:

import macros, strutils

macro register*(base_type: typed): typed =
  let stmts = """
{.emit: "/*TYPESECTION*/ class $1_sizer : public CppSizer<$1> {};".}
proc sizeof_$1*(): cuint {.header: "../cpp_tools.h", importcpp: "$1_sizer::getSize(@)".}
  """ % [$base_type.symbol]
  stmts.parseStmt

This version uses parseStmt to produce an AST node directly from a source string. Note that header: "../cpp_tools.h" only works if the nimcache directory is in the same directory as the source files. If a proper cpp_tools package was created and installed, we would probably get away with header: "cpp_tools.h". Maybe setting a compiler option in a .nimble file to add an include directory would be necessary.
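To make the C++ side of this concrete, here is a rough sketch of what the C++ compiler ends up seeing after register(Person). The Person struct below is only a hypothetical stand-in for whatever struct Nim generates (its real layout will differ), and the sizeof_Person wrapper function is an illustration of the call that the importcpp pattern produces, not something Nim emits literally.

```cpp
#include <cassert>
#include <cstdint>

// Contents of cpp_tools.h, as given in the post above.
template<typename T>
class CppSizer
{
public:
    static uint32_t getSize()
    {
        return sizeof(T);
    }
};

// ASSUMPTION: stand-in for the struct that Nim generates for `Person`.
// The real generated layout is different; two pointers are used here
// only so that the example compiles on its own.
struct Person
{
    void* firstname;
    void* lastname;
};

// What the emit pragma inside register(Person) adds to the .cpp file.
class Person_sizer : public CppSizer<Person> {};

// Illustration of the call produced by the importcpp pattern
// "Person_sizer::getSize(@)" when invoked with no arguments.
inline uint32_t sizeof_Person()
{
    return Person_sizer::getSize();
}

// Small self-check: the wrapper reports the size of the stand-in struct.
inline void checkPersonSizer()
{
    assert(sizeof_Person() == sizeof(Person));
}
```

Since Person_sizer adds nothing of its own, all it does is give Nim a plain, non-template name to refer to in the importcpp pattern.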

drywolf [2018-09-23T13:12:28+02:00]

Thanks for the suggestions.

@mratsim Your solution seems to work fine for me too. The only drawbacks are that the Nim modules that want to use cpp_tools are forced to use include rather than import, and so are all modules that depend on those modules, and so forth (include needs to cascade upwards in the module dependency tree).

Also this in turn means that the c++ template<> code will be generated & repeated in each of those modules. I think it depends on the size of the c++ template<> code if this is a bad thing, since on one hand it saves the c++ compiler from handling any #include directives (because the code is just inlined into all .cpp files directly). On the other hand if the c++ template code might get very complex (i.e. thousands of lines of c++ template code) it might slow down the c++ compiler because of all the duplicate lexing/parsing/etc.

@gemath Thanks for the hint ... I realize I might have made my above example too simple though 😅 ... In my actual code the sizeof_init macro does not just emit a static string literal in its emit pragma ... the actual string is itself composed by some nim logic during compile time. Therefore it can not easily be extracted via copy&paste into a c++ header file. However, might it be possible to generate & save such a file in nim during the compile time anyway via file I/O APIs ? ... I will have to investigate that

Thanks

drywolf [2018-09-23T13:34:02+02:00]

cpp_tools.nim

static:
    var patcher_code = ""
    # some macro black magic to generate C++ template code into the "patcher_code" variable
    writeFile("nimcache/cpp_tools.h", patcher_code)

... this seems to do the trick just fine, the only thing I don't like about this solution is the hard-coded "nimcache/" path ... afaik nim allows to change the location of that cache directory via --nimcache:PATH ... is there some compile-time API that I could use to query the location of the nimcache directory ?

Thanks

mashingan [2018-09-23T13:46:33+02:00]

Have you considered a source code filter?

I'm using it to produce Go output file :D

mratsim [2018-09-23T20:08:13+02:00]

> Also this in turn means that the c++ template<> code will be generated & repeated in each of those modules. I think it depends on the size of the c++ template<> code if this is a bad thing, since on one hand it saves the c++ compiler from handling any #include directives (because the code is just inlined into all .cpp files directly). On the other hand if the c++ template code might get very complex (i.e. thousands of lines of c++ template code) it might slow down the c++ compiler because of all the duplicate lexing/parsing/etc.

C++ headers work the same. If you put function implementation in a header file it will be present in all files that include this header, slowing down compilation and bloating the code.
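As a small illustration of that point, here is a sketch of what a shared cpp_tools.h could look like, with both "files" shown in one listing. The guard macro name CPP_TOOLS_H and the checkSizes helper are assumptions made up for this example. The include guard only prevents redefinition inside a single .cpp file; every .cpp file that includes the header still parses and instantiates the template for itself.

```cpp
// ---- cpp_tools.h (sketch; the guard name CPP_TOOLS_H is hypothetical) ----
#ifndef CPP_TOOLS_H
#define CPP_TOOLS_H

#include <cstdint>

template<typename T>
class CppSizer
{
public:
    // Defined inside the class body, so it is implicitly inline: each
    // .cpp file that includes this header compiles its own copy, and the
    // linker accepts the duplicates.
    static uint32_t getSize()
    {
        return sizeof(T);
    }
};

#endif // CPP_TOOLS_H

// ---- some generated .cpp file (sketch) ----
#include <cassert>

// Simulates a second `#include "cpp_tools.h"` in the same file: the guard
// macro is already defined, so the header body would be skipped.
#ifndef CPP_TOOLS_H
#error "include guard did not take effect"
#endif

// Hypothetical helper that uses the template from this file.
inline void checkSizes()
{
    assert(CppSizer<char>::getSize() == 1);  // sizeof(char) is always 1
    assert(CppSizer<double>::getSize() == sizeof(double));
}
```

So whether the template text is pasted in via emit or pulled in via #include, the C++ compiler does roughly the same parsing work per .cpp file. The header mainly saves the Nim side from emitting the text repeatedly.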

drywolf [2018-09-24T13:28:23+02:00]

Fully agreed, but I'm coming to nim from c++ mostly because I want to avoid some of the "less-than-optimal" ways that c++ does things. I don't feel comfortable compromising too much in this regard (at least not just yet 😅)

Mirror of forum.nim-lang.org

4220 :: Emit C++ code into .h (instead of .cpp) and reuse it across NIM modules