Hello, I am trying to understand a little bit of the compiler in order to fix these bugs, which are pretty much blocking for me.
I would like to know whether there are some resources on the compiler internals, apart from this.
In particular, my current problem is the following. I have a simple example that produces an unexpected output. I put it under tests/concepts/tmonoid.nim, so now I can test it with ./koch test r concepts/tmonoid.nim or run it with a fresh compiler using ./koch temp tests/concepts/tmonoid.nim.
I would like to add some echo or debug statements here and there to understand what is going on. The problem is that I get debug output even in the phase where the compiler compiles itself, leading to a deluge of output.
I would like to add print statements in such a way that they are ignored when compiling the compiler, but emit some information when compiling my example.
Is there any way to do so?
I answer my own question: I had missed that I can query n.info on a node n to get the source file. The example given in the documentation is
if n.info ?? "temp.nim":
  # only output when it comes from "temp.nim"
  echo renderTree(n)
if n.info ?? "temp.nim":
  # why does it process temp.nim here?
  writeStackTrace()
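To make the filtering idea concrete, here is a small self-contained sketch of it outside the compiler. It is only a toy model: `LineInfo`, `debugLines` and this version of `??` are hypothetical stand-ins, and the substring match is an assumption about how the check behaves, not the compiler's actual definition.

```nim
import std/strutils

type
  LineInfo = object        # hypothetical stand-in for the compiler's line info
    file: string
    line: int

proc `??`(info: LineInfo; filename: string): bool =
  ## Assumed behaviour: true when the file path contains `filename`.
  contains(info.file, filename)

proc debugLines(infos: openArray[LineInfo]; only: string): seq[string] =
  ## Collects a debug line only for entries that come from the file `only`.
  for info in infos:
    if info ?? only:
      result.add(info.file & "(" & $info.line & ")")

let infos = [
  LineInfo(file: "compiler/sigmatch.nim", line: 10),
  LineInfo(file: "tests/concepts/tmonoid.nim", line: 3)
]
let shown = debugLines(infos, "tmonoid.nim")
```

With this guard in place, entries that belong to the compiler's own sources are skipped and only those from the file under test produce output.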
Still interested in knowing whether there are any blog posts or anything of interest about compiler hacking, though
Still a bit of an issue, though. I would like to debug the typeRel proc in sigmatch.nim, but there is no node in sight there, so I do not have access to line information (in particular the file name). Is there a more general way to obtain the name of the file we are compiling?
@yglukhov Your technique would also be fine, but I do not know how to check whether I am currently inside a particular routine
Ok, I finally found what I needed. Most functions have, directly or indirectly, access to a PContext. That structure, in turn, has a module field of type TSym, which represents (I think) the current module. From there, you have an info field with the line info.
So, in my case, having a TCandidate in scope called c, I can do
if c.c.module.info ?? "tmonoid.nim":
  # do something
Yeah but for TCandidate there is also c.call.info. ;-)
One thing that nobody knows is that the compiler has an ID mechanism for nodes if you enable define:useNodeIds in compiler/nim.cfg (for technical reasons that cannot be enabled via koch). Then you can print the offending id of the node that you think contains bogus data.
Say this id is 1234; then edit ast.nim and set const nodeIdToDebug* = 1234. Then rerun koch temp, and stack traces are produced that show which part of the compiler created the node with id = 1234.
However, often the bogus node is just a copy of the real bogus node. The stack trace then contains COMES FROM 2345 (for example). So ... we repeat the step, set nodeIdToDebug to 2345 and rerun koch temp until we find the root cause of the bogus node.
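The loop described above can be illustrated with a toy model. This is not the compiler's code: `Node`, `newNode`, `copyNode`, `nextId` and `traceLog` are hypothetical names, and the log stands in for the stack trace the real compiler prints. It only shows why a copied node leads back to the id of its original.

```nim
const nodeIdToDebug = 2      # hypothetical: the id we want to trace

type
  Node = ref object
    id: int
    label: string

var
  nextId = 0
  traceLog: seq[string]      # collected instead of printed, to keep the sketch checkable

proc newNode(label: string): Node =
  ## Every node gets a fresh id; creating the traced id is recorded.
  result = Node(id: nextId, label: label)
  if result.id == nodeIdToDebug:
    traceLog.add("created " & $result.id)   # the real compiler writes a stack trace here
  inc nextId

proc copyNode(src: Node): Node =
  ## A copy gets its own id, so the trace also notes where it came from.
  result = newNode(src.label)
  if result.id == nodeIdToDebug:
    traceLog.add("COMES FROM " & $src.id)

discard newNode("x")         # id 0
let original = newNode("y")  # id 1
let copy = copyNode(original) # id 2, a copy of node 1
```

Here tracing id 2 reports that it comes from id 1, so the next run would set `nodeIdToDebug` to 1, which mirrors the repeat-until-root-cause step.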
Debugging this way sucks but it works better for me than anything else that I've tried over the years including "travel backwards in time" debuggers. ;-)
There really should be more resources for hacking and debugging the compiler. I've also found it quite difficult to get into.
I would suggest having an effort after 1.0 to clean up, comment, add documentation, and perhaps add more debug tools. That should make it easier for others to contribute to compiler development.
If there's a clean-up, I would also suggest splitting the compiler into a library (preferably a nimble one), the nim tool, and nimsuggest. That way it might be easier for others to write new tools that use the parser and compiler.