Hello,
I have started a tutorial about metaprogramming and macro development a long time ago. To explain the changes in the abstract syntax tree of a Nim code by a macro written in Nim, I tried to output a graphical representation of a Nim's AST with TikZ which is the engine I am most familiar with (GraphViz being an alternative).
Has anyone been interested in a graphical representation of a Nim's AST and tried to automate the output?
Here is the Nim code:
dumpTree:
type
myObject {.packed.} = ref object of RootObj
left: seq[myObject]
right: seq[myObject]
The corresponding AST (of the code within the dumpTree block of course):
StmtList
TypeSection
TypeDef
PragmaExpr
Ident "myObject"
Pragma
Ident "packed"
Empty
RefTy
ObjectTy
Empty
OfInherit
Ident "RootObj"
RecList
IdentDefs
Ident "left"
BracketExpr
Ident "seq"
Ident "myObject"
Empty
IdentDefs
Ident "right"
BracketExpr
Ident "seq"
Ident "myObject"
Empty
Here is the TikZ manual conversion:
\documentclass{standalone}
\usepackage{tikz}
\usetikzlibrary{trees}
\begin{document}
\tikzstyle{every node}=[draw=black, rectangle, fill=teal]
\tikzstyle{Ident}=[thick, fill=orange]
\begin{tikzpicture}[%
sibling distance=25mm,
level 6/.style={sibling distance=50mm},
level 7/.style={sibling distance=57mm},
level 8/.style={sibling distance=20mm},
edge from parent fork down
]
\node {StmtList}
child { node {TypeSection}
child { node{TypeDef}
child { node {PragmaExpr}
child { node [Ident] {"myObject"}}
child { node {Pragma}
child { node [Ident] {"packed"} }
}
}
child { node {Empty}}
child { node {Refty}
child { node {ObjectTy}
child { node {Empty}
child { node {OfInherit}
child { node [Ident] {"RootObj"} }
}
child { node {RecList}
child { node {IdentDefs}
child { node [Ident] {"left"} }
child { node {BracketExpr}
child { node [Ident] {"seq"} }
child { node [Ident] {"myObject"} }
}
child { node {Empty} }
}
child { node {IdentDefs}
child { node [Ident] {"right"} }
child { node {BracketExpr}
child { node [Ident] {"seq"} }
child { node [Ident] {"myObject"} }
}
child { node {Empty} }
}
}
}
}
}
}
};
\end{tikzpicture}
\end{document}
The result !
It was pretty tedious to find the correct sibling distance in the tree by hand. Do you think it could be automatically derived from the length of the text in the tree nodes?
Such a visual tool could help to debug macros, it would be cool if we could highlight the tree changes in some way.
I think it would be crazy hard to do it in CLI but I am no expert and I would not mind if someone contradicts me :)
Once the .tex is written, it is not hard to produce a SVG from it. I use ImageMagick's convert CLI to convert from pdf to jpg, but we can also convert from pdf to svg.
I put this example in ChatGPT (3.5, Web) and I am shocked how easily it can convert any AST I provide to him in TikZ. It does not get the sibling distance always right, but I did not asked him to.
I would not be surprised if it could write me the converter AST->TikZ too 😅.
I tried to use GraphViz too but it packs nodes with similar names, so I get a digraph and not a tree.
another option would be to use mermaid.js. here is an example of using mermaid.js with nimib: https://pietroppeter.github.io/nblog/drafts/mermaid_diagram.html (this will be made some point available as a library)
picking a simpler AST like the one generated by this:
import macros
dumpTree:
if 0 == 1:
echo "🤷"
which is:
StmtList
IfStmt
ElifBranch
Infix
Ident "=="
IntLit 0
IntLit 1
StmtList
Command
Ident "echo"
StrLit "🤷"
and translating to the following mermaid specification:
---
title: Nim AST
---
flowchart TB
id0[StmtList] --> IfStmt
IfStmt --> ElifBranch
ElifBranch --> Infix
Infix --> id1["Ident #quot;==#quot;"]
Infix --> id2[IntLit 0]
Infix --> id3[IntLit 1]
ElifBranch --> id4[StmtList]
id4 --> Command
Command --> id5["Ident #quot;echo#quot;"]
Command --> id6["StrLit #quot;🤷#quot;"]
(you need to be careful about generating ids and escaping quotes)
the output would be:
see also in live editor
(nodes can be styled so you could have terminal nodes of a different color)
I really like the output you have here, reminds me a bit of the graphs npeg can create. You seem to have a slight bug though, the Empty child of the ObjectTy should be a sibling to OfInherit and RecList, not their parent.
Not sure why finding the sibling relationships are so hard, isn't it just a matter of counting the spaces that make up the indentation? On another note it might be easier to parse the output of dumpLisp rather than dumpTree.
It made for a fun morning coffee problem to write an AST to TikZ macro. :)
Note that I use latexdsl https://github.com/Vindaar/latexdsl to directly compile the thing (and I made a local change to allow handing a custom template. By default it uses this https://github.com/Vindaar/LatexDSL/blob/master/src/latexdsl/latex_compiler.nim#L10-L20 ). I wanted to make more use of it initially, but then changed to just emitting strings manually. As @PMunch already mentions, the manual conversion is slightly wrong. I took out the sibling distances, because they don't match the real graph.
I'm not sure how the level syntax is supposed to be understood so I didn't attempt to compute the distances manually. I guess it should be doable, but may be tricky to get right.
import strutils
const tmpl = """
\documentclass{standalone}
\usepackage{tikz}
\usetikzlibrary{trees}
\begin{document}
\tikzstyle{every node}=[draw=black, rectangle, fill=teal]
\tikzstyle{Ident}=[thick, fill=orange]
\begin{tikzpicture}[%
sibling distance=25mm,
% TODO: think about sibling dstances
edge from parent fork down
]
$#
\end{tikzpicture}
\end{document}
"""
import macros
proc removePrefix(s, p: string): string =
result = s
result.removePrefix(p)
proc node(n: NimNode): string =
let nK = n.kind.repr.removePrefix("nnk")
case n.kind
of nnkIdent: "node [Ident] {\"" & n.strVal & "\"} "
else:"node {" & nK & "} "
proc child(s: string): string = "child { $# } " % s
proc child(n: NimNode): string =
let nStr = n.repr
result = child(nStr)
proc astToTikz(n: NimNode): string =
case n.kind
of nnkLiterals, nnkIdent, nnkSym:
result = node(n)
else:
result = node(n)
for ch in n:
result.add child(astToTikz(ch))
macro toTikZ(n: untyped): untyped =
let data = "\\" & astToTikz(n) & ";"
result = newLit(data)
let str = toTikZ:
type
myObject {.packed.} = ref object of RootObj
left: seq[myObject]
right: seq[myObject]
import latexdsl # to compile directly
compile("/t/tikz_stuff.tex", str, tmpl)
@pietroppeter Mermaid.js has a very similar syntax to GraphViz, we need to handle ids and escape quotes, but is pretty nice (and has an editor!). Thanks for the proposal!
@PMunch Thanks for spotting the error! Of course, an Empty Node can not have a Child. I could put them in another color. This is very far from NPEG graphs quality !
@Vindaar I wish my coffee sessions were all that much productive! Do you have a patreon? Here are the siblings levels that make blocks not overlap:
\tikzstyle{every node}=[draw=black, rectangle, fill=teal]
\tikzstyle{Ident}=[thick, fill=orange]
\begin{tikzpicture}[%
sibling distance=25mm,
level 3/.style={sibling distance=50mm},
level 4/.style={sibling distance=25mm},
level 5/.style={sibling distance=50mm},
level 6/.style={sibling distance=55mm},
level 7/.style={sibling distance=20mm},
level 8/.style={sibling distance=18mm},
edge from parent fork down
]
Luckily, I did not have a t directory at root
compile("./t/tikz_stuff.tex", str, tmpl)
Haha, (going by your github profile) they will be once you get to the end of that PhD. ;)
On a serious note though, I'm glad it was helpful. It wasn't particularly complicated for me as I've written similar code enough times already.
Do you have a patreon?
Nope, I don't. :)
Luckily, I did not have a t directory at root
Heh, that's just an alias for /tmp that snuck in there. @c-blake is the one that made me start using it. Gotta save those 2 chars. :D
Throwing my hat into the ring, the simple code:
import strutils
var
nodeName = "a"
lastIndent = 0
parents: seq[string]
echo "digraph {"
for line in test.splitLines:
if parents.len == 0:
echo nodeName, "[label=\"", line.strip, "\", shape=rect]"
else:
var thisIndent = (line.len - line.strip(trailing = false, chars = {' '}).len) div 2
if thisIndent == lastIndent:
parents.setLen(parents.len - 1) # Discard old non-parent node
elif thisIndent < lastIndent:
parents.setLen(parents.len - 1 - (lastIndent - thisIndent)) # Discard old non-parent node and last parent
lastIndent = thisIndent
echo nodeName, "[label=\"", line.strip.replace("\"", "'"), "\", shape=rect]"
echo parents[parents.high], " -> ", nodeName
parents.add nodeName
var pos = nodeName.high
while true:
nodeName[pos] = if nodeName[pos] == 'z': 'a' else: chr(nodeName[pos].ord + 1)
if nodeName[pos] == 'a':
dec pos
if pos == -1:
nodeName.add 'a'
break
else:
break
echo "}"
generates a Graphviz document that looks like this for the code you gave above.
3) The problem with your styling is that it considers only nodes that are strictly parents (and childrens of no nodes). In my example, I distinguish the leaves from the inner nodes! Here is the output for your second example: https://bit.ly/3rz1t7d
I really like your conversion to GraphViz which handles automatically the placement of nodes.
@geohuz Thanks for the layout engine recommendation!