Hello,
I'm looking for an easier API for pegs that would allow for easy AST building. PEG.js allows for something like this:
Method
= ident:Ident args:CallArgs { return { type: "call_method", ident, args } }
Unfortunately, I don't see anything similar in Nim's built-in pegs module. Is something like this possible with macros or some other mechanism?
I was just playing around with this recently. I was experimenting with walking the parsed grammar in a macro and checking for 'handlers' for the rules defined in the grammar. The macro would then map any found handlers into the existing event loop code.
Part of the reason for doing it that way was to be able to keep the grammar and the actions separated. I haven't done enough with it yet to know if this approach will be workable for real world usage though.
One issue that I hit was that when you try to parse the grammar in a 'static' block for compile time logic, there were some errors (as of 0.19.0). I did some local hacky patches to get around it, but I need to research it a bit more to see what the underlying issue is.
By the way, here is a little more detail on one of the issues that I encountered. In pegs.nim (0.19.2 and devel), in the parsePeg proc, the PegParser var that is created seems to get re-initialized after the init call returns when running in the VM at compile time.
The existing code looks like this:
var p: PegParser
init(PegLexer(p), pattern, filename, line, col)
# p.buf, etc should now be set from init call, but are not
The workaround that I was using for my previous experiments added a separate PegLexer var to use in the init call, and then afterwards copied the data into the PegParser var.
var p: PegParser
var l: PegLexer
init(l, pattern, filename, line, col)
p.bufpos = l.bufpos
p.buf = l.buf
p.lineNumber = l.lineNumber
p.lineStart = l.lineStart
p.colOffset = l.colOffset
p.filename = l.filename
It's not clear to me why the PegParser var gets re-initialized when running in the VM at compile time but not in regular code. Still, the workaround above was at least enough to let me proceed with my tests.
Anyone have any thoughts on why that happens and/or pointers in Nim's source where I might investigate more?
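For anyone who wants to dig into this, a reduced test case might look like the sketch below. Lexer, Parser and parseLen are hypothetical stand-ins that only mirror the shape of PegLexer, PegParser and parsePeg; they are not the real declarations from pegs.nim. The point is the init(Lexer(p), ...) call, which passes the derived object as a var of its base type.

```nim
type
  Lexer = object of RootObj   # hypothetical stand-in for PegLexer
    buf: string
    bufpos: int
  Parser = object of Lexer    # hypothetical stand-in for PegParser
    depth: int

proc init(l: var Lexer, input: string) =
  # Fill in the lexer fields through the var parameter.
  l.buf = input
  l.bufpos = 0

proc parseLen(input: string): int =
  var p: Parser
  init(Lexer(p), input)       # same upcast-to-var pattern as in parsePeg
  result = p.buf.len          # 0 here would mean p was not updated by init

# Run-time check of the pattern.
doAssert parseLen("abc") == 3

# To exercise the compile-time VM instead, the same check can be wrapped
# in a static block; whether it passes depends on the compiler version:
# static: doAssert parseLen("abc") == 3
```

If the static variant fails while the run-time one passes, that would narrow the problem down to how the VM handles the object conversion rather than anything specific to pegs.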
AFAIK, there's no single way to do it with the pegs module. With the pegs module you can test whether a string satisfies the definition, and you have to code yourself what to do with the captured substrings.
For example:
import pegs
var
  ops = peg"\skip (\s*) '+' / '-' / '/' / '*'"
  prim = peg"\skip (\s*) {\d+} '.' {\d+} / {\d+}"
  optionalops = sequence(capture ops, prim)
  mathpeg = *sequence(prim, *optionalops)
  theline = "1 + 1.5 - 2.0 / 3"
if theline =~ mathpeg:
  # allocate the seq first; otherwise we will get an error
  var matches = newSeq[string](10)
  discard theline.find(mathpeg, matches)
  echo matches # ["1","+", "-", "1.5", "2.0", "/", "3", "", "", ""]
So you must check which peg definition the line satisfies, get the captured substrings, and do something with them.
Pegs recently got an "event parser" feature that allows you to do exactly this kind of thing: https://nim-lang.org/docs/pegs.html#eventParser.t%2Cuntyped%2Cuntyped
With this you can build a case statement for the different matching grammar elements very easily.
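A minimal sketch of how that could look for the Method example from the first post, assuming a Nim version whose pegs module provides eventParser. The grammar, the Call type and the parseCall name are made up for illustration; the handler layout (pkNonTerminal, leave, and the p, s, start and length symbols) follows the example in the pegs documentation.

```nim
import pegs

type
  Call = object        # hypothetical AST node for illustration
    ident: string
    args: seq[string]

# Rules are written top-down, as in the documentation example.
let grammar = peg"""
Method   <- Ident CallArgs
Ident    <- [a-z]+
CallArgs <- '(' Arg (',' Arg)* ')'
Arg      <- [0-9]+
"""

var call: Call

let parseCall = grammar.eventParser:
  pkNonTerminal:
    leave:
      # length is only positive when the non-terminal matched something
      if length > 0:
        let text = s.substr(start, start + length - 1)
        case p.nt.name
        of "Ident": call.ident = text
        of "Arg": call.args.add text
        else: discard

let consumed = parseCall("add(1,22)")
doAssert consumed == 9
doAssert call.ident == "add"
doAssert call.args == @["1", "22"]
```

The grammar stays a plain peg string and the actions live in the case statement, which keeps the two separated in the way discussed earlier in the thread.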
I also spent some time yesterday to see what pegs needs to run at compile time. My first quick hack was the same as chrisheller's: drop init() and move its contents into parsePeg.
After that I ran into a few places where the pattern buffer is accessed out of bounds (always at index len+1). I suspect this is by design: the PegLexer buf is declared as cstring instead of a normal string, which makes the access valid because of the trailing '\0' terminator. I guess this works fine at run time, but the VM does not like this trick.
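The kind of change involved can be sketched as follows. This is not the actual patch to pegs.nim: charAt and countDigits are hypothetical helpers, and the sketch assumes the buffer is a normal string rather than a cstring. The idea is to return '\0' explicitly instead of relying on the terminator being readable past the end.

```nim
proc charAt(buf: string, i: int): char =
  ## Hypothetical helper: yield '\0' instead of indexing past the end.
  if i < buf.len: buf[i] else: '\0'

proc countDigits(buf: string, start: int): int =
  ## Scan a run of digits without ever reading outside the buffer.
  var pos = start
  while charAt(buf, pos) in {'0'..'9'}:
    inc pos
  result = pos - start

# The same code path works at run time and in the compile-time VM.
doAssert countDigits("123", 0) == 3
static: doAssert countDigits("123", 0) == 3
```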
Fixing these three or four places results in my peg expression properly being parsed at compile time, but then I ran into this one: Error: invalid type for const: Peg
Unfortunately, I'm lost here. It seems that inherited types cannot be used for consts, but I have no clue why, or whether there is a possible workaround.