I'm thinking about porting PetitParser to Nim - if you don't know it, it's a parser combinator library, and its original author, Lukas Renggli, has written two versions, one in Dart and another in Java.
I wrote a working port of PetitParser in PHP, and I also wrote a semi-broken (unreleased) port in TypeScript.
What fascinates me most about this approach to parsers is simplicity - not only of the parser library, but of the parsers implemented with it.
For example, here's a complete JSON grammar in 70 lines and parser in under 50 lines of Dart:
https://github.com/petitparser/dart-petitparser/tree/master/lib/src/json
Not only is it brief and simple, it is really easy to understand.
For comparison, the JSON parser in Nim is 1100 lines.
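To give a feel for why combinator-based parsers stay so small, here is a minimal sketch of the idea. It is written in Python for brevity (not Nim), and it is not PetitParser's actual API: the names `char`, `seq`, `choice`, `many` and `mapped` are made up for illustration. A parser here is assumed to be just a function from `(text, pos)` to `(value, new_pos)`, or `None` on failure.

```python
from typing import Any, Callable, Optional, Tuple

# Assumption for this sketch: a parser is a function (text, pos) -> (value, new_pos) or None.
Result = Optional[Tuple[Any, int]]
Parser = Callable[[str, int], Result]

def char(c: str) -> Parser:
    """Match the literal string c at the current position."""
    def parse(text: str, pos: int) -> Result:
        return (c, pos + len(c)) if text.startswith(c, pos) else None
    return parse

def seq(*parsers: Parser) -> Parser:
    """Run parsers one after another; fail if any of them fails."""
    def parse(text: str, pos: int) -> Result:
        values = []
        for p in parsers:
            r = p(text, pos)
            if r is None:
                return None
            value, pos = r
            values.append(value)
        return values, pos
    return parse

def choice(*parsers: Parser) -> Parser:
    """Ordered choice: return the first alternative that succeeds."""
    def parse(text: str, pos: int) -> Result:
        for p in parsers:
            r = p(text, pos)
            if r is not None:
                return r
        return None
    return parse

def many(parser: Parser) -> Parser:
    """Zero or more repetitions; stops if the parser fails or consumes nothing."""
    def parse(text: str, pos: int) -> Result:
        values = []
        while True:
            r = parser(text, pos)
            if r is None or r[1] == pos:
                return values, pos
            value, pos = r
            values.append(value)
    return parse

def mapped(parser: Parser, fn: Callable[[Any], Any]) -> Parser:
    """Transform the value of a successful parse."""
    def parse(text: str, pos: int) -> Result:
        r = parser(text, pos)
        if r is None:
            return None
        value, new_pos = r
        return fn(value), new_pos
    return parse

# A tiny grammar built from the pieces above: a bracketed list of integers.
digit = choice(*[char(d) for d in "0123456789"])
number = mapped(seq(digit, many(digit)), lambda v: int(v[0] + "".join(v[1])))
rest = many(mapped(seq(char(","), number), lambda v: v[1]))
items = mapped(seq(number, rest), lambda v: [v[0]] + v[1])
int_list = mapped(seq(char("["), items, char("]")), lambda v: v[1])

parsed = int_list("[1,22,3]", 0)  # ([1, 22, 3], 8)
```

The grammar itself is the last five definitions; everything above it is the reusable "library" part, which is roughly the split the Dart JSON example makes as well.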
I figured porting this would be, if nothing else, a worthwhile exercise and a hands-on experience with Nim.
I'd like to hear your thoughts and opinions though, as to whether this could be (or needs to be) optimized using templates, macros and/or inline procs?
So the point is not to write a simple source-to-source port, but rather to attempt to compose the combinators at compile-time, if possible - yielding a grammar and parser approach that is at least as fast as (if not faster than) the current JSON parser, while hugely simplifying the implementation.
I read a remark somewhere here on the forum stating that the JSON parser did not stand up when benchmarked against JSON parsers in various other languages, so I wonder if there is room for general improvements in this area over the current approach.
I'd love to get your thoughts/input :-)
Hey!
I am a longtime Smalltalker and I even chatted with Lukas Renggli about PP and which one he felt I should use for the port. I have used PP in Smalltalk and love it. Lukas said the Java port is probably the one that is easiest to port to a statically typed lang like Nim - he is also updating it right now.
So I have started this, but I am doing a "straight port", mainly to learn more Nim and see if it can easily handle a codebase like PP.
I can share what I have so far, feel free to contact me (gokr) at #nim or on email, goran (at) 3dicc.com.
@gokr My PHP and TypeScript ports primarily referenced the Dart version, which I like better in some ways. However, I often cross-referenced the Java version, and I highly recommend cross-referencing both implementations if you're ever in doubt. Some aspects of either the Java or Dart version may be better suited for porting - it's not only a matter of language similarity; different constraints and different styles often come into play in different languages, so both may have valuable ideas or solutions to various problems :-)
@Varriount I glanced at Codetalker - I'm not fond of mixing languages, especially when the other language is C ;-) ... have you used PetitParser?
@mindplay Have you seen Marpa? It's incredibly powerful, it can parse any language that can be described in BNF. http://jeffreykegler.github.io/Marpa-web-site/
It wouldn't be nearly as easy to write a libmarpa wrapper though, and libmarpa is written in C.
It may also be interesting to consider Parboiled. Parboiled is a Scala library of parser combinators, and it should be similar to PetitParser in that it implements Packrat parsing for Parsing Expression Grammars.
Scala, like Nim, has a flexible syntax and good support for generics, so it may be relatively straightforward to port it. What's more, there are two versions: Parboiled 1 and Parboiled 2. The difference is exactly that in the second version the author has tried to take advantage of Scala macros (which did not exist when Parboiled 1 was written) to do as much of the parser construction as possible at compile time. This is an approach that may be fruitful for Nim as well, and being able to follow the evolution between the two versions may be a plus.
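For anyone unfamiliar with the Packrat idea mentioned above, here is a rough sketch of its core trick: cache each rule's result per input position, so that backtracking never re-parses the same thing twice. It is in Python for brevity, and it is not taken from Parboiled or PetitParser; `memoize`, `raw_number`, `plus` and `sum_expr` are hypothetical names, and keying the cache on `(text, pos)` is a simplification for the demo.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Result = Optional[Tuple[Any, int]]
Parser = Callable[[str, int], Result]

def memoize(parser: Parser) -> Parser:
    """Wrap a rule so each (text, pos) pair is parsed at most once."""
    cache: Dict[Tuple[str, int], Result] = {}
    def parse(text: str, pos: int) -> Result:
        key = (text, pos)
        if key not in cache:
            cache[key] = parser(text, pos)
        return cache[key]
    return parse

# Counter used only to show how often the underlying rule really runs.
calls = {"count": 0}

def raw_number(text: str, pos: int) -> Result:
    """Parse one or more ASCII digits into an int."""
    calls["count"] += 1
    end = pos
    while end < len(text) and text[end] in "0123456789":
        end += 1
    return (int(text[pos:end]), end) if end > pos else None

number = memoize(raw_number)

def plus(text: str, pos: int) -> Result:
    return ("+", pos + 1) if text.startswith("+", pos) else None

def sum_expr(text: str, pos: int) -> Result:
    """PEG-style ordered choice: sum <- number '+' number / number"""
    left = number(text, pos)
    if left is not None:
        op = plus(text, left[1])
        if op is not None:
            right = number(text, op[1])
            if right is not None:
                return left[0] + right[0], right[1]
    # First alternative failed: backtrack and try the second one.
    # The number at `pos` was already parsed, so this is a cache hit.
    return number(text, pos)

result = sum_expr("123", 0)          # (123, 3)
calls_after_first = calls["count"]   # 1: the fallback reused the cached result
```

Without the `memoize` wrapper, the fallback alternative would scan the digits a second time; with it, the cost of backtracking is a dictionary lookup, at the price of keeping the cache in memory.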