Dear @Araq,
Several months ago, I got bit by the programming language bug after trying to learn Nim.
I think Nim is one of the best advances we have made in the area of programming languages: such a simple and elegant syntax, yet a really powerful language with low-level capabilities. It was a big surprise to me to find out that a language need not be verbose and complicated like Rust to be a systems programming language.
This created an interest in me to learn about compilers, interpreters, and other language tools.
However, I do not have a CS (or Math) degree and have been an IT professional for only about 3 years, having come from the legal profession.
The reason why I am reaching out to you here is, first and foremost, because I do not know how to reach you privately.
Secondly, if you get time, could you give an outline of a roadmap of things I should learn in an incremental manner to reach your level of knowledge and proficiency?
At least, what would be the basic areas in which I need to gain solid understanding? Data structures? Algorithms? Computer Organisation? Math? Assembly programming? Type theory? Functional programming?
I'm sure, if you were advising your younger self, you would have advised to stay away from learning certain things that are actually irrelevant but taught in the area of creating programming languages.
Then, do I need to gain professional systems programming experience before I embark on a compiler engineer journey? And do I need professional experience in C or C++ or Assembly programming? I was primarily a web developer for most of my short IT career having decent exposure to JavaScript (and now a bit of Python). I have also learnt the very basics of C.
It would have been nice if there were a roadmap like roadmap.sh. However, I'm sure you wouldn't have time for that.
Do ADTs, pattern matching etc. make a world of difference (I've been seeing this a lot on forums)? Is Nim a suitable language to use as a bootstrapping language for a compiler or to write an interpreter? Many in the Programming Languages subreddit do not suggest Nim as a suitable option (they cite the lack of sum types and pattern matching in Nim) and instead suggest Haskell, OCaml, or Rust.
Of course, I would take your word above theirs given the fact that you have sufficient experience in actually making a programming language in this era which is used in production.
Sorry if this is inappropriate to raise in this forum; but as I mentioned, I do not know of any place to reach out to you privately.
Hi bajith,
I recently stumbled upon this: My First Language Frontend with LLVM Tutorial
Maybe that could serve as a starting point.
Hi sei,
Thanks for this. I had heard of this tutorial from the sub-reddit I mentioned above.
Actually, I find it easy now to get a hold of these resources. They do give me a "practical" exposure to language development.
However, I was asking @Araq to understand what kind of theoretical base I should build, or make sure I have, before I embark on language development. I've been advised by some to learn things as they come, given that many are like me and totally new to lang-dev.
Since @Araq has done that well, I wanted to get from him a roadmap that I can use, and also to avoid learning things that are actually not essential for lang-dev (I don't know whether he'll have time for my query though).
However, thank you so much for pointing me to the above resource; much appreciated.
Read this:
https://people.inf.ethz.ch/wirth/CompilerConstruction/CompilerConstruction1.pdf
https://people.inf.ethz.ch/wirth/CompilerConstruction/CompilerConstruction2.pdf
And then "Modern compiler implementation in Java/ML/C" by Andrew W. Appel and Jens Palsberg.
These are not very math heavy but you should be able to write a recursion, for example a recursive directory traversal. The primary job of a compiler is the elimination of recursions which is easier if you can use recursion within a compiler.
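As a rough illustration of the kind of recursion meant here, a recursive directory traversal could look like the sketch below. It is written in Python only because the original poster mentioned some exposure to it; the function name `walk` and the indented-listing output format are assumptions made for the example.

```python
import os

def walk(path, depth=0):
    """Return an indented listing of everything under `path`.

    Each subdirectory is handled by calling `walk` again, one level deeper.
    """
    lines = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        lines.append("  " * depth + name)
        # Recurse into real directories only; skip symlinks to avoid cycles.
        if os.path.isdir(full) and not os.path.islink(full):
            lines.extend(walk(full, depth + 1))
    return lines

# Example use: print("\n".join(walk(".")))
```

A compiler walks a syntax tree in much the same way: handle the current node, then call the same function on each child.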
Secondly, if you get time, could you give an outline of a roadmap of things I should learn in an incremental manner to reach your level of knowledge and proficiency?
Sorry, but you will not reach my level of knowledge by reading books. You need to experiment and learn from your mistakes. Lots of experiments. Lots of mistakes.
At least, what would be the basic areas in which I need to gain solid understanding? Data structures? Algorithms? Computer Organisation? Math? Assembly programming? Type theory? Functional programming?
All of these don't hurt but it's easy to get lost in them.
I'm sure, if you were advising your younger self, you would have advised to stay away from learning certain things that are actually irrelevant but taught in the area of creating programming languages.
Irrelevant: Lexer and parser generators.
Irrelevant: Code generator generators like Burg.
Harmful: OOP for compiler plus the visitor pattern.
Harmful: Haskell with its obsession of making mutable state inconvenient to use.
Harmful: Erlang for writing a compiler.
Harmful: Anything that uses dynamic typing.
Harmful: C. If you want to learn about "low level", learn an assembler. It doesn't distract with historical baggage like "C's 50 ways to offer undefined behavior" or "C's flawed operator precedence rules" or "C's header files and preprocessor".
Then, do I need to gain professional systems programming experience before I embark on a compiler engineer journey?
I wrote preprocessors and interpreters for domain specific languages before I wrote compilers and can recommend it. A preprocessor is much simpler than a compiler and can be inherently useful on its own.
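To give a feel for how small a preprocessor can be, here is a minimal sketch in Python. The `@define NAME value` directive and the `$NAME` placeholder syntax are invented for this example and do not come from any real tool.

```python
def preprocess(text):
    """Expand `$NAME` placeholders using earlier `@define NAME value` lines.

    Both the directive and the placeholder syntax are hypothetical.
    """
    macros = {}
    out = []
    for line in text.splitlines():
        if line.startswith("@define "):
            # A directive line defines a macro and produces no output.
            _, name, value = line.split(None, 2)
            macros[name] = value
            continue
        # Replace longer names first so `$PIE` is not mistaken for `$PI`.
        for name in sorted(macros, key=len, reverse=True):
            line = line.replace("$" + name, macros[name])
        out.append(line)
    return "\n".join(out)
```

There is no parsing, no type checking and no code generation here, only line-by-line text rewriting, which is why it makes a gentle first project.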
And do I need professional experience in C or C++ or Assembly programming? I was primarily a web developer for most of my short IT career having decent exposure to JavaScript (and now a bit of Python). I have also learnt the very basics of C.
C and C++ are more harmful than useful and you only need assembler if you want to write a compiler that produces assembler.
Do ADTs, pattern matching etc. make a world of difference (I've been seeing this a lot on forums)? Is Nim a suitable language to use as a bootstrapping language for a compiler or to write an interpreter? Many in the Programming Languages subreddit do not suggest Nim as a suitable option (they cite the lack of sum types and pattern matching in Nim) and instead suggest Haskell, OCaml, or Rust.
ADTs and pattern matching are awesome, so use a Nim macro library that offers them. Yes, use Nim to write a compiler, it works. Forget about OCaml, Haskell and Rust. If you want an alternative to Nim that offers sum types and pattern matching, use F#. It has the much nicer syntax.
Thank you so much @Araq! This means a lot to me.
If I may take your time for one last thing regarding this; is experience in systems programming advisable before one embarks on compiler/parser engineering?
I recommend you not make your dream language your first project. Instead, create a small and simple language as your first step. Perhaps a FORTH derivative. Learn from that. See where that learning takes you. Do you want to make a general purpose language? Or one to formalize legal contracts? Or one where you demonstrate something creative and new?
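To show how small such a first language can be, here is a minimal sketch of a FORTH-like stack interpreter in Python. It supports only integers and a handful of words; the word set and the function name `run_forth` are assumptions for the example, and it is far from a real FORTH (no word definitions, no control flow).

```python
def _binary(op):
    """Build a word that pops two values and pushes op(a, b)."""
    def word(stack):
        b = stack.pop()  # top of stack
        a = stack.pop()  # value below it
        stack.append(op(a, b))
    return word

def _dup(stack):
    stack.append(stack[-1])

def _swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]

def _drop(stack):
    stack.pop()

WORDS = {
    "+": _binary(lambda a, b: a + b),
    "-": _binary(lambda a, b: a - b),
    "*": _binary(lambda a, b: a * b),
    "dup": _dup,
    "swap": _swap,
    "drop": _drop,
}

def run_forth(source):
    """Run whitespace-separated tokens and return the final stack."""
    stack = []
    for token in source.split():
        if token in WORDS:
            WORDS[token](stack)
        else:
            stack.append(int(token))  # anything else must be an integer literal
    return stack
```

For instance, `run_forth("2 3 + 4 *")` leaves `[20]` on the stack. Adding user-defined words would be a natural next experiment.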
Definitely learn at least one assembly language (ARM has less legacy cruft than x86) because this is the basis of the interface between the computer and the higher level language. Modern technology like LLVM allows you to have an abstraction of the computer, but it is good to have detailed knowledge of real hardware.
My start to learning a little about programming languages was this. I started from a computer engineering degree; so I had knowledge of hardware. I set my goal: I wanted to program a microcontroller using Python. I knew it was unlikely that I could fit the entire language and its libraries in the small memory of a late 1990s 8-bit microcontroller. First step was to get "Hello World" to work. I wrote that program in Python and "disassembled" the program into Python's bytecodes. From that I started to write a tiny interpreter supporting only the bytecodes I needed. I created the datatypes in C that I would need to hold the Python datatypes. I created execution frames (what a procedure becomes when it comes time to run it). I held off on creating a memory management and garbage collection system until I absolutely needed it.
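The "disassemble Hello World" step can be reproduced today with the standard library's `dis` module, roughly as sketched below. The helper name `opcode_names` is made up for the example, and the exact opcodes listed differ between CPython versions, so no specific output is shown.

```python
import dis

def hello():
    print("Hello World")

def opcode_names(func):
    """Return the names of the bytecode instructions CPython compiled `func` into."""
    return [ins.opname for ins in dis.get_instructions(func)]

# Example use:
#   dis.dis(hello)              # human-readable listing
#   print(opcode_names(hello))  # just the opcode names
```

The list returned for `hello` is effectively the set of bytecodes a tiny interpreter would have to support first.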
I do NOT recommend you learn an interpreted language like Python. I just wanted to share the thinking of: start small, get that working, see what's next.
Sure.
My plan indeed is to start small. Learn and implement as many languages as possible from the popularly recommended books and tutorials. Then, implement my own little toy programming language. Only then would I think of my dream language.
If I may take your time for one last thing regarding this; is experience in systems programming advisable before one embarks on compiler/parser engineering?
I don't know what that means, what is "systems programming" for you? An operating system? A device driver? The TCP/IP layer? If so, no, it's not required, it's a different topic. Comparable to a 3D engine, it's its own thing, unrelated to compilers.
Yes, those are the things I had in mind.
Thank you @Araq for taking the time to share your insights.
Harmful: OOP for compiler plus the visitor pattern.
Why?