Merry Christmas and a happy new year to all of you!
I would like to show you my transpiler. It’s under active development but already quite capable.
https://github.com/kobi2187/transpiler
Like a good Israeli, I will start from the end ;-)
So the ideal product is to have a library from some input language ported automatically to Nim.
This transpiler will support only statically typed languages that are high level and fairly close to Nim.
I have XLang: an intermediate representation (IR) that is basically a tree of nodes. It is written in Nim as one large object variant, and it aims to be a superset, or union, of all the features and constructs of the input languages.
So I split the work into a few phases:
It starts with a parser written in the input language, which knows all the details of that language's AST and emits JSON that maps to the XLang constructs.
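To make the JSON-to-tree step concrete, here is a minimal sketch in Python (not the project's Nim code). The node kinds, the "kind"/"children" field names, and the sample input are all assumptions made up for illustration; the real schema is whatever the XLang type definitions specify.

```python
import json
from dataclasses import dataclass, field

# Hypothetical node kinds; the real set lives in the project's XLang definitions.
KNOWN_KINDS = {"Class", "Method", "Return", "IntLit"}

@dataclass
class Node:
    kind: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

def load_node(obj: dict) -> Node:
    """Recursively build a Node tree from a decoded JSON object."""
    kind = obj["kind"]
    if kind not in KNOWN_KINDS:
        raise ValueError(f"unknown node kind: {kind}")
    children = [load_node(c) for c in obj.get("children", [])]
    # Everything that is not structural is kept as a plain attribute.
    attrs = {k: v for k, v in obj.items() if k not in ("kind", "children")}
    return Node(kind, attrs, children)

# Made-up sample of what a front-end parser might emit.
SAMPLE = """
{"kind": "Class", "name": "Mixer", "children": [
  {"kind": "Method", "name": "GetGain", "children": [
    {"kind": "Return", "children": [{"kind": "IntLit", "value": 1}]}
  ]}
]}
"""

tree = load_node(json.loads(SAMPLE))
```

Rejecting unknown kinds at load time is one way to catch a front end that emits something the IR does not define, before any transformation runs.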
Currently the only mature and tested input language parser is C#, with some draft ones in the works.
The middle is more complicated.
Once the XLang JSON is loaded into memory with no issues, we ask the output language, Nim, which kinds of constructs it doesn’t know how to handle. Those kinds are a set of enum values.
Then we have a fixed-point transformer. The general idea is that it replaces the offending nodes with equivalent code that Nim understands (they’re still XLang nodes at this point). This repeats, querying the tree each time, until no offending nodes are found.
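The fixed-point idea can be sketched as follows, in Python rather than the project's Nim. The node kinds ("DoWhile", "Unless") and the two lowering rules are made-up examples, not the transpiler's actual rules. The point is that one lowering can introduce another unsupported node, so passes repeat until one of them changes nothing.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)

# Hypothetical: kinds the output backend reports it cannot emit.
UNSUPPORTED = {"DoWhile", "Unless"}

def lower(node):
    """Rewrite one unsupported node into nodes closer to what the backend accepts."""
    if node.kind == "DoWhile":
        # do body while cond  ==>  while true: body; unless cond: break
        body, cond = node.children
        guard = Node("Unless", [cond, Node("Break")])
        return Node("While", [Node("True"), Node("Block", [body, guard])])
    if node.kind == "Unless":
        # unless cond: body  ==>  if not cond: body
        cond, body = node.children
        return Node("If", [Node("Not", [cond]), body])
    return node

def rewrite_once(node):
    """One bottom-up pass over the tree; returns (new_node, changed)."""
    changed = False
    new_children = []
    for child in node.children:
        new_child, child_changed = rewrite_once(child)
        new_children.append(new_child)
        changed = changed or child_changed
    node = Node(node.kind, new_children)
    if node.kind in UNSUPPORTED:
        return lower(node), True
    return node, changed

def to_fixed_point(root, max_rounds=100):
    """Repeat passes until a pass changes nothing."""
    for _ in range(max_rounds):
        root, changed = rewrite_once(root)
        if not changed:
            return root
    raise RuntimeError("lowering did not converge")

def kinds(node):
    """Yield node kinds in pre-order."""
    yield node.kind
    for child in node.children:
        yield from kinds(child)

program = Node("DoWhile", [Node("Body"), Node("Cond")])
result = to_fixed_point(program)
```

Here the first pass lowers the DoWhile and creates an Unless node, the second pass lowers that Unless, and the third pass finds nothing left to do. The round limit is a safety net against rules that keep rewriting each other.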
That part is more complicated because we need some semantic analysis. For example, if we rename a function or a field to match Nim conventions, we also need to rename it wherever it is used (the call sites), so the declaration and its uses must refer to the same symbol. Every symbol has a UUID.
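A small Python sketch of why symbol identity helps with renaming (again an illustration, not the project's code). The Symbol/Decl/Call types and the to_camel_case helper are hypothetical. Nodes store only the symbol's id, so a rename happens once in the symbol table and every use sees it.

```python
import uuid
from dataclasses import dataclass

@dataclass
class Symbol:
    id: str    # stable identity, independent of the current name
    name: str

@dataclass
class Decl:
    symbol_id: str

@dataclass
class Call:
    symbol_id: str

symbols = {}  # symbol table: id -> Symbol

def declare(name):
    """Register a new symbol under a fresh UUID."""
    sym = Symbol(str(uuid.uuid4()), name)
    symbols[sym.id] = sym
    return sym

def to_camel_case(name):
    """Hypothetical convention fix: lower-case the first letter."""
    return name[:1].lower() + name[1:]

def name_of(node):
    """Resolve the current name of whatever symbol a node refers to."""
    return symbols[node.symbol_id].name

get_gain = declare("GetGain")
decl = Decl(get_gain.id)
call_sites = [Call(get_gain.id), Call(get_gain.id)]

# Rename once in the table; the declaration and call nodes are untouched.
symbols[get_gain.id].name = to_camel_case(symbols[get_gain.id].name)
```

After the rename, name_of gives the new name for the declaration and for both call sites, without searching the tree for matching strings.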
Currently we can also find and assign each node's parent. The analysis will probably need to grow more features as the need arises.
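Parent assignment is a single walk over the tree. The following Python sketch assumes a simple node type with a children list; the `enclosing` helper is a hypothetical example of a query that parent links make possible.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(eq=False)
class Node:
    kind: str
    children: list = field(default_factory=list)
    # repr=False avoids infinite recursion when printing (parent <-> child).
    parent: Optional["Node"] = field(default=None, repr=False)

def assign_parents(node, parent=None):
    """Set the parent link on every node in the tree."""
    node.parent = parent
    for child in node.children:
        assign_parents(child, node)

def enclosing(node, kind):
    """Walk upward and return the nearest ancestor of the given kind, if any."""
    cur = node.parent
    while cur is not None and cur.kind != kind:
        cur = cur.parent
    return cur

ret = Node("Return")
method = Node("Method", [Node("Block", [ret])])
cls = Node("Class", [method])
assign_parents(cls)
```

With the links in place, a question like "which method contains this return statement?" becomes a short upward walk instead of a search from the root.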
The last phase is very simple: we convert XLang to Nim nodes. The existing Nim nodes were only available at compile time, for macros, so I “cloned” the structure into “my_nim_node.nim” so I can use it at runtime.
This is a single pass.
Lastly, astprinter turns the nimast tree into textual code.
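As a rough picture of what a tree-to-text printer does, here is a tiny Python sketch. The three node kinds and their fields are assumptions chosen for the example and do not mirror the real nimast structure; the output happens to be valid Nim.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str
    text: str = ""
    children: list = field(default_factory=list)

def render(node, indent=0):
    """Render a node as source text, indenting nested bodies by two spaces."""
    pad = "  " * indent
    if node.kind == "StmtList":
        return "\n".join(render(c, indent) for c in node.children)
    if node.kind == "ProcDef":
        body = render(node.children[0], indent + 1)
        return f"{pad}proc {node.text}() =\n{body}"
    if node.kind == "Command":
        return f"{pad}{node.text}"
    raise ValueError(f"cannot print node kind: {node.kind}")

tree = Node("ProcDef", "greet", [
    Node("StmtList", children=[
        Node("Command", 'echo "hello"'),
        Node("Command", 'echo "bye"'),
    ])
])

source = render(tree)
```

Because Nim is indentation sensitive, the printer has to carry the current nesting depth down the tree, which is what the `indent` parameter does here.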
How to use it:
The “transpiler/src/parsers/csharp/” path should have the C# parser code.
cd into that directory and run “dotnet build” or “dotnet build -c Release”.
Then you’ll have: bin/Debug (or Release)/net8.0/csharp-to-xlang
(the net8.0 part may differ, depending on the installed .NET version)
This binary can be run directly, for example: ./csharp-to-xlang some/path/NAudio
Just pass the directory as the argument.
It will produce xljs files alongside the cs ones.
That’s it. Now the transpilation:
cd to the transpiler directory.
nim c main
./main -d some/path/NAudio will put all the generated Nim files next to the cs ones, which is good for testing and comparing. There are also -j (to produce the nimjs files, for checking that it has everything) and -v (verbose).
./main some/path/NAudio will put all the files in the “transpiler_output” folder, so you have all the Nim files neatly organized.
I use the Nim conventions, so the output file names differ from the input file names. I try to make the output as idiomatic as possible, as long as that doesn’t cause problems.
The transpiler is still under heavy development, but I did want to give a gift at this time. It can already help a lot, and the first libraries I am porting are ICU4N (a native, pure implementation of ICU for .NET) and NAudio. They are being used as test samples. A native Nim ICU can be useful for typography, and may fit well in another project of mine (rui).
I hope you find it useful. Expanding the ecosystem with mature libs should now be easier.
Happy New Year, Kobi
Hello again.
In addition to C# input, we now have some basic support for the Go language.
Please try it and report what it couldn't handle. The Go front end is only half tested so far.
I didn't know about this type of language translator. It is very cool if you can automatically translate one language into another. If further developed, it is sort of the holy grail of programming languages. I assume the source and target languages must be somewhat similar.
As you already indicated:
This transpiler will support only statically typed languages that are high level and fairly close to Nim.
Maybe not something like C++, which has so many features (though that would be cool too), but a language with limited constructs.
Are there other generic translators? Can you abstract the translated languages into a definition (or pair of definitions) for each language set, or is stuff still hard-coded?
Then create an AST of some kind out of THAT.
I would imagine that, while probably expensive as hell for large libraries, you could map those language-specific features to some kind of IR no problem, because it's operating on the functions themselves. Like... Claude knows what a C++ function is doing, and if it knows the data types too it can probably just map to idiomatic Nim code and skip the IR entirely. Idk. Maybe.
I'm glad you appreciate it.
Yes, it's the proper thing, very generalized. Nothing is hardcoded, and there is no need for language pairs.
You write the parser in the input language, and it should output JSON with constructs and fields matching the kinds and fields from xlangtypes.nim (it's an object variant, XLangNode).
Recently I decided to add a syntax representation for these nodes as well, mainly for debugging, to see if the transformations are correct (so now it outputs a .xlang file as well).
C++ and low-level languages will not be supported. I think Rust can be supported. Go should work but needs polish. The Java parser is maturing. C# has the highest support since I know all (most?) of its quirks and nuances. I also want Python as an input language, but it should be Python with types, in other words, after some automated tools have added type annotations. I need to see this in practice and don't want to promise anything before I know the issues involved.
I would love input from the community: which language interests you, and which library would you like to see ported? I'm trying to focus on languages with large ecosystems.
Yes, AI changes the landscape, but it's non-deterministic. I think it's best to use AI to code deterministic tools, even very complicated and ambitious ones. It's a big enabler. If you can split the work into tiny libs, it can work well.
One very promising area, in my opinion, is the extremely underutilized field of constraint satisfaction problems (CSP). These solvers are amazing, but they're kind of inaccessible. AI can make tools that use them as an engine. That would be both deterministic and very powerful.