Here's something that was idling in my code attic for quite a while and finally got pushed past the "is it useful yet" line this weekend: a source code formatter for Nim!
Apart from a working package manager, I can't really think of a tool that I've missed more than this on a daily basis from other languages, so without further ado: https://github.com/arnetheduck/nph/.
I intend to never ever format Nim code manually again - it is an utterly useless waste of time.
nph takes the opinionated route of discarding existing formatting and normalizing all code to a single, beautiful style, loosely based on existing formatters for other languages such as `black` and `prettier`.
The formatter parses the input using a somewhat customized parser based on the one that ships with the compiler, creates a somewhat customized AST that retains a bit more information about the original source code than a typical AST, then writes it back as best it can.
Before writing the changes to disk, the original compiler parser is called on both the unformatted and formatted code, comparing the two ASTs for semantic differences - if there is any difference, the original code is left untouched, thus fulfilling the first rule of formatters: thou shalt not break working code.
If everything goes well, the formatted code is written to disk and you can get on with your life :)
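As a rough illustration of that safety check (not nph's actual code), here is a minimal sketch. It assumes a toy stand-in for the parser: `toyAst`, `formatSafely`, `tidy` and `buggy` are hypothetical names invented for this example, and the "AST" is just the list of whitespace-separated tokens, whereas nph compares real compiler ASTs.

```nim
import std/strutils

# Hypothetical stand-in for a real parser: the "AST" is simply the list of
# whitespace-separated tokens, so layout-only changes compare equal.
proc toyAst(src: string): seq[string] =
  src.splitWhitespace()

# Returns the text that should end up on disk: the formatted code if it has
# the same toy AST as the input, otherwise the untouched original.
proc formatSafely(original: string; format: proc(s: string): string): string =
  let formatted = format(original)
  if toyAst(original) == toyAst(formatted): formatted else: original

# A layout-only "formatter" that normalizes spacing...
proc tidy(s: string): string = s.splitWhitespace().join(" ")

# ...and a deliberately broken one that drops a token.
proc buggy(s: string): string = s.replace("+", "")

echo formatSafely("let  x =   1 + 2", tidy)  # formatted text is accepted
echo formatSafely("let  x =   1 + 2", buggy) # original text is kept
```

The point of the pattern is that a formatter bug can, at worst, leave a file unformatted; it cannot silently change what the code means.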
To get an idea of what the format looks like, here's a typical proc definition - everything fits on one line, nice!
proc covers(entry: AttestationEntry; bits: CommitteeValidatorsBits): bool =
  ...
If we add more arguments, it starts getting long - nph will try with a version where the arguments sit on a line of their own:
proc addAttestation(
    entry: var AttestationEntry; attestation: Attestation; signature: CookedSig
): bool =
  ...
The return type was given its own line, which makes it easy to pick out as the function definition grows - it stays aligned with `proc`, however, so you can find arguments, return type and actual implementation at a glance.
Some procedures take even more space - especially when using descriptive names, libraries like `Result` or macros like `async` - such a signature provides a lot of useful information to both the compiler and the programmer, and nph will break it down for you to make it easy to grok:
proc validateBlsToExecutionChange*(
    pool: ValidatorChangePool;
    batchCrypto: ref BatchCrypto;
    signed_address_change: SignedBLSToExecutionChange;
    wallEpoch: Epoch
): Future[Result[void, ValidationError]] {.async.} =
  ...
The above idea extends to most formatting: if something is simple, format it in a simple way - if not, use a bit of style to break down what's going on into more easily consumable pieces - here's a function call:
let res =
  check_bls_to_execution_change(
    pool.dag.cfg.genesisFork,
    forkyState.data,
    signed_address_change,
    {skipBlsValidation}
  )
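The "simple things stay simple, long things get broken down" behaviour in the proc examples above could be sketched roughly as follows. This is an illustration only, not nph's implementation: `layoutSignature` is a hypothetical helper, and the 88-column limit and 4-space parameter indent are assumptions made for the sketch.

```nim
import std/strutils

const maxWidth = 88 # assumed line-length limit for this sketch

# Hypothetical helper: lay out a proc signature using the most compact of
# three layouts that fits within maxWidth.
proc layoutSignature(name: string; params: seq[string]; ret: string): string =
  # 1. Everything on one line.
  let oneLine = "proc " & name & "(" & params.join("; ") & "): " & ret & " ="
  if oneLine.len <= maxWidth:
    return oneLine
  # 2. All parameters together on a single indented line.
  let paramLine = "    " & params.join("; ")
  if paramLine.len <= maxWidth:
    return "proc " & name & "(\n" & paramLine & "\n): " & ret & " ="
  # 3. One parameter per line, no separator after the last one.
  result = "proc " & name & "(\n"
  for i, p in params:
    let sep = if i < params.high: ";" else: ""
    result.add("    " & p & sep & "\n")
  result.add("): " & ret & " =")

echo layoutSignature(
  "addAttestation",
  @["entry: var AttestationEntry", "attestation: Attestation", "signature: CookedSig"],
  "bool"
)
```

Trying the layouts from most to least compact means short signatures are never expanded needlessly, while the return type always ends up on a line that starts in the same column as `proc`.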
nph is formatted with itself and comes with a bunch of before/after examples used for testing, which certainly could be extended. Some of the tests cover cases where there's still room for improvement (of which there is plenty) - comments in particular are tricky and might get moved to slightly unexpected places if they weren't placed in a location recognised by the nph heuristics.
Here's the Nim compiler, formatted with nph: https://github.com/arnetheduck/Nim/commit/75bdacdd4a3f829c0f1fe149673c2afd6add55a2
The README has a section on frequently asked questions that nobody yet has asked as well as all information you could possibly want in terms of how to run it - have a look and let me know what you think!
P.S. the name nph is a last-minute change after yardanico alerted me that nimph, the original name I had used, was taken and by nothing less than a package manager! I knew I should have published it when I started - now, I'm taking naming suggestions.
First up: Fire! I intend to use that basically immediately on the package I'm currently working on, since I'm particularly curious how it will deal with macros etc. Particularly long lines because of long variable names or the like are something I tend to be guilty of.
Second up: For name suggestions, here's a list of what flew through my mind:
> now, I'm taking naming suggestions
IMHO "nimphomaniac" or "nimpho" or "nimfo" sound good and are easy to remember as "nim formatter."
This is cool!
I don’t think nph is a bad name, but since you asked for suggestions and you took inspiration from black, what about “yellow” (as in nim’s crown color). Another option related to prettier could be “nimmier”.
Very cool!
Would auto variable renaming like what nimfmt has be considered semantic refactoring? I think a feature like that would stop people complaining about style insensitivity and also allow quick fixing of StyleCheck hints.
I'd suggest nimfmt, but it's a bit boring and maybe too official-sounding. nimfo might get my vote :-P
Regarding the opinionated-ness, I see some occasional formatting in the compiler example that comes across as weird: parameters separated by `;`, short import statements like `import std/assertions` split across two lines. If you're interested in feedback, maybe those two are not desired...
> installation via nimble
https://github.com/nim-lang/nimble/issues/1166
It works if you clone and run nimble locally like the instructions say.
> parameters separated by ;
https://github.com/arnetheduck/nph#whats-with-the-semicolons
> If you're interested in feedback,
https://github.com/arnetheduck/nph#do-you-accept-style-suggestions-and-changes
> Would auto variable renaming
Yes, it needs access to type and other semantic information in order to rename the right variable - the relevant style rule is that usages should match the declaration, but different declarations can be .. different.
Awesome!
I recently filled out the Nim survey, and one of the top things I found missing from the Nim tooling ecosystem is a good formatter (like `black` for Python). Really looking forward to using this.
If anyone has set this up in VS Code, let us know how. Would we need to update saem's extension to support nph?
I know some people have expressed different preferences, but just to illustrate another perspective: I think at the module level it's preferable to put what follows `import`, `var`, `const` and `type` on its own indented line, even if the whole structure could be formatted as a single line. That's because I interpret the `type`, `const`, etc. at the module level as "Hey, I'm declaring a new module section of vars|types|consts here", and not "I'm declaring a type here, a var there, etc". Again, I'm talking about the module level here.
I think this is prettier and easier to parse visually:
import
  std/strformat,
  pointmath
type
  Point = distinct int
var
  x = 1.Point
  y = 2.Point
const
  prettyFormat = true
proc `+`(x: Point; y: Point): Point {.borrow.}
proc `$`(x: Point): string {.borrow.}
echo &"Add two points: {1 + 2}"
than this:
import std/strformat, pointmath
type Point = distinct int
var
  x = 1.Point
  y = 2.Point
const prettyFormat = true
proc `+`(x: Point; y: Point): Point {.borrow.}
proc `$`(x: Point): string {.borrow.}
echo &"Add two points: {1 + 2}"
Thoughts?
I agree, except I like having imports completely separate for easy diffing (trailing commas are not supported for imports) and to have my nvim-summon tool work properly.
import ./pointmath
import std/strformat
I think in this case, it would be cool if the tool were smart enough to reason: "Ha, I'm already seeing two different import lines here as opposed to being comma separated, maybe this user wants it this way, therefore I won't change it".
No matter how one slices it, there's no single style which can accommodate every scenario, so the best we can do to accommodate more situations is to have responsive/dynamic styling which follows some general guiding principles.
nph does not re-group imports - if you have an import keyword per module, that stays.
nph merely splits a list of modules for a single import across one or more lines similar to the function parameter example in the initial post:
See https://github.com/status-im/nimbus-eth2/commit/4f8947f01e79ff6c7fc2d8c177cea80558a85d6e#diff-2da3b9fdbda727ce1436ef3bb3869eebb5b035f0ba8ee1a21d1067cba77c3d72R10 for a long-form example.
All lists-of-stuff work the same.
Does it work like cargo fmt for imports?
If they are on adjacent lines, they are sorted alphabetically. If there is a blank line in between, does it create a separate group for sorting?
I use this in Rust projects to differentiate between crate imports, std/core imports and dependency imports.
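To make the question concrete, the grouping behaviour described here (sort adjacent lines, treat blank lines as group boundaries) could be sketched like this. It is only an illustration of the idea being asked about, not a statement of what nph does; `sortWithinGroups` is a hypothetical helper.

```nim
import std/[algorithm, strutils]

# Hypothetical helper: sort lines alphabetically within each run of adjacent
# non-blank lines; blank lines stay in place and act as group separators.
proc sortWithinGroups(src: string): string =
  var output: seq[string]
  var group: seq[string]
  for line in src.splitLines():
    if line.strip().len == 0:
      # A blank line closes the current group.
      output.add(sorted(group))
      output.add(line)
      group.setLen(0)
    else:
      group.add(line)
  # Flush the final group.
  output.add(sorted(group))
  output.join("\n")

echo sortWithinGroups(
  "import std/strformat\nimport ./pointmath\n\nimport std/os\nimport chronos"
)
```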