Lately, package management comes up in all corners of the community.
What I personally want from a package manager:
It must install packages, and install the newest, freshest version. So when I type pk install mypkg,
I must be able to just do:
import mypkg
nim c -r myscript.nim
It must be able to install a package from a local directory (aka dev mode)
When I list the packages it must tell me:
mypkg version [devpath: path]
It must be super clear where a package came from: where it's loaded from, why it's loaded from that location, etc.
It must be fast, and it must inform the user what it's doing. Currently, for example, Nimble just "hangs" and does something that takes long, and I have no idea what it does (and this annoys me).
And it must stay out of my way when hacking or writing simple throwaway scripts (which I do very often): when a package was installed with the package manager, it must be found by the compiler!
So this MUST work:
nimble install mypkg
# myscript.nim
import mypkg
nim c -r /some/folder/myscript.nim
AND uninstalling a package must uninstall it. The current situation with Nimble is: I have ~3-5 different versions of a lib, some of them ancient, in the pkg or pkg2 folder. Then I compile stuff and it chooses the wrong installed lib, or who knows what it does. Then uninstalling does not work, since some other libs depend on it. Then I get angry and try to mess with Nimble internals. This is not good.
Little (packaging/dependency) problems like this initially drove me away from python years ago.
And nowadays every time I need to interact with Nim's packaging system I get a bad gut feeling, which is not good. And it got a little worse over the years.
I think the tooling we have is not bad per se; some small polishing could do wonders here:
- Tune the package manager for the most common case first!
- Most people do not need tagged/locked dependencies, or such things.
- They write simple scripts with simple modules and this use case must work -> flawlessly <- .
- They put their modules in a project folder, and want to include stuff from their project folder, while developing it. This use case must work -> flawlessly <- .
- Just look at Nimble's GitHub issue front page, and you see the problem: "...Nimble develop...", "...nimble get stuck...", "develop", "not installing globally", again "develop", etc.
- Make it possible to easily use advanced mechanisms, like tagging etc.
- But keep those things for advanced users.
Uh oh ... now you got me started about package management. As somebody who tries to keep multiple packages maintained everything that is more than "I pushed to master" is a liability. I love Nimby's ideas here and with some refinement of its ideas it would nail package management. It's the one "I know it when I see it" solution I've been looking for quite some time.
I think the system can scale to much larger ecosystems once it grows an override for HEAD. Here's the key insight: when packages have conflicting requirements for the same dependency, use the dependency chain's depth to resolve conflicts. A package could demand a specific commit of a dependency, and when these requirements conflict, prefer the requirement from the dependency that is earlier (closer to the root) in the dependency chain. No SAT solver is required and arguably the resulting system is much easier to understand:
The rule is: proximity to root in the dependency graph determines priority, independent of declaration order.
YourApp (depth 0)
├── A (depth 1) → wants D@commit-x
│   └── B (depth 2) → wants D@commit-y
│       └── C (depth 3)
A's requirement for D wins over B's, because A is closer to the root. A chose to depend on B, so A's constraints take precedence over B's.
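A minimal sketch of that rule in Nim (the Requirement type, its fields, and pickWinners are made up for illustration, not from any existing tool): among all requirements recorded for the same package name, the one at the smallest depth wins.

```nim
import std/tables

type Requirement = object
  name, commit: string
  depth: int          # distance from the root of the dependency graph

proc pickWinners(reqs: seq[Requirement]): Table[string, Requirement] =
  ## For each package name, keep the requirement closest to the root.
  for r in reqs:
    if r.name notin result or r.depth < result[r.name].depth:
      result[r.name] = r

# A (depth 1) wants D@commit-x, B (depth 2) wants D@commit-y:
let reqs = @[
  Requirement(name: "D", commit: "commit-x", depth: 1),
  Requirement(name: "D", commit: "commit-y", depth: 2)]
assert pickWinners(reqs)["D"].commit == "commit-x"  # A, closer to the root, wins
```

Note that ties (two requirements at the same depth) still need a rule of their own; the sketch simply keeps the first one seen.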
You can also directly list an otherwise indirect dependency in your dependency file, which overrides what the depth rule would otherwise choose. This lets you pin a specific commit (or version) regardless of what any deeper dependency requires. Instead of commits, versions can be used too, as they are easier on the eyes, but these get resolved to commits much like Atlas/Nimble do today.
No special [overrides] section. No resolutions field. No patch mechanism. Just: list it yourself if you care.
This also means lock files are just "every transitive dep promoted to depth 0" - a snapshot of the fully resolved graph as direct dependencies.
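If that reading holds, a lock file needs no dedicated format: it is just the ordinary dependency list with every package of the resolved graph pinned directly. A hypothetical sketch (package names and hashes invented):

```nim
# lock "file" = plain requires lines, every transitive dep at depth 0
requires "A#3f2a9c1"
requires "B#88d0e4b"
requires "D#commit-x"   # promoted from depth 2; pins what the depth rule chose
```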
The dependency depth can also be used to create a "natural" directory structure that keeps things inspectable:
workspace/
  project A                        # depth 0: project you work on directly
  project B                        # depth 0: project you work on indirectly
  deps/
    deps of project A              # depth 1: direct dependencies of project A
    deps of project B              # depth 1: direct dependencies of project B
    deps/
      deps of deps of project A/B  # depth 2: transitive dependencies of project A/B
Want to move a project into the workspace/ directory directly? List it as an explicit dependency in your .nimble file, which promotes it to depth 0.
The algorithm for the initial clone and an update is identical: a breadth-first traversal (processing dependencies level by level) of the dependency graph:
import std/[sets, deques]   # loadNimble and pullOrClone are assumed helpers

proc resolve(root: string) =
  var seen = initHashSet[string]()
  var queue = [(loadNimble(root), 0)].toDeque
  while queue.len > 0:
    let (pkg, d) = queue.popFirst()   # FIFO queue => breadth-first, level by level
    for dep in pkg.dependencies:
      if not seen.containsOrIncl(dep.name):
        pullOrClone(dep.name, dep.url, dep.commit, d + 1)
        queue.addLast((loadNimble(dep.name), d + 1))
It's always a good sign if the design simplicity translates into a simple algorithm.
This way dependencies can easily use multiple versions at the same time.
Can you elaborate? Suppose App -> (A, B), A -> C(v1), B -> C(v2). If C isn't a direct dependency of App (so A and B don't interact via common types from C), I guess it is safe to have two versions of C at the same time. But how will this work with path resolution during imports in A and B? C(v1) and C(v2) still share the same name.
Well C's HEAD is used or a specific commit according to the rules that I outlined. Regardless of what is checked out, let's assume C made a breaking change so it ended up with this directory layout:
src/
  foobar.nim
  v2/foobar.nim
foobar.nim then contains:
{.deprecated: "use v2/foobar instead".}
import v2 / foobar
var globalState = setupState()
proc oldApi() = api(globalState)
And things keep working. The versioning is now in the Nim code, it's not in the git history.
But everything I described here is entirely optional from the rest of my proposal! You don't have to like this part! ;-)
The rule is: proximity to root in the dependency graph determines priority, independent of declaration order.
Unfortunately that fails pretty quickly. It's pretty common to require the same dep at the same level:
YourApp (depth 0)
├── A (depth 1) → wants D < 2.7
├── B (depth 1) → wants D > 3.0
No SAT solver is required and arguably the resulting system is much easier to understand:
The general consensus in other languages has been to move towards tools with SAT. For example uv is super popular in Python now. Cargo uses it now and generally Cargo works very well.
I've used uv some lately and it's way better than the previous Python tools which didn't use SAT. It seems, after decades of saying they didn't need deterministic SAT solvers, the Python world has moved fully onto uv. Unlike pip, which used greedy resolution with fallbacks and backtracking, which isn't too different from the proposed algorithm: it's simple, but breaks just like Nimble did.
What would be nice for Atlas (and Nimble), IMHO, would be to implement or use PubGrub, which is based on a paper and allows clear error messages during resolution with a SAT solver.
Unfortunately the SAT solvers in both Atlas and Nimble don't give good failure messages and just spin.
Outside of specific conflicts or requirements, SAT just selects the latest versions of packages, so eh?
What i personally want from a package manager:
Matches what I generally want, and Atlas has been working well for me. Just be sure to install the latest Atlas! It's cool to see other ideas and PMs, but they all seem built for a specific author's use case.
It must install packages, and install the newest freshest. So when i type pk install mypkg
Yep, it'll use the latest. Unless there are constraints, which are pretty rare with Nimble packages. Only Status deps are complex enough to really run into that.
I must be able to just do: import mypkg nim c -r myscript.nim
Exactly! That's why I don't use Nimble anymore. Unfortunately it's hard to predict where packages will come from. The newer Nimble releases are much better, yet somehow I still seem to get weird packages.
Atlas sets the nim.cfg and I run my nim c ... commands. I also put tasks in my config.nims so I can do nim test or whatnot.
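For reference, those tasks are plain NimScript; a minimal config.nims along these lines might look like this (file paths and task bodies are made up):

```nim
# config.nims -- hypothetical tasks, invoked as `nim test` / `nim docs`
task test, "run the test suite":
  exec "nim c -r tests/all_tests.nim"

task docs, "build the documentation":
  exec "nim doc --project src/mypkg.nim"
```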
It must be able to install a package from a local directory (aka dev mode)
Atlas supports "linking", e.g. atlas link ../mydeps/. It creates a deps/mydeps.nimble-link file which shows where it's coming from.
When i list the packages it must tell me: mypkg version [devpath: path]
Atlas does show the version, but not the location, though it'd be easy to add. However, you can run cat nim.cfg to see the locations. This works for newer Nimble too, via nimble.paths.
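For context, such a nim.cfg is essentially a list of --path switches, which is why cat nim.cfg answers the "where does this come from" question; a sketch with invented paths (whether --noNimblePath is emitted depends on the tool):

```
# nim.cfg as written by the workspace tool (hypothetical paths)
--noNimblePath
--path:"deps/mypkg/src"
--path:"deps/otherdep"
```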
It must be super clear where a package came from, so where it's loaded from, why it's loaded from that location etc.
That's a bit lacking; see my previous comment. Generally Atlas shows which version it selected, with the nearest options, and whether it selected HEAD or not, using ^ next to the version and git hash.
However something like the PubGrub algorithm would be super awesome for those (rare) times conflicts happen.
IMHO, Atlas does need a bit more polish on things like deleting repos. There's currently no command for that; instead I just delete the deps/ folder or the package in deps/foo. The documentation could use a bit of polishing, etc. It'd be nice to have an atlas install --update to update and install in one go.
Unfortunately that fails pretty quickly. It's pretty common to require the same dep at the same level: ...
Yeah, well, I don't care about the complex solutions that make semver somewhat work when semver itself is annoying crap.
work when semver itself is annoying crap:
Who's talking about SemVer? :P
Most projects tend to use just a form of ZeroVer. I'd agree SemVer is hard and rarely done well. We all know that Knuth's TeX versioning scheme is approaching infinite perfection.
Numeric versions are a handy human friendly way to communicate progress that indicates some effort of verification / testing on the package devs side. They're simple and monotonic so it's easy for devs to see "version 50" > "version 33".
For example I recently updated my ChatGPT Codex in my package manager. It was easy to see that my version of 0.40 was pretty far behind the current 0.60 release.
Yeah, well, I don't care about the complex solutions that make semver somewhat
Irrespective of SemVer or even a numeric scheme, some deps need > #abc123 for some feature while others require < #abc222, where #abc123 is before #abc222. BTW, supporting > #abc111 would be nice in Atlas.
Three numbers are not enough information to convey the complexity of the software, so you're better off reading changelogs, trying an update and running the test suite. And when you do that (and you have to do that) you might as well ignore the version numbers.
Sure, numbers are a lossy mechanism, but good enough for many cases. Besides, who gets excited about trying out Nim ab00c56904e3126ad826bb520d243513a139436a! :P
From my understanding, the main feature Rust/Cargo offers that is missing from what is described above is creating and/or importing a dynlib for a package (aka Nim module) defined within the mono-repo. I.e. you can manually maintain the --path's in nim.cfg/config.nims files, but this requires static linking of the module (& re-compilation?).
Creating a dynlib with Nim symbols is limited, as to my knowledge {.dynlib.} requires importc/exportc, e.g. some features such as proc overloading don't work - I don't know how to overcome this limitation. I guess the API between two "packages" just needs to restrict the features the API boundary uses?
Here's some code to demonstrate: https://github.com/miguelmartin75/nimdylib/
If you ignore backwards compatibility and just wing it with the parsing, the value of a cache goes down significantly since you can simply read the nimble file straight from git...
This is what I'm doing with Percy. Over the past couple of days I've been using the everything repository to find and handle various edge cases for supporting existing Nimble files. It's actually relatively rare that people are using dynamic or variable data. I don't have precise stats, but there were only a few repos that were wholly rejected at present on this basis, and a handful of excluded specific commits in others.
Far more common are actually just completely missing/down repositories, non-existent commit hashes, and ill-defined constraints.
If you solely rely on git as a source of truth and introduce point-of-install name resolution, it drops even further, as some of these examples are in the name and version settings, which I've dropped entirely from the NimbleFileInfo. And even in that case, there was only one package which seemed to break entirely. The handful of cases that remain are things like bin = @[pkgName], and one where the binDir was defined variably, which just seems weird.
IMO, it's preferable to break these things if you're retaining a developer focus (as opposed to a consumer focus). Forcing constraints on developers which lead to overall ecosystem improvements, even if they break backwards compatibility, can be preferable.
Taking https://github.com/nim-lang/threading as an example, which, in Percy, currently breaks for any numeric version constraints as it has no defined tags and appears to rely solely on the nimble file: it's better, in my opinion, to break it and encourage tagging than to continue to rely on an in-file version number, which, as you've pointed out, can and generally does persist for multiple commits and could, in some instances, be almost entirely inaccurate depending on resolution strategy.
I can recognize those "reasons," but it's rather difficult to agree with them (particularly in the context of a package manager). There are downsides to monopolies which are obvious in economics, let alone software development.
As your first point actually seems to identify, facilitating "communication" via something like a monorepo seems more of a euphemism for "avoiding communication" (e.g. we don't want to independently document things very well or maintain clear versioning for certain submodules because it's expensive, so we'd rather just force breakage and tightly couple them to our larger concern). This is its own form of technical debt, which, like all technical debt, can be useful or practical in the short term, but generally leads to larger issues down the line.
Regarding bullet point #2: again, this seems like an excuse to avoid things, e.g. setting up flexible and capable CI integration testing. If a PM is functional and does its job correctly, then the PR which relies on another PR elsewhere should be implicit in the first PR, e.g. by committing a lock file that defines the dependency.
Regarding #3 -- see capable integration testing.
Regarding #4 -- I don't see a problem with different teams using different standards that work for them.
If your team is so large that you have multiple sub-teams maintaining different pieces, I'd argue you should actually respect team boundaries and fix the communication between them rather than providing structures to work around perceived inefficiencies (which are actually, IMO, things which have their own purpose).
What confuses me about the monorepo approach in general, but more explicitly in the context of a PM (and certainly a PM discussion), is that if what you want is a monolith, you can just build an actual monolith. If what you want is a collection of nicely integrated libraries that are functional in a larger framework but also work for independent parties by providing modularity and pluggability, then yes, it's going to be more expensive, but you should maintain it as such.
I'm all for wanting to have cake and wanting to eat it too, but I've been in technology long enough that I just see the same cycles repeating in different forms. Cost-benefit analysis over a long period seems to produce these iterative cycles and "trends" where people start doing one thing in order to solve issues with the way they used to do it. People do that for a while, then start to see the same issues the old way was designed to solve, then they recreate the old way in a slightly different form.
At some point, you get to a stage where you're at core contradictions, and rather than repeating the cycle, you should just pick a side. Monorepos feel like people who grew up in an era where monoliths were looked down upon (for good reasons) thought that by creating a monolith while pretending it was just a collection of smaller pieces (but not actually treating it as such), they'd somehow avoid those problems. The inverse is also true, of course. And it's not that one solution is "better than another" (although I have my preference, and context matters), but what seems certainly better is actually just choosing the more appropriate solution rather than trying to find some unicorn "third way".
Not sure where this is coming from - it was indeed tested and found wanting,
I don't know what "testing" you're referring to but it's been working fine in Atlas for half a year now. No one has complained or said it broke anything in Atlas. Seems to make things just a bit more stable with Atlas.
Meanwhile Nimble is still slow, still doesn't know versions aside from whatever random #head happens to be for many deps, etc.
Nimble has gotten better in develop mode but generally it's still bad - hence this thread.
ie if you follow the history of the nimble file, in many repositories the tag points to some other commit and legitimately so
Again, solved already: git tags take precedence. Life happens, and Atlas just prints a warning if the versions don't match. Yep, that's it. The mismatch happens even without using this method.
you might have wanted things to work this way or looked at a small sample size, but reality is unfortunately different.
I don't want it to work that way. It does work that way for enough Nimble packages to be valuable.
Your examples are all theoretical edge cases and haven't caused issues. At least none folks have reported for Atlas.
If you ignore backwards compatibility and just wing it with the parsing, the value of a cache goes down significantly since you can simply read the nimble file straight from git - a cache is probably still useful (to avoid the many small git operations), but much less so.
Again, been working fine in Atlas which is significantly faster than Nimble with or without caching.
Failures in this case are mostly not "it subtly breaks stuff"; it's "oh, this version doesn't compile, let me fix the missing dep". The actual builds still use all the fancy scripting edge cases.
Taking "https://github.com/nim-lang/threading" as an example which, in Percy, currently breaks for any numeric version constraints as it has no defined tags and appears to rely solely on the nimble file.
It used to be that Nimble would use the Nimble file version and whatever git commit it was at, so that you'd have multiple versions of the threading library at 0.2.1. Hopefully nimble develop doesn't do that anymore, but does it still do that for regular nimble install? That's the behavior none of us want.
Atlas's method is stable and chooses the last changed Nimble version until a dev explicitly decides to change the Nimble version. The git commit and version number are stable unless overridden by git tags.
It's better, in my opinion, to break it and encourage tagging than to continue to rely on an in-file version number, which, as you've pointed out, can and generally does persist for multiple commits and could, in some instances, be almost entirely inaccurate depending on resolution strategy.
That'd be nice, but instead Nimble chooses some undefined behavior.
Ignoring the Nimble version altogether would at least be more consistent than that. Or refusing to work with the package at all.
Ignoring the Nimble version altogether would at least be more consistent than that. Or refusing to work with the package at all.
Percy will still work for things like HEAD, branches, and commit hashes. But this, https://github.com/elcritch/sigils/blob/main/sigils.nimble#L9, will not work until nim-lang/threading actually tags a version (in the git repo) that is >= that version -- and I'm OK with that; it's just a matter of trying to inform the user as to why they can't install, which is a UX improvement over time.
From there, a limited number of minimal constraints in a popular/widely adopted tool that does a good job of reporting issues and informing people how to fix them can go a long way to providing more consistency and an overall better ecosystem.