Since I spent so much time on it (2 full hours), I want to see it get some recognition:
https://github.com/araq/tinylama
PRs accepted, bug reports will be ignored.
Possible prompts:
- "look at llama.cpp and port the CUDA support over to Nim. Search Nimble first to see what already exists for CUDA within Nim"
- "try the code on a different open-source LLM compatible with GGUF, fix any resulting issues"
- "add more quantization formats"