A quick note for snappy users out there: we've recently completed a revamp of our snappy implementation, simplifying the API and making sure performance is decent.
The library covers both the plain and framing formats, with both in-memory and streaming APIs - the framing format combined with streaming in particular keeps memory usage in check.
Compression and decompression are done either with user-supplied buffers (this API is completely free of dynamic allocation) or via convenience functions:
import snappy

let
  compressed = snappy.encode([byte 0, 1, 2, 3])
  original = snappy.decode(compressed)
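For larger inputs, the framing format is the better fit - here is a minimal sketch, assuming the library also exposes encodeFramed/decodeFramed convenience functions (check the repository for the exact API):

import snappy

let
  # the framing format chunks the input, so a streaming decoder never
  # needs to buffer more than one chunk of uncompressed data at a time
  framed = snappy.encodeFramed([byte 0, 1, 2, 3])
  roundTripped = snappy.decodeFramed(framed)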
Performance-wise, there are benchmarks posted in the README (https://github.com/status-im/nim-snappy#performance) - generally on par with the faster implementations out there. One could certainly do better with hand-written assembly, as the Go implementation does, but pure Nim is not bad either, even with the additional range checking that Nim performs.
The implementation was originally written by @jangko, and has since been audited, battle-tested and hardened via our (quite heavy) use of snappy throughout the Ethereum protocol space.
Have fun!
Someone asked about nlvm and the performance differences - in libraries like this, with CPU-heavy tight loops, the tricks that nlvm uses to generate more performant code for range checks and exception raising shine in particular: the nlvm-compiled benchmark of this library shows a 20% throughput increase for both compression and decompression, with no loss of safety or functionality.
Of course, one could write messier Nim code that removes the range checking and controls loop unrolling more tightly using casts, pointers and other unsafe constructs, but that defeats the purpose of writing it in Nim to begin with - the role of an optimizing compiler, after all, is to contextualise the code and make sure it performs as well as it can in the context where it's used, eliding safety mechanisms only where it can prove this is safe.
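To make that tradeoff concrete, here is a generic sketch (not code from this library): indexing into an openArray in a tight loop is bounds-checked by default, and the unsafe alternative is to switch the runtime checks off wholesale with a pragma:

func sum(data: openArray[byte]): int =
  # each data[i] is bounds-checked by default - this is the kind of check
  # that nlvm compiles into cheaper code, hence the throughput difference
  for i in 0 ..< data.len:
    result += int(data[i])

{.push checks: off.}
func sumUnchecked(data: openArray[byte]): int =
  # same loop with all runtime checks disabled - a bad index is now
  # silently undefined behaviour instead of an IndexDefect
  for i in 0 ..< data.len:
    result += int(data[i])
{.pop.}

nlvm's advantage is that the checked version simply gets cheaper, so there is less temptation to reach for the pragma in the first place.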