About a year ago, I wrote a tiny hashing routine that calls nimcrypto to hash a keystream for me, and I wanted to see if I could optimize it. I found an old routine that I'd played with using nimSHA2. You guys had told me that that package was fairly deprecated, but I came across something really interesting when I started messing with it. Here's the code snippet:
import nimSHA2, strutils, os
var istring = paramStr(1)
var iter = parseInt(paramStr(2))
var sha = initSHA[SHA512]()
while iter > 0:
  sha.update(istring)
  iter -= 1
echo (sha.final().hex)
When I ran that, it was way too fast, and I noticed that nothing actually happens to istring: it never gets updated in the while loop. However, when I run the thing, I get a different result if the second argument (var iter = parseInt(paramStr(2))) is a different number, and the same result if iter is the same number.
As I'm not a programmer and don't pretend to be, I hope one of you might be able to tell me why it's doing what it does.
Modern CPUs can process SHA-512 very quickly, which is why it looks so fast to you.
Most hashing libraries, including the Nim ones, are streaming (I think): each update call appends new data to the message being hashed. Since the accumulated data is different after every append, a different number of iterations produces a different hash.
In your code, initSHA is outside the while loop, so the input to the hash object just keeps getting longer instead of "deeper or correct". To change that, create a new hash object every time, or reset it, by moving var sha = initSHA[SHA512]() inside the loop, but yes, it will be much slower.
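To make the streaming behaviour concrete, here is a minimal sketch in Python rather than Nim, using the standard hashlib module (the helper names streamed_digest and one_shot_digest are made up for illustration). It assumes nimSHA2's update follows the usual streaming semantics of SHA-512 implementations: calling update N times with the same string should give the hash of that string repeated N times, so a different N gives a different digest.

```python
import hashlib


def streamed_digest(data: bytes, repeats: int) -> str:
    """Feed the same data into ONE hash object `repeats` times."""
    h = hashlib.sha512()
    for _ in range(repeats):
        h.update(data)  # appends to the message; does not restart the hash
    return h.hexdigest()


def one_shot_digest(data: bytes, repeats: int) -> str:
    """Hash the concatenation data + data + ... in a single call."""
    return hashlib.sha512(data * repeats).hexdigest()


if __name__ == "__main__":
    msg = b"hello"
    # Same repeat count: streaming and one-shot hashing agree.
    print(streamed_digest(msg, 3) == one_shot_digest(msg, 3))
    # Different repeat count: a different message, so a different digest.
    print(streamed_digest(msg, 3) == streamed_digest(msg, 4))
```

If the same holds for nimSHA2, the original loop with iter = 3 is effectively hashing istring & istring & istring, which would explain why the output depends only on istring and iter.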
istring isn't modified, and you're not printing it out at the end. You're printing out the hashsum instead. This is how all hashing is done, and how it has to be done. Consider getting a hashsum of a 9G file:
$ fallocate -l 9G 9G
$ sha512sum 9G
That takes almost 12s, but it uses only 3.5 MB of memory, because the input can be streamed into the sha512 object.
Thanks for the explanations. I think that up to this point I've been going about this wrong, as far as how I'm using this routine in my script. For example, the snippet below is from what I've actually been using (with nimcrypto), hashing the input ~1M times:
var a = sha_512.digest(istring)
while iter > 1:
  a = sha_512.digest($a)
  iter -= 1
If I increase the iterations for testing, setting iter to 10M, the above snippet completes (on my very average machine: 11th-gen i5, 32 GB RAM) in just over 8 s, while the first snippet from the original post takes under 1 s.
My thinking had always been to produce a hash, then hash that hash, and so on until iter winds down. But for my purposes, generating a keystream, it doesn't matter if the second hash is a hash of the first hash, only that each iteration is different than the one before it.
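A rough way to see where the time difference may come from, sketched in Python with hashlib (all helper names are hypothetical, and this is not a benchmark of nimcrypto or nimSHA2). SHA-512 works on 128-byte blocks, and its padding needs at least 17 extra bytes, so hashing a 128-character hex digest costs two block operations per iteration, while streaming a short string packs many iterations into each block. The sketch assumes the chained loop hashes the hex text of the previous digest, as a = sha_512.digest($a) appears to do.

```python
import hashlib

BLOCK_SIZE = 128   # SHA-512 block size in bytes
MIN_PADDING = 17   # one 0x80 byte plus a 16-byte length field


def sha512_blocks(message_len: int) -> int:
    """Number of 128-byte blocks SHA-512 compresses for a message of this length."""
    return (message_len + MIN_PADDING + BLOCK_SIZE - 1) // BLOCK_SIZE


def chained_digest(data: bytes, iterations: int) -> str:
    """Hash the input, then repeatedly hash the hex text of the previous digest."""
    digest = hashlib.sha512(data).hexdigest()
    for _ in range(iterations - 1):
        digest = hashlib.sha512(digest.encode("ascii")).hexdigest()
    return digest


def chained_blocks(input_len: int, iterations: int) -> int:
    """Blocks processed by the hash-of-hash loop (hex digests are 128 chars)."""
    hex_len = 128
    return sha512_blocks(input_len) + (iterations - 1) * sha512_blocks(hex_len)


def streamed_blocks(input_len: int, iterations: int) -> int:
    """Blocks processed when the same input is appended `iterations` times."""
    return sha512_blocks(input_len * iterations)


if __name__ == "__main__":
    n = 10_000_000
    print("chained :", chained_blocks(8, n))
    print("streamed:", streamed_blocks(8, n))
```

For an 8-byte input and 10M iterations, this counts about 20 million blocks for the chained loop against about 625 thousand for the streamed one. That is at least consistent with the chained version being several times slower, before even counting the cost of converting each digest to a string.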