Hi, I am working through Nim track in exercism. The task to solve is: "Given a number, find the sum of all the unique multiples of particular numbers up to but not including that number." I am curious about the performance of HashSet. Here is the summary of running times for a range of input numbers; the size of the HashSet[int] was approx 7_000_000 items. I compiled the code (see below) using Nim Compiler Version 1.2.0 [Linux: amd64] with -d:release and -d:danger flags. I expected O(log N) performance, but this is not what happened:
CPU Time [8350000] 0.531s
CPU Time [8400000] 0.444s
CPU Time [8450000] 0.795s
CPU Time [8500000] 1.828s
CPU Time [8550000] 3.533s
CPU Time [8600000] 5.789s
CPU Time [8650000] 8.634s
CPU Time [8700000] 12.119s
CPU Time [8750000] 16.176s
CPU Time [8800000] 20.857s
CPU Time [8850000] 26.108s
CPU Time [8900000] 32.016s
CPU Time [8950000] 38.528s
CPU Time [9000000] 45.560s
What is the reason for the decreasing performance?
## Given a number, find the sum of all the unique(!) multiples of particular numbers up to but not including that number.
## If we list all the natural numbers below 20 that are multiples of 3 or 5, we get 3, 5, 6, 9, 10, 12, 15, and 18.
## The sum of these multiples is 78.
import sets, intsets
import times,strutils
proc sumA*(x:int, numbers:seq[int]):int =
var multiples:HashSet[int]
#var multiples:IntSet
for n in numbers:
if n == 0:
continue
for m in countup(n, x-1, n):
if not multiples.containsOrIncl(m):
result += m
template benchmark(benchmarkName: string, code: untyped) =
block:
let t0 = epochTime()
code
let elapsed = epochTime() - t0
let elapsedStr = elapsed.formatFloat(format = ffDecimal, precision = 3)
echo "CPU Time [", benchmarkName, "] ", elapsedStr, "s"
when isMainModule:
for x in countup(8_350_000, 9_000_000, 50000):
benchmark $x:
discard sumA(x, @[2, 3, 5, 7, 11])
Similar test with Python 3.6 code yields more reasonable running times:
8350000 cpu= 1.9215092658996582
8400000 cpu= 1.8640367984771729
8450000 cpu= 1.8687024116516113
8500000 cpu= 1.867283821105957
8550000 cpu= 1.8652291297912598
8600000 cpu= 1.9187562465667725
8650000 cpu= 1.8985896110534668
8700000 cpu= 1.8902392387390137
8750000 cpu= 1.8779652118682861
8800000 cpu= 1.940896987915039
8850000 cpu= 1.9335484504699707
8900000 cpu= 1.9375865459442139
8950000 cpu= 1.9178028106689453
9000000 cpu= 1.9903161525726318
Here is the Python code
import time
def sum(x, numbers):
collected = set()
result = 0
for n in numbers:
if n == 0:
continue
for m in range(n, x-1, n):
if not m in collected:
result += m
collected.add(m)
return result
for x in range(8350000, 9050000, 50000):
t0 = time.time()
s= sum(x, [2,3,5,7,11])
t1 = time.time()
print(f"{x} cpu= {t1-t0}")
You seem to be using an older version of Nim which has as a default the identity hash function (hash(x) = x) which leads, in this case, to poor performance. Just add this code (after your imports and before sumA):
import hashes, bitops
proc hash*(x: int): Hash {.inline.} =
Hash(rotateLeftBits(uint64(x) * 15241094284759029579'u64, 27))
Alternatively, you could update Nim-1.2 or the head of Nim-devel. Also, that hash is just one that works. There are several other integer hashes available at: https://github.com/c-blake/adix in the file althash.nim.Cool. Glad you got it working.
That multiply-rotate hash in my last post does yield a noticeable boost. I get Python-3.8.2 times of 1.255..1.337, Nim-1.3.3 with default hash times of 0.391..0.531, and Nim-1.3.3 with that multiply-rotate of 0.205..0.347. So, with a fast, good enough hash Nim is over 6x faster than Python here (1.255/.205 = 6.12x).
Using a faster hash yielding 0.391/0.205 = 1.9x speed-up just Nim-to-Nim is unusual, though not shocking. Usually, more work per-lookup is happening that masks the cost of the hash function calculation. The new default hash is "safer" against many hostile situations, such as this one you stumbled upon in your first post. One pays a little for that safety/generality, but with Nim it's easy to claw back that performance whenever it matters. Just 3 lines of code in this case.