Today I'm comparing string operations with Nim 1.2.1 vs Python. This test concatenates a letter to a string 1M times.
ms:nim jim$ cat str1a.nim
var
  s: string
for i in 0..1_000_000:
  s = s & 'x'
echo len(s)
ms:nim jim$ nim c -d:danger str1a
Hint: 14210 LOC; 0.565 sec; 16.02MiB peakmem; Dangerous Release build; proj: /Users/jim/nim/str1a; out: /Users/jim/nim/str1a [SuccessX]
ms:nim jim$ /usr/bin/time -l ./str1a
1000001
45.02 real 44.98 user 0.03 sys
48394240 maximum resident set size
11825 page reclaims
8 page faults
1 voluntary context switches
16 involuntary context switches
ms:nim jim$ cat str1a.py
s = ''
for i in xrange(1000000):
    s = s + 'x'
print len(s)
ms:nim jim$ /usr/bin/time -l py str1a.py
1000000
0.22 real 0.21 user 0.00 sys
6078464 maximum resident set size
1686 page reclaims
11 involuntary context switches
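For anyone rerunning this on Python 3: the script above is Python 2 (xrange, print statement). A rough Python 3 equivalent might look like the sketch below. The build helper and the in-process timer are additions for illustration, not part of the original test, and it times only the loop rather than the whole process as /usr/bin/time does. As far as I know CPython can often extend the string in place for this pattern, so the result may not carry over to other interpreters.

```python
import time

def build(n):
    """Append one character n times using s = s + 'x', as in str1a.py."""
    s = ''
    for _ in range(n):
        s = s + 'x'
    return s

start = time.perf_counter()
s = build(1_000_000)
elapsed = time.perf_counter() - start

# Length first (matches the original script's output), then loop time.
print(len(s), f"{elapsed:.2f}s")
```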
I tried enclosing the Nim test in a proc. That did reduce RAM from 48.4M to 46M, but runtime was still 45s.
Thanks, I tried your fusedAppend template but it didn't work (it compiled, but didn't change anything). I think for whatever reason it isn't getting used, because I added echo a as a 2nd line and nothing was displayed.
It may be true that no one uses s = s + t (I doubt that), but if a 2-line template can change this into s &= t, I'd suggest it's worth adding that to the compiler for a 250x speed increase. This was a very unexpected speed bump to me.
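A rough way to see where a gap this large could come from is to count bytes copied. The sketch below is a toy model in Python, not a measurement, and it rests on two assumptions that this thread doesn't confirm: that s = s & 'x' builds a fresh string on every pass, and that add writes into a buffer that grows geometrically (the initial capacity of 4 and the growth factor of 2 are made-up values).

```python
# Toy cost model, not a measurement: counts bytes copied under two
# hypothetical strategies for building an n-character string.

def copies_rebuild(n):
    """Assumed model of s = s & 'x': each step builds a new string,
    copying the i existing characters plus the new one."""
    return sum(i + 1 for i in range(n))

def copies_append(n, initial_cap=4, growth=2):
    """Assumed model of s.add('x'): write in place, and copy the existing
    characters only when the buffer is full and has to grow."""
    cap, length, copied = initial_cap, 0, 0
    for _ in range(n):
        if length == cap:
            copied += length   # move existing contents to the bigger buffer
            cap *= growth
        length += 1
        copied += 1            # write the new character
    return copied

n = 100_000
rebuild = copies_rebuild(n)
append = copies_append(n)
print(rebuild, append, rebuild // append)
```

Under those assumptions the first strategy copies n(n+1)/2 bytes in total and the second fewer than 3n, so the gap widens as the loop count goes up.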
Here's a test comparing s.add('x') with s &= 'x'. It seems like these should have identical performance, but:
ms:nim jim$ cat str1.nim
var
  s: string
for i in 0..100_000_000:
  s.add('x')
echo len(s)
ms:nim jim$ /usr/bin/time -l ./str1
100000001
0.88 real 0.73 user 0.14 sys
440184832 maximum resident set size
107485 page reclaims
1 block output operations
3 involuntary context switches
ms:nim jim$ cat str1c.nim
proc main() =
  var
    s: string
  for i in 0..100_000_000:
    s &= 'x'
  echo len(s)
main()
ms:nim jim$ nim c -d:danger str1c
Hint: 14213 LOC; 0.587 sec; 16.016MiB peakmem; Dangerous Release build; proj: /Users/jim/nim/str1c; out: /Users/jim/nim/str1c [SuccessX]
ms:nim jim$ /usr/bin/time -l ./str1c
100000001
0.54 real 0.42 user 0.11 sys
326619136 maximum resident set size
79751 page reclaims
8 page faults
1 voluntary context switches
5 involuntary context switches
var
  s: string
proc `:=`(a:var string,b:string) = `=`(a,b)
template fused{`:=`(a,`&`(a,b))}(a:var string,b:string) = a.add(b)
#template fused2{`=`(a,`&`(a,b))}(a:var string,b:string) = a.add(b)
for _ in 0..1_000_000:
  s := s & "x"
echo len(s)
This works, but I can't convince term-rewriting templates to ever touch an = assignment.
var
  s: string
template fused2{a = `&`(a,b)}(a:var string,b:string) = a.add(b)
for _ in 0..1_000_000:
  s = s & "x"
echo len(s)
That was the secret! dumpTree was my friend; I needed to get the AST to match precisely.
Hey thanks, this is interesting, although it threw me at first because it ran in .01s and used 4MB of RAM. After scratching my head a bit I realized you were doing 1M loops instead of 100M. :-)
The template version ran in .69s on my machine, while str1c, which should be the same thing, runs in .54s. They both use around 326M of RAM. I was curious why yours would be different, so I added a fused3 template that takes a char, then tried using for i instead of for _, but it still took longer. Then I remembered about wrapping it in a main() proc - DUH! - and the two are now the same.
My math earlier was wrong, but these 2 template lines do give an 80x performance increase. I think that would be worth adding to the compiler, just in case someone does write code like I did, is porting Python code to Nim, or has some other reason for preferring this notation.
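The ratios are easy to recompute from the timings posted above. One caveat: the s = s & 'x' test ran about 1M loops while the add and &= tests ran about 100M, so the raw wall-clock ratio and the per-iteration ratio are very different numbers. A small Python sketch of the arithmetic (only the timings come from this thread; the names are illustrative):

```python
# Timings copied from the /usr/bin/time output in this thread:
# (wall-clock seconds, loop iterations)
runs = {
    "nim  s = s & 'x'": (45.02, 1_000_001),
    "py   s = s + 'x'": (0.22, 1_000_000),
    "nim  s.add('x')": (0.88, 100_000_001),
    "nim  s &= 'x'": (0.54, 100_000_001),
}

def ns_per_iter(seconds, iterations):
    """Average nanoseconds spent per loop iteration."""
    return seconds / iterations * 1e9

for name, (seconds, iterations) in runs.items():
    print(f"{name:18s} {ns_per_iter(seconds, iterations):10.1f} ns/iter")

# Raw wall-clock ratios, ignoring the different loop counts
print(round(45.02 / 0.22))   # Nim & vs Python +            -> 205
print(round(45.02 / 0.54))   # Nim & (1M) vs Nim &= (100M)  -> 83
```

So roughly 80x matches the wall-clock times as measured, but since the loop counts differ it isn't a like-for-like comparison, and per iteration the difference at these loop counts comes out much larger.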
Thanks again!
That was cool, so I thought I'd try the same thing with +:
ms:nim jim$ cat strfused.nim
proc main() =
  var
    s: string
  template fused2{a = `&`(a,b)}(a:var string,b:string) = a.add(b)
  template fused3{a = `&`(a,b)}(a:var string,b:char) = a.add(b)
  template fused4{a = `+`(a,b)}(a:var string,b:string) = a.add(b)
  template fused5{a = `+`(a,b)}(a:var string,b:char) = a.add(b)
  for _ in 0..100_000_000:
    s = s + 'x'
  echo len(s)
main()
ms:nim jim$ nim c -d:danger strfused.nim
Hint: used config file '/Users/jim/nim-1.2.1/config/nim.cfg' [Conf]
Hint: system [Processing]
Hint: widestrs [Processing]
Hint: io [Processing]
Hint: strfused [Processing]
/Users/jim/nim/strfused.nim(9, 11) Error: type mismatch: got <string, char>
but expected one of:
proc `+`(x, y: float): float
first type mismatch at position: 1
required type for x: float
but expression 's' is of type: string
proc `+`(x, y: float32): float32
first type mismatch at position: 1
required type for x: float32
but expression 's' is of type: string
proc `+`(x, y: int): int
first type mismatch at position: 1
required type for x: int
but expression 's' is of type: string
proc `+`(x, y: int16): int16
first type mismatch at position: 1
required type for x: int16
but expression 's' is of type: string
proc `+`(x, y: int32): int32
first type mismatch at position: 1
required type for x: int32
but expression 's' is of type: string
proc `+`(x, y: int64): int64
first type mismatch at position: 1
required type for x: int64
but expression 's' is of type: string
proc `+`(x, y: int8): int8
first type mismatch at position: 1
required type for x: int8
but expression 's' is of type: string
proc `+`(x, y: uint): uint
first type mismatch at position: 1
required type for x: uint
but expression 's' is of type: string
proc `+`(x, y: uint16): uint16
first type mismatch at position: 1
required type for x: uint16
but expression 's' is of type: string
proc `+`(x, y: uint32): uint32
first type mismatch at position: 1
required type for x: uint32
but expression 's' is of type: string
proc `+`(x, y: uint64): uint64
first type mismatch at position: 1
required type for x: uint64
but expression 's' is of type: string
proc `+`(x, y: uint8): uint8
first type mismatch at position: 1
required type for x: uint8
but expression 's' is of type: string
proc `+`(x: float): float
first type mismatch at position: 1
required type for x: float
but expression 's' is of type: string
proc `+`(x: float32): float32
first type mismatch at position: 1
required type for x: float32
but expression 's' is of type: string
proc `+`(x: int): int
first type mismatch at position: 1
required type for x: int
but expression 's' is of type: string
proc `+`(x: int16): int16
first type mismatch at position: 1
required type for x: int16
but expression 's' is of type: string
proc `+`(x: int32): int32
first type mismatch at position: 1
required type for x: int32
but expression 's' is of type: string
proc `+`(x: int64): int64
first type mismatch at position: 1
required type for x: int64
but expression 's' is of type: string
proc `+`(x: int8): int8
first type mismatch at position: 1
required type for x: int8
but expression 's' is of type: string
proc `+`[T](x, y: set[T]): set[T]
first type mismatch at position: 1
required type for x: set[T]
but expression 's' is of type: string
expression: s + 'x'
ms:nim jim$
Seemed like a reasonable idea at the time...
I see, it has to be like this:
proc main() =
  var
    s: string
  proc `+`(a:string|char, b:string|char): string =
    a & b
  template fused2{a = `&`(a,b)}(a:var string,b:string|char) = a.add(b)
  template fused4{a = `+`(a,b)}(a:var string,b:string|char) = a.add(b)
  echo 'a' + 'b'
  echo 'a' + "b"
  echo "a" + 'b'
  echo "a" + "b"
  for _ in 0..50_000_000:
    s = s + 'x'
    s = s + "x"
  echo len(s)
main()