I played around with the Sieve of Erastosthenes again... First I converted the initial C++ code to a more Nim friendly one, then I used c2nim to generate Nim code, which I fixed manually a bit. So C++ code and Nim code look very similar, that shape is basically conserved in the nimcache file, but contains many GOTO statements...
Of course the Nim code is very fast. But the C++ programs are 2.3 times faster (tested with gcc 4.9.3 and clang 3.5). I wonder why. For my first handmade Nim version yesterday I used an Iteraror, so my guess was that that one was the reason. So I removed it, but it made no difference. (Which is good to know, no reason to avoid iterators for best performance.) But I have still not really an idea about the performance difference. May it be the GOTOs? I think GOTO is not very common in C, so maybe gcc and clang can not optimize it that well? And the final question: In the rare case, where such a loop would dominate the performance of a large Nim program: How may we improve it?
# nim c -d:release sieve.nim
# Nim Compiler Version 0.11.3 (2015-08-11) [Linux: amd64]
proc main =
const n = 10000000 # 1e7
var primes = newSeq[bool](n)
for x in primes.mitems: x = true
primes[0] = false
primes[1] = false
var step: cint = 0
while true:
inc(step, 1)
while not primes[step]: inc(step, 1)
var startval: cint = step * step
if startval >= n: break
var i: cint = startval
while i < n:
primes[i] = false
inc(i, step)
var i: cint = 0
var c: cint = 0
while i < n:
if primes[i]: inc(c, 1)
inc(i, 1)
echo "Primes found: ", c
# i = 0
# while i < n:
# if primes[i]:
# echo i
# inc(i)
main()
#include <iostream> #include <vector> int main() { const int n = 10000000; // 1e7 std::vector<bool> primes(n, true); primes[0] = primes[1] = false; int step = 0; while(true) { step += 1; while(!primes[step]) step += 1; int startval = step * step; if(startval >= n) break; int i=startval; while (i<n) { primes[i] = false; i += step; } } int i=0; int c=0; while (i<n) { if(primes[i]) c+=1; i+=1; } std::cout << "Primes found: " << c << '\n'; /* for(int i=0; i<n; ++i) { if(primes[i]) std::cout << i << '\n'; }*/ return 0; }
#include <iostream> #include <vector> int main() { const int n = 10000000; // 1e7 std::vector<bool> primes(n, true); primes[0] = primes[1] = false; int step = 0; while(true) { while(!primes[++step]); int startval = step * step; if(startval >= n) break; for(int i=startval; i<n; i+=step) primes[i] = false; } int i=0; int c=0; while (i<n) { if(primes[i]) c+=1; i+=1; } std::cout << "Primes found: " << c << '\n'; /* for(int i=0; i<n; ++i) { if(primes[i]) std::cout << i << '\n'; }*/ return 0; }
I would assume it's because of vector<bool>'s special behaviour: http://www.cplusplus.com/reference/vector/vector-bool/
I had the same performance with C and Nim for the sieve of eratosthenes here (slide 11): http://felsin9.de/nnis/nimrod/nimrod-gpn14.pdf
it's because of vector<bool>s special behaviour:
That is really an interesting idea. I would assume that saving space/ram is possible with a special vector type for boolean, but have not considered that a performance increase is also possible. Maybe more compact data gives better cache performance. Will test soon, thanks.
[EDIT]
Indeed, when I replace bool with char in the C++ code, then speed is exactly the same as of Nim program. :-)
const
D = if sizeof(int) == 8: 6 else: 5
M = (1 shl D) - 1
type BoolSeq = distinct seq[int]
template `[]`(b: BoolSeq, i: int): bool =
template s: auto = seq[int](b)
(s[i shr D] and (1 shl (i and M))) != 0
template `[]=`(b: var BoolSeq, i: int, flag: bool) =
template s: auto = seq[int](b)
if flag:
s[i shr D] = s[i shr D] or (1 shl (i and M))
else:
s[i shr D] = s[i shr D] and not (1 shl (i and M))
proc newBoolSeq(n: int): BoolSeq =
BoolSeq(newSeq[int]((n+M) shr D))
proc setAll(b: var BoolSeq) =
template s: auto = seq[int](b)
for i in 0..len(s):
s[i] = (not 0)
proc main =
const n = 10000000 # 1e7
var primes = newBoolSeq(n)
setAll(primes)
primes[0] = false
primes[1] = false
var step: int = 0
while true:
inc(step, 1)
while not primes[step]: inc(step, 1)
var startval: int = step * step
if startval >= n: break
var i: int = startval
while i < n:
primes[i] = false
inc(i, step)
var i: int = 0
var c: int = 0
while i < n:
if primes[i]: inc(c, 1)
inc(i, 1)
echo "Primes found: ", c
main()
I'm experimenting with how to embed source-code into the Nim forum, since Jehan's code is now unreadable, lacking indentation. (And that shows the utility of curly braces, by the way.)
const D = if sizeof(int) == 8: 6 else: 5
M = (1 shl D) - 1
type BoolSeq = distinct seq[int]
template `[]`(b: BoolSeq, i: int): bool =
template s:
auto = seq[int](b)
(s[i shr D] and (1 shl (i and M))) != 0
template `[]=`(b: var BoolSeq, i: int, flag: bool) =
template s:
auto = seq[int](b)
if flag:
s[i shr D] = s[i shr D] or (1 shl (i and M)) else: s[i shr D] = s[i shr D] and not (1 shl (i and M))
proc newBoolSeq(n: int): BoolSeq =
BoolSeq(newSeq[int]((n+M) shr D))
proc setAll(b: var BoolSeq) =
template s:
auto = seq[int](b)
for i in 0..len(s):
s[i] = (not 0)
proc main =
const n = 10000000 # 1e7
var primes = newBoolSeq(n)
setAll(primes)
primes[0] = false
primes[1] = false
var step: int = 0
while true:
inc(step, 1)
while not primes[step]:
inc(step, 1)
var startval: int = step * step
if startval >= n: break
var i: int = startval
while i < n:
primes[i] = false
inc(i, step)
var i: int = 0
var c: int = 0
while i < n:
if primes[i]:
inc(c, 1)
inc(i, 1)
echo "Primes found: ", c
main()
Ok. I see now why Jehan's old code is no longer indented properly. If you start a comment with a code-block, it's no longer indented. Maybe that's a fixable bug in the website code?And that shows the utility of curly braces, by the way.
Yes, and terrible tools that remove curly braces but don't touch the whitespace show the utility of semantic whitespace.
Maybe that's a fixable bug in the website code?
Hopefully. :-)