I want to implement a pythonic thing:
def myiter(anotheriter):
    for i in range(0, 3):
        for j in anotheriter:
            yield j

d = {1: 4, 2: 5, 3: 6}
for i in myiter(d.values()):
    print(i)
for i in myiter(d.keys()):
    print(i)
Is there a way to do the same in Nim?
import tables

iterator myiter(anotheriter: iterator): int =
  for i in 0..<3:
    for j in anotheriter:
      yield j

let d = {1: 4, 2: 5, 3: 6}.toTable
for i in myiter(d.values()): # test.nim(8, 18) Error: attempting to call undeclared routine: 'values'
  echo(i)
for i in myiter(d.keys()):
  echo(i)
Yes, use a .closure iterator.
http://nim-lang.org/docs/manual.html#iterators-and-the-for-statement-first-class-iterators
An inline iterator is an iterator that's always inlined by the compiler leading to zero overhead for the abstraction, but may result in a heavy increase in code size. Inline iterators are second class citizens; They can be passed as parameters only to other inlining code facilities like templates, macros and other inline iterators.
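For concreteness, here is a minimal sketch of that suggestion (not part of the original answer; the dvalues wrapper is just an illustrative name):
import tables

let d = {1: 4, 2: 5, 3: 6}.toTable

# inline iterator that takes a closure iterator as a first-class parameter
iterator myiter(anotheriter: iterator(): int): int =
  for i in 0..<3:
    for j in anotheriter():
      yield j

# hand-written closure wrapper around the inline tables.values iterator
iterator dvalues(): int {.closure.} =
  for v in d.values():
    yield v

# note: this prints each value only once, because the closure keeps its
# state between calls and is exhausted after the first pass (see below)
for i in myiter(dvalues):
  echo i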
I want to pass one inline iterator (tables.values) to another inline iterator (myiter). Does using .closure in this case mean that I need to create a closure wrapper for every iterator in system.nim?
I just want to avoid the copying overhead for iterable data structures like tables:
import tables, sequtils

let d = {1: 4, 2: 5, 3: 6}.toTable

# current implementation:
iterator myiter[T](anotheriter: varargs[T]): T =
  for i in 0..<3:
    for j in anotheriter:
      yield j

for i in myiter(toSeq(d.values())): echo i

# optimized variant doesn't compile:
iterator myiteropt[T](anotheriter: iterator): T =
  for i in 0..<3:
    for j in anotheriter:
      yield j

for i in myiteropt(d.values()): echo i
@def
That is some decent work there. Would it be possible to have an automatic conversion of inline iterators to closure iterators? Honestly, I think something like this should at some point replace sequtils entirely.
@def this is it, thanks! Looks like this should be inside sequtils.
Would it be possible to have an automatic conversion of inline iterators to closure iterators?
Nim lets you write pattern-based optimizations, so someone skilled could implement this. :)
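For reference, "pattern-based optimization" here means a term-rewriting template; below is a minimal sketch in the spirit of the manual's optMul example (illustrative only, not the inline-to-closure conversion itself):
# term-rewriting template: when the compiler sees `a * 2` for an int,
# it may rewrite the expression to `a + a`
template optMul{`*`(a, 2)}(a: int): int =
  a + a

let x = 3
echo x * 2  # candidate for rewriting to x + x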
Update: But there is an issue.
#!/usr/bin/env python
def myiter(anotheriter):
    for i in range(0, 3):
        for j in anotheriter:
            yield j

d = {1: 4, 2: 5, 3: 6}
# output: [4, 5, 6, 4, 5, 6, 4, 5, 6]
print([x for x in myiter(d.values())])
import future, tables

let d = {1: 4, 2: 5, 3: 6}.toTable

template toClosure*(i): auto =
  iterator j: type(i) {.closure.} =
    for x in i:
      yield x
  j

iterator myiteropt(anotheriter: iterator): auto =
  for i in 0..<3:
    for j in anotheriter():
      yield j

# output: @[4, 5, 6]
echo lc[x | (x <- myiteropt(toClosure(d.values()))), int]
# expected @[4, 5, 6, 4, 5, 6, 4, 5, 6]
Not sure if this is a bug, but:
template toClosure(i): auto =
  iterator j: type(i) {.closure.} =
    for x in i:
      yield x
  j

iterator myiteropt(anotheriter: iterator): auto =
  for i in 0..<3:
    for j in anotheriter():
      yield j

proc x[T](a: openArray[T]) =
  for i in myiteropt(toClosure(a.items())): # Error: illegal capture 'a'
    echo i

x([1,2,3])

# this compiles
# for i in myiteropt(toClosure([1,2,3].items())):
#   echo i
But there is an issue.
I know you've seen this fixed in my repo, but for everyone else, it works if you deepCopy the iterator:
import future, tables

let d = {1: 4, 2: 5, 3: 6}.toTable

template toClosure*(i): auto =
  iterator j: type(i) {.closure.} =
    for x in i:
      yield x
  j

iterator myiteropt(anotheriter: iterator): auto =
  for i in 0..<3:
    var iter: type(anotheriter)
    iter.deepCopy(anotheriter)
    for j in iter():
      yield j

echo lc[x | (x <- myiteropt(toClosure(d.values()))), int]
Not sure if this is a bug
It works with a seq; openArray parameters can't be captured:
template toClosure(i): auto =
  iterator j: type(i) {.closure.} =
    for x in i:
      yield x
  j

iterator myiteropt(anotheriter: iterator): auto =
  for i in 0..<3:
    for j in anotheriter():
      yield j

proc x[T](a: seq[T]) =
  for i in myiteropt(toClosure(a.items())):
    echo i

x(@[1,2,3])
I decided to use a closure iterator as an optimization, to avoid allocating a temporary array. But in my tests the closure iterator is the slowest possible solution; it is even slower than converting the iterator to a sequence.
# nim c -d:release "test.nim" && ./test 20000
import os, strutils, times

proc printf(formatstr: cstring) {.importc: "printf", varargs,
                                  header: "<stdio.h>".}

template bench(descr, body) =
  block:
    let timeStart = cpuTime()
    body
    printf("%-30s %.3f\n", descr, cpuTime() - timeStart)

iterator testIter(): int =
  for i in 1..parseInt(paramStr(1)): yield i

template toClosure*(i): auto =
  iterator j: type(i) {.closure.} =
    for x in i:
      yield x
  j

iterator combinations1[T](x: varargs[T]): (T, T) =
  for i in x:
    for j in x:
      yield (i, j)

iterator combinations2[T](it: iterator(): T): (T, T) =
  var iter1: type(it)
  iter1.deepCopy(it)
  for i in iter1():
    var iter2: type(it)
    iter2.deepCopy(it)
    for j in iter2():
      yield (i, j)

template combinations3(it: expr, body: stmt) {.immediate.} =
  for i in it:
    for j in it:
      let u {.inject.} = i
      let v {.inject.} = j
      body

bench("convert to seq"):
  var res = 0
  var s: seq[int] = @[]
  for i in testIter():
    s.add(i)
  for u, v in combinations1(s):
    res += u * v
  echo res

bench("closure iterator"):
  var res = 0
  for u, v in combinations2(toClosure(testIter())):
    res += u * v
  echo res

bench("template"):
  var res = 0
  combinations3(testIter()):
    res += u * v
  echo res
Results for N=20000, gcc optimization level -O2:
40004000100000000
convert to seq 0.367
40004000100000000
closure iterator 2.049
40004000100000000
template 0.304
Results for N=20000, gcc optimization level -O3 (the template version is twice as slow!):
40004000100000000
convert to seq 0.366
40004000100000000
closure iterator 2.054
40004000100000000
template 0.610
Clang results are also interesting (-O2 or -O3):
40004000100000000
convert to seq 0.934
40004000100000000
closure iterator 2.359
40004000100000000
template 0.002
So my questions are:
IIRC, when you use a closure you basically throw the stack frame onto the heap; I think Nim stores a closure as a tuple of pointers to the proc and to the environment, respectively. Then you deepCopy that environment for each iteration of the inner loop, which isn't exactly great if you want to run fast. It might be a lot closer if you could reset the inner iterator directly, but I am not sure whether that is possible in Nim without going into unsafe territory. The third example is literally translated into something like:
for i in testIter():
  for j in testIter():
    var u {.inject.} = i
    var v {.inject.} = j
    res += u * v
This could bloat the code if the iterators are complex, and it isn't as powerful as first-class iterators, but it is unsurprisingly a lot faster.
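As an aside, the tuple-of-pointers representation mentioned above can be made visible with system.rawProc and system.rawEnv; here is a tiny sketch (makeCounter is just an illustrative closure, not anything from this thread):
# a closure that captures `n`, so it needs a heap-allocated environment
proc makeCounter(): proc (): int =
  var n = 0
  result = proc (): int =
    inc n
    n

let c = makeCounter()
# the two raw pointers behind the closure: the proc and its environment
echo cast[uint](rawProc(c)), " ", cast[uint](rawEnv(c))
echo c(), " ", c()  # 1 2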
I feel like it should be possible to solve this with a macro, but I am not entirely sure how and only tried for a couple of minutes. My first idea was some kind of partial macro application: use a macro which creates a macro, which can then be called with a second iterator and a code block. But I am not sure how to add the first iterator call in there; just throwing it into the second macro's AST doesn't work, because then the macro tries to use the for loops instead of creating them. Grabbing the nodes from the call site and capturing them in the inner macro with something like newNimNode(nnkForStmt).add ident"i", iteratorCall... would be ideal, but for some reason everything started breaking when I tried. But then, I am awful with macros, so it probably is possible.
Edit: Fun fact, if you create a macro within a macro, the inner one can capture the result variable and break things.
import macros

macro outer(name, first: untyped): stmt =
  var
    loops = newNimNode(nnkForStmt).add(
      newNimNode(nnkAccQuoted).add(ident"arg1"),
      first,
      newNimNode(nnkForStmt).add(
        newNimNode(nnkAccQuoted).add(ident"arg2"),
        newNimNode(nnkAccQuoted).add(ident"second"),
        newStmtList().add(
          newNimNode(nnkAccQuoted).add(ident"exp")
        )
      )
    )
    quoted = newCall(
      ident"quote",
      newNimNode(nnkDo).add(
        newEmptyNode(),
        newEmptyNode(),
        newEmptyNode(),
        newEmptyNode(),
        newEmptyNode(),
        newEmptyNode(),
        newStmtList(loops)
      )
    )
  result = newStmtList().add(
    newNimNode(nnkMacroDef).add(
      name,
      newEmptyNode(),
      newEmptyNode(),
      newNimNode(nnkFormalParams).add(
        ident"untyped",
        newNimNode(nnkIdentDefs).add(
          ident"second",
          ident"arg1",
          ident"arg2",
          ident"exp",
          ident"untyped",
          newEmptyNode()
        )
      ),
      newEmptyNode(),
      newEmptyNode(),
      newStmtList(newNimNode(nnkReturnStmt).add(quoted))
    )
  )

outer(myIt, countdown(2, 0))

myIt(countup(0, 2), x, y):
  echo x, " ", y
# results in
# StmtList
#   ForStmt
#     StmtList
#       Ident !"x"
#     Call
#       Sym "countdown"
#       IntLit 2
#       IntLit 0
#     ForStmt
#       StmtList
#         Ident !"y"
#       StmtList
#         Call
#           Sym "countup"
#           IntLit 0
#           IntLit 2
#       StmtList
#         StmtList
#           Command
#             Sym "echo"
#             Ident !"x"
#             StrLit
#             Ident !"y"
This kinda works but is ugly. Is there a better way to use these types of macros?
Thanks for your response.
IIUC this macro also expands at the call site and behaves like an inline iterator. I'm working on graph algorithms, and some of my iterators are really big. But I'm not sure whether it is possible to create a fast non-inline iterator. Probably it should be a proc with a callback argument.
Example of big iterator: https://gist.github.com/scriptum/34933a248dc22d0dd516#file-digraph-nim-L158
This is a depth-first search algorithm, and I cannot make it work with an iterator as the source of nodes. Fortunately it is not used frequently, so inlining shouldn't hurt much.
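A rough sketch of the "proc with callback argument" alternative mentioned above, with made-up names (forEachNode, and plain ints instead of real graph nodes):
# callback-based traversal: instead of yielding, the proc calls a
# user-supplied closure for every visited item, so its body is not
# inlined at every call site
proc forEachNode(n: int, visit: proc (node: int)) =
  for i in 1..n:
    visit(i)

forEachNode(3, proc (node: int) = echo node)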
# I want a flexible interface for DFS traversal that accepts an iterator, an array or a single node
for u, v in graph.dfs("source_node"): # ok
  ...
for u, v in graph.dfs(["source_node1", "source_node2"]): # ok
  ...

var t = {"source_node1": 1, "source_node2": 2}.toTable
# no way to do this!
for u, v in graph.dfs(t.keys()):
  ...
for u, v in graph.dfs(t.values()):
  ...
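One possible direction, combining the toClosure template from earlier in the thread with an overload that takes a closure iterator of source nodes; Graph and dfs below are bare stand-ins, not the real digraph.nim API:
import tables

template toClosure(i): auto =
  iterator j: type(i) {.closure.} =
    for x in i:
      yield x
  j

type Graph = object  # placeholder; the real type lives in digraph.nim

# stand-in dfs overload taking a closure iterator of source nodes
iterator dfs(g: Graph, sources: iterator(): string): (string, string) =
  for s in sources():
    yield (s, s)  # a real implementation would traverse the graph from s

var t = {"source_node1": 1, "source_node2": 2}.toTable
let g = Graph()
for u, v in g.dfs(toClosure(t.keys())):
  echo u, " ", v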
Currently I prefer Python + networkx for small graphs because it is flexible enough with iterators.