Hello, I want to use a Table of seq[int] to store values associated with some keys across different .txt files. I only learn how many files I need to scan at runtime, so I was thinking of sizing the seq of each table entry with:
newSeqUninitialized[int](n)
However, it seems I need an additional seq initialization step to make it work:
import tables

let
  files = @["first", "second", "third"]
  n = files.len
  emptyseq = newSeqUninitialized[int](n)

var
  table = initTable[string, seq[int]]()
  i = 0

for file in files:
  # keys fetching stuff... now I need to store the content of key1
  if "key1" notin table:
    table["key1"] = emptyseq # without this extra init it fails with [KeyError]
  table["key1"][i] = 2*i
  i.inc

echo table # {"key1": @[0, 2, 4]}
Am I missing something, or do I always need to assign the whole (empty) seq the first time I insert a new key before I can assign its elements, despite the var table declaration? Thank you!

Yes, you need to put the value in the table before operating on it. table[key][i] = ... is not the same as table[key] = @[...]: in the first case there is no sequence to index into, since there is no entry for the key yet. The following is a slightly more elegant way of doing this:
discard table.hasKeyOrPut("key1", emptyseq) # inserts emptyseq only when the key is absent
table["key1"][i] = 2*i
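To make the semantics concrete: hasKeyOrPut returns true when the key was already present and only inserts the default when it was not. A minimal sketch (the table and key names are made up for illustration):

```nim
import std/tables

var t = initTable[string, seq[int]]()

# First call: "k" is absent, so the default is inserted and false is returned.
doAssert not t.hasKeyOrPut("k", @[0, 0, 0])

# Second call: "k" is now present, so nothing is inserted and true is returned.
doAssert t.hasKeyOrPut("k", @[9, 9, 9])

# The value inserted by the first call survives.
doAssert t["k"] == @[0, 0, 0]
```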
@ElegantBeef
Is there a way to overload [] to provide a default if the key doesn't exist? Something like:
proc `[]`(table: var Table[string, seq[int]], key: string): var seq[int] =
  if table.hasKey(key):
    result = table[key]
  else:
    result = newSeq[int]()
This fails with expression has no address.
Yes, you can; you need to add the key to the table first so the result has an address:

import std/tables

proc `[]`(table: var Table[string, seq[int]], key: string): var seq[int] =
  discard table.hasKeyOrPut(key, @[])
  tables.`[]`(table, key)

var a = initTable[string, seq[int]]()
echo a["hello"]
a["hello"].add 20
echo a["hello"]
Is there a way to overload [] to provide a default if the key doesn't exist?
You really should not do that. It's a design mistake in other languages too.
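If the goal is only to read with a fallback, std/tables already covers that without a custom overload: getOrDefault returns the type's default value for a missing key and, unlike the overload above, does not mutate the table. A minimal sketch (the table and keys here are illustrative):

```nim
import std/tables

var t = initTable[string, seq[int]]()
t["a"] = @[1, 2]

# Present key: the stored value is returned.
doAssert t.getOrDefault("a") == @[1, 2]

# Missing key: the default for seq[int] (an empty seq) is returned,
# and the key is NOT inserted as a side effect.
doAssert t.getOrDefault("missing").len == 0
doAssert "missing" notin t
```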
@tcheran - all you need is to change your main loop to this (NOTE: the i = 0 is also unneeded):
import tables

let
  files = @["first", "second", "third"]
  n = files.len
  initSeq = newSeqUninitialized[int](n)

var table = initTable[string, seq[int]]()

for i, file in files:
  table.mgetOrPut("key1", initSeq)[i] = 2*i

echo table # {"key1": @[0, 2, 4]}
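The same mgetOrPut pattern extends naturally to counting occurrences of each key across several files, with one column per file. A self-contained sketch, assuming invented file names and keys, and using newSeq[int](n) for the zeroed default:

```nim
import std/tables

let files = @["first", "second", "third"]
let n = files.len

var counts = initTable[string, seq[int]]()

for i, file in files:
  # Pretend each file yields some keys; hard-coded here for the sketch.
  let keysInFile = if i == 0: @["key1"] else: @["key1", "key2"]
  for key in keysInFile:
    # One zeroed column per file; bump this file's column for the key.
    counts.mgetOrPut(key, newSeq[int](n))[i].inc

doAssert counts["key1"] == @[1, 1, 1] # seen in every file
doAssert counts["key2"] == @[0, 1, 1] # absent from the first file
```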
@cblake I didn't know the pairs iterator trick... that's really nice. If I understand your point correctly, you suggest not using newSeqUninitialized[int](n) because it's a bit exotic (maybe it will disappear in future Nim versions?) and not that relevant performance-wise, while newSeq[int](n) is a solid, reliable, and well-defined replacement. OK, suggestion accepted! I actually tested the mgetOrPut solution with my real case, which is more like this:
table.mgetOrPut(key, emptySeq)[i].inc(1)
# increment the key counter at the file-related column position i
and it worked beautifully. Thank you! Apparently newSeqUninitialized[int](n) is also filled with n zeroed entries (and it's much longer to write!):
let
  n = 3
  a = newSeq[int](n)
  b = newSeqUninitialized[int](n) # and in addition it does not work with non-int types
  c = newSeqOfCap[int](n)

echo a # len == n, all entries set to 0
echo b # len == n, all entries set to 0, too
echo c # len == 0

assert a == b
Memory that a process gets from the OS is almost always zeroed (and is sometimes actually a copy-on-write all-zero virtual memory page, replicated however many times is needed, depending upon the OS).
However, what newSeqUninitialized in Nim gives you depends upon the history of the memory in your process. Using uninitialized memory that happens to be 0 while you are testing, but winds up being otherwise later, is one of the (many) classic "gotchas" of C/C++ programming.
You should probably not use newSeqUninitialized in Nim unless you really know what you are doing and it is an important optimization as revealed by profiling your code in the context of your problem.
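If you do reach for newSeqUninitialized, the one safe pattern is to overwrite every element immediately after allocation, before anything reads from the seq. A minimal sketch of that discipline, alongside the everyday zero-initializing alternative:

```nim
let n = 4

# Unsafe allocation made safe: every slot is written before any read.
var buf = newSeqUninitialized[int](n)
for i in 0 ..< n:
  buf[i] = i * i
doAssert buf == @[0, 1, 4, 9]

# The everyday alternative: newSeq zero-initializes and works for any type.
var safe = newSeq[int](n)
doAssert safe == @[0, 0, 0, 0]
```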