The documentation at http://nim-lang.org/docs/tables.html#[],Table[A,B],A states:
retrieves the value at t[key]. If key is not in t, default empty value for the type B is returned and no exception is raised. One can check with hasKey whether the key exists.
This seems a very odd design choice, for at least 2 reasons:
Default values are documented here. It is nil for seq[string].
As to the rationale, I don't disagree that the code may need some overhaul. However, if table.hasKey(key): a = table[key] is not a common use case. You generally want one of the following (or a variant thereof):
if not table.hasKey(key):
  table[key] = value
or
if table.hasKey(key):
  result = table[key]
else:
  result = someOtherValue
The first case is supported by mgetOrPut(). The second sort of works for reference types, provided nil can be used to represent an undefined value of the underlying value type.
Ideally, of course, we want something like:
proc tryGet*[A,B](table: Table[A, B], key: A): (bool, B)
  ## returns (true, value) if table[key] == value, (false, default(B)) otherwise.
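A minimal sketch of what such a tryGet could look like when written only against the public std/tables API. This is an illustration, not the library's implementation: it assumes a recent Nim release, and because rawGet() is not exported it pays for two lookups (hasKey, then []).

```nim
import std/tables

proc tryGet*[A, B](table: Table[A, B], key: A): (bool, B) =
  ## Returns (true, value) if key is present, (false, default(B)) otherwise.
  ## Sketch only: two lookups, since the internal rawGet() is not exported.
  if table.hasKey(key):
    (true, table[key])
  else:
    (false, default(B))

var scores = initTable[string, int]()
scores["a"] = 0   # a stored value that happens to equal default(int)

let (found, value) = scores.tryGet("a")
echo found, " ", value   # true 0
```

The boolean is what separates a stored 0 from a missing key, which a bare default value cannot do.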
I suspect the original poster's context is probably covered by mgetOrPut with an empty seq. Another little atom of logic is that you can call len(tab) before and after to decide if an insertion actually happened if that isn't also obvious from the put value.
Note, there is also hasKeyOrPut() which is sort of halfway between Jehan's two cases.
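For illustration, here is a small sketch of the three idioms just mentioned: mgetOrPut with an empty seq, comparing len before and after, and hasKeyOrPut. The table name and keys are made up, and the code assumes a recent Nim release.

```nim
import std/tables

var index = initTable[string, seq[int]]()

# mgetOrPut inserts the empty seq only if the key is missing and returns
# a mutable reference to the stored value, so we can append in place.
index.mgetOrPut("a", @[]).add(1)
index.mgetOrPut("a", @[]).add(2)
doAssert index["a"] == @[1, 2]

# Comparing len before and after tells us whether an insertion happened.
let before = index.len
discard index.mgetOrPut("b", @[])
doAssert index.len == before + 1

# hasKeyOrPut returns false when it had to insert, true when the key existed.
doAssert not index.hasKeyOrPut("c", @[7])
doAssert index.hasKeyOrPut("c", @[8])
doAssert index["c"] == @[7]   # the second call did not overwrite
```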
All that said, the proposed tryGet would be very short in terms of the internal rawGet(). Maybe it would be better still to simply export "rawGet", perhaps renaming it "find". This would have a side-effect of making the "find/contains" tables API match the general rules outlined in "doc/apis.txt".
My use case was "guarding against accessing non-existent elements in a table".
Doesn't the "default values" thing mean it's impossible to add nil elements to the table and still distinguish between a non-existent element and an element which is nil? Yes, this is sometimes useful :)
Also, a minor exaggeration here is that a repeated traversal cuts the performance in half. That is only true when the whole table (or most of it) is usually in the L1 CPU cache. This qualification is important since cache loads are almost always the bottleneck in hash lookups.
Even in that particular L1-cachable circumstance, the lookup is already very fast. Doubling its relative cost may well yield very minor absolute cost in the context of the broader program. I.e., this is usually more of a 5..10% scale optimization than a 2X one, even focusing on just the hashing part.
All that said, note that regular mget() raises an exception if the key is missing rather than returning a default value. So, you could also do "try: a = table.mget(key) except: errorpath" to guard against a missing key and/or distinguish between nil values and missing keys.
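A sketch of that exception-based guard. One assumption to flag loudly: it is written against a current Nim release, where [] itself raises KeyError on a missing key and so plays the role that mget() plays in the paragraph above. The proc name lookup and the "<missing>" marker are made up for the example.

```nim
import std/tables

proc lookup(t: Table[string, string], key: string): string =
  ## Returns the stored value, or a marker string if the key is absent.
  try:
    result = t[key]          # raises KeyError when key is missing
  except KeyError:
    result = "<missing>"     # the error path

var t = initTable[string, string]()
t["present"] = ""            # an empty value, but the key exists

echo lookup(t, "present") == ""          # true
echo lookup(t, "absent") == "<missing>"  # true
```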
I fully agree with the concerns of the OP -- and this is not just because I have just lost a lot of time because of this behavior :). I simply have never seen a language offering a function table[key]: B with the proper return type and a non-throwing policy. Typically such a table lookup either throws or is wrapped into e.g. Option[B]. Since the type is not wrapped, it is a natural conclusion that it throws. Therefore it never occurred to me it just returns "defaults", leading to a somewhat unpleasant surprise...
Apart from that, doesn't the "return default" policy also prevent ever using a not nil type in a table?
I think we should be careful with assumptions about rare use cases. So much depends on the application domain and the different backgrounds of Nim users. I'm switching from Scala, and as a result, the rare use case in my code is rather "updating a key/value in a table", since I'm used to working with fully immutable data all the time. Conversely, my code is full of plain table lookups. In my machine learning algorithms, almost all my tables contain mainly scalar ints or floats, which does not play nicely with defaults. As a result, hasKey + [] are all over the place. And I pass my tables around as immutable, so I also cannot use mget as a workaround.
Would it be okay to make a PR offering at least an additional, say, get(key: A): Option[B] as soon as optionals are merged? Is the reason that [] does not throw performance-related? If this is the case, one might also consider renaming it to e.g. unsafeGet and making [] throw. That would make the transition from other languages easier.
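To make the proposal concrete, a rough sketch of an Option-returning lookup on top of std/options. The name getOpt is hypothetical (chosen here to avoid confusion with the get proc that std/options already defines on Option values), and the code assumes a recent Nim release; it is not an existing tables API.

```nim
import std/[tables, options]

proc getOpt*[A, B](table: Table[A, B], key: A): Option[B] =
  ## Hypothetical helper: some(value) if key is present, none(B) otherwise.
  if table.hasKey(key):
    some(table[key])
  else:
    none(B)

var weights = initTable[string, float]()
weights["bias"] = 0.0   # a legitimate value that equals default(float)

let w = weights.getOpt("bias")
if w.isSome:
  echo "bias = ", w.get
echo weights.getOpt("unknown").isNone   # true
```

Because the table parameter is not var, this also works on tables passed around as immutable.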
proc `[]`[K,V](table: var DefaultTable[K,V], key: K): V =
  # insert the default on first access, then return the stored value
  if not table.has_key(key):
    table.put(key, table.defaultValue)
  return table.get(key)

var t = initDefaultTable[int,int](42)
echo t[5]
That's similar to collections.defaultdict in Python, or Perl hashes, and an extra Table type is fine. In these cases, the inefficiency of a second look-up is not a problem.
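For anyone who wants to try the idea, here is a self-contained sketch of such a wrapper type. DefaultTable and initDefaultTable are not part of the standard library; the field names are made up, and this version leans on mgetOrPut so it happens to need only one lookup.

```nim
import std/tables

type
  DefaultTable[K, V] = object
    data: Table[K, V]    # underlying storage
    defaultValue: V      # value inserted on first access to a missing key

proc initDefaultTable[K, V](defaultValue: V): DefaultTable[K, V] =
  DefaultTable[K, V](data: initTable[K, V](), defaultValue: defaultValue)

proc `[]`[K, V](table: var DefaultTable[K, V], key: K): V =
  ## Like Python's defaultdict: a missing key is inserted with the default.
  table.data.mgetOrPut(key, table.defaultValue)

var t = initDefaultTable[int, int](42)
echo t[5]   # 42
```

Note that, as with defaultdict, merely reading a missing key mutates the table, which is why the parameter has to be var.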