nimforum mirror - Using this over tables?

Serenitor (orginal) [2017-10-09T21:50:27+02:00] view original

as I've been browsing through {NimDir}/lib/packages/docutils/rst.nim today, I've seen this type syntax that's been new to me:

Smilies = {
    ":D": "icon_e_biggrin",
    ":-D": "icon_e_biggrin",
    ":)": "icon_e_smile"
  }

that translates to array[0..2, (string, string)]

I only knew the (...) syntax to be used for tuples. I haven't been able to find anything about this in the docs.

as the usage in rst.nim kind of reminded me of associative arrays, or tables, dictionaries, or however you want to call them, I wonder, could I just use this type instead of hash tables?

proc `[]` [T](s: seq[(string, T)], k: string): T =
    for key, val in s.items:
        if key == k:
            return val

proc `[]=` [T](s: var seq[(string, T)], k: string, v: T) =
    for i in 0..s.high:
        if s[i][0] == k:
            s[i][1] = v
            return
    
    s.add((k, v))

var map = @{
    "lerk": "ey",
    "lurk": "eey",
    "lork": "eeey"
}

echo map["lurk"] # eey
map["lurk"] = "yee"
echo map["lurk"] # yee

would that make sense if I just need string keys? what do you think would the performance vs hash tables look like?

LeuGim (orginal) [2017-10-09T23:02:57+02:00] view original

Of coarse you can use it. But yes, performance will suffer, if you have more than several items in such a table (you iterate through half its length on average).

The syntax is described in the manual, and is the usual way to set initial values for hash-tables.

Serenitor (orginal) [2017-10-10T00:17:53+02:00] view original

dang, I haven't been looking in the tables module docs, but searched the manual and system module instead. thanks for the hint.

I hoped for a performance gain by skipping the hashing when I just need simple strings as keys.

mratsim (orginal) [2017-10-10T08:19:57+02:00] view original

I think you will probably lose perf.

Strings are on the heap with value semantics, meaning you would need to allocate slow memory in the for loop for each strings

For looping is O[n]

Hashes are done via xor and shift which are faster than additions and substractions or allocating strings.

Having a string as index means an extra indirection/dereferencing and no memory locality so more cache misses and no prefetching.

Jehan (orginal) [2017-10-10T19:55:46+02:00] view original

The following works:

import tables

const a = {"x": "Hello, ", "y": "world", "z": "!"}.toTable

echo a["x"], a["y"], a["z"]

Note the const part. The entire table (including keys and values) will actually be allocated in static memory.

Whether you gain performance by avoiding hashing is something that you have to consider on a case-by-case basis. It is unlikely that a linear search will be faster for all but very short tables. Alternative data structures for a set of string keys known at advance may be better (perfect hashing, tries or dfas, and so forth).

Keep also in mind that unless it's a hot path, it may not matter at all. Do some profiling first to check if it even matters for performance. You can do some seemingly expensive stuff for code that's only executed once or a few times and it won't show up at all on profiling. Your CPU can execute billions of instructions per second. Anything that takes less than a million and doesn't do that very often is just a rounding error in the grand scheme of things.

Mirror of forum.nim-lang.org

3237 :: Using this over tables?