nimforum mirror - is it possible to rename a key in an orderedtable while retaining the order?

cricket (orginal) [2020-08-29T03:30:44+02:00] view original

hey ^^)/ just a question, is it possible to rename a key in an orderedtable while retaining the order?

here's what i tried first,

import tables

var animals = {"cat": "woof", "cow": "moo", "opossum": "hisssss"}.toOrderedTable
# i would like this order to stay

var tmp: string
discard animals.pop("cat", tmp)
animals["dog"] = tmp

echo $animals
# however, order is now cow, opossum, dog instead of dog, cow, opossum

what i thought i could do is split at that key, add the new key name and then append the other half of the split... i don't know how i'd go about doing this though

any help / comments would be appreciated

jibal (orginal) [2020-08-29T07:43:42+02:00] view original

Your code doesn't do anything like what you say you want to do; it simply removes the "cat" item and adds a "dog" item, which of course goes at the end.

The OrderedTable API does not offer mutable access to the keys, so there's really no way to do what you want. You could do something like this, but since the hash value isn't updated it will most likely break when more items are added to the table, so don't do this:

import hashes, tables

type StringBox = ref object
    s: string

proc box(s: string): StringBox = StringBox(s: s)

proc `$`(sb: StringBox): string = sb.s
#proc hash(sb: StringBox): Hash = hash(sb.s)

let catbox = "cat".box

var animals = {catbox: "woof", "cow".box: "moo", "opossum".box: "hisssss"}.toOrderedTable

catbox.s = "dog"
echo $animals

what i thought i could do is split at that key, add the new key name and then append the other half of the split... i don't know how i'd go about doing this though

Seems simple enough:

import tables

var animals = {"cat": "woof", "cow": "moo", "opossum": "hisssss"}.toOrderedTable

proc replaceKey[K, V](tab: OrderedTable[K, V], old, new: K): OrderedTable[K, V] =
  for (k, v) in tab.pairs:
    result.add(if k == old: new else: k, v)

echo $replaceKey(animals, "cat", "dog")

Note that, if the same key appears more than once, this will replace all occurrences.

miran (orginal) [2020-08-29T08:16:01+02:00] view original

Note that, if the same key appears more than once, this will replace all occurrences.
... result.add ....

add for tables is (finally) deprecated and IMO shouldn't be even mentioned to beginners. You should use []= instead so you don't introduce duplicate keys unintentionally. (Note that toOrdererdTable uses []= internally, so no duplicate keys there)


proc replaceKey[K, V](tab: OrderedTable[K, V], old, new: K): OrderedTable[K, V] =
  for k, v in tab.pairs:
    if k == old:
      result[new] = v
    else:
      result[k] = v

cricket (orginal) [2020-08-29T09:46:00+02:00] view original

thank you jibal;

im aware that my code doesn't do what i meant it to do, it was more of.. (an example?) because i dont think im so good at wording what i was trying to do haha

re: "This seems like an odd thing to want to do, though, if you're going to do it a lot--it will be slow and defeats the point of using a table. You might just want a seq of pairs." this makes sense to me; i think i was approaching what i was trying to do the wrong way

thank you again for the helpful response!

jibal (orginal) [2020-08-29T10:21:54+02:00] view original

Note that there are insertion-order data structures other than seq and OrderedTable that might perform better, depending on the details of your usage. One possibility is a Table[string, int] where the values are the indices into a seq ... when you add an entry, you add the value to the seq and add its index and key to the Table. To change the key, just remove the old key from the Table and insert the new one with the same value (seq index). So your example is then actually pretty close to what you want ...

type MyOrderedTable[K,V] = object
  keyToIndex: Table[K, int]
  vals: seq[V]

var animals: MyOrderedTable[string, string]

# [proc add(...) and initialization of animals omitted]

var index: int
if animals.keyToIndex.pop("cat", index):
  animals.keyToIndex["dog"] = index

for key, index in animals.keyToIndex:
   echo (key: key, val: animals.vals[index])

cblake (orginal) [2020-08-29T12:44:54+02:00] view original

@cricket - also note that perhaps non-intuitively (but well documented) proc pop*[A, B](t: OrderedTable, key: A,...) and its related del are O(n) operations where n here is the size of the t in question. Fixing that requires the intrusively merged linked list to go from singly- to doubly- linked and @Araq does not like to charge "non-delete workload" cases for things only needed by mixed workloads. (In this case the charge is space for the 2nd link.)

So, you should definitely use @jibal's fine final approach with a Table of indices into a seq if you don't need to iterate over keys in insert-order, but you want to iterate over the values in insert-order.

You can also store only the index in the Table and have a seq[(K,V)] which is often called a "compact table", but then you probably need to write your own hash table impl that will do the delete-re-insert in the "index/hash part" without disturbing the seq[(K,V)] part that is keeping insert order for you. I have an impl of this at https://github.com/c-blake/adix/blob/master/adix/olset.nim (I am in the middle of a kind of big re-org/re-write of all that stuff this weekend/early next week, but that code might help get you started.) That does not yet have an editKey operation yet, though it would be possible with this kind of set up. (Honestly, that operation is more rarely provided than add/allValues duplicate key stuff.) There is also https://github.com/LemonBoy/compactdict which may be more easy to amend/understand.

cblake (orginal) [2020-08-29T13:09:41+02:00] view original

Oh, and it maybe bears mentioning that if you are doing your own table impl (Nim stdlib does not expose data/a way to get the index and it would also change on any resize event) then you could also have a seq[(V,int)] where the integer is an index in the "table part" for the key going with the value.

The short answer to all this is what you are doing in a broader context probably matters to what you should do for this operation.

Araq (orginal) [2020-08-30T15:47:22+02:00] view original

Fixing that requires the intrusively merged linked list to go from singly- to doubly- linked and @Araq does not like to charge "non-delete workload" cases for things only needed by mixed workloads. (In this case the charge is space for the 2nd link.)

That's true but I think a more radical change of the implementation can give us O(1) deletions without the memory overhead. Haven't really thought about it though.

cblake (orginal) [2020-08-30T16:56:52+02:00] view original

That's also true. :-)

You probably mean the aforementioned OLSet / CompactTable which do that. That can get a lot of space savings since the sparse hash table part can be an array of much smaller objects - just the index.

Those smaller objects should really become a little larger (index, truncatedHashCode). You only need enough bits of the hash code to be a good comparison prefix to make loading from the seq[(K,V)] for a K compare only happen when it is very likely necessary - at the end of a "hit" and never for a "miss". Without such a trick, every probe must load from both arrays. Not sure when I'll get to that.

Mirror of forum.nim-lang.org

6744 :: is it possible to rename a key in an orderedtable while retaining the order?