Two versions of an LRU cache: the first one cobbled together from bits and pieces from the web a few years ago, the second one from this morning, when I thought: why not use OrderedTable for this? Now I don't know whether I should be disappointed, whether I missed something essential, or whether it was just a bad choice. The OrderedTable one is rather slow.
Any insight is appreciated.
Benchmark: rendering 512x512 cells of Worley noise, caching nearby neighbour feature points:
cache size    lrucache1 time (s)    lrucache2 time (s)
         1           0.992988300           1.065394100
         2           0.948481100           1.096497500
         3           0.974308700           1.098379700
         4           0.956377000           1.172149600
         5           0.973190400           1.224398200
        10           0.461624800           1.188621100
        20           0.468986500           1.604642700
        50           0.468143900           3.018123300
       100           0.473003100           5.737862600
      1000           0.481897400           7.992100700
      5000           0.470832200           7.604643100
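(For context, here is a sketch of the kind of timing loop that could produce numbers in this shape; the Worley noise renderer is replaced by a synthetic access pattern, and the module name, key scheme and use of getMonoTime are assumptions, not the actual benchmark code.)
---%<------%<------%<---
import times, monotimes
import lrucache1          # assumed module name for the first version

# Synthetic stand-in for the Worley noise pass: neighbouring cells map to
# the same key, so nearby lookups hit the cache.
proc bench(capacity: int) =
  let cache = newLRUCache[int, float](capacity)
  let t0 = getMonoTime()
  for y in 0 ..< 512:
    for x in 0 ..< 512:
      let key = (x div 8) * 64 + (y div 8)
      let (_, found) = cache.get(key)
      if not found:
        cache.put(key, float(key) * 0.5)   # placeholder "feature point"
  echo capacity, ": ", getMonoTime() - t0

for size in [1, 2, 3, 4, 5, 10, 20, 50, 100, 1000, 5000]:
  bench(size)
---%<------%<------%<---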
lrucache1.nim ---%<------%<------%<---
import tables, lists

type
  LRUCache*[K, V] = ref object
    capacity: int
    cache: Table[K, V]
    use: DoublyLinkedList[K]                # Track use order
    nodeMap: Table[K, DoublyLinkedNode[K]]  # Maps keys to list nodes

proc newLRUCache*[K, V](capacity: int): LRUCache[K, V] =
  result.new()
  result.capacity = capacity
  result.cache = initTable[K, V]()
  result.use = initDoublyLinkedList[K]()
  result.nodeMap = initTable[K, DoublyLinkedNode[K]]()

proc get*[K, V](cache: LRUCache[K, V], key: K): tuple[value: V, found: bool] =
  if cache.cache.hasKey(key):
    let node = cache.nodeMap[key]
    # Update use, move to front of list
    cache.use.remove(node)
    cache.use.prepend(node)
    return (cache.cache[key], true)
  else:
    var defaultVal: V
    return (defaultVal, false)

proc put*[K, V](cache: LRUCache[K, V], key: K, value: V) =
  # If key already exists, update it and move to front
  if cache.cache.hasKey(key):
    cache.cache[key] = value
    let node = cache.nodeMap[key]
    cache.use.remove(node)
    cache.use.prepend(node)
    return
  # Prune
  if cache.cache.len >= cache.capacity:
    let lruKey = cache.use.tail.value
    cache.cache.del(lruKey)
    cache.nodeMap.del(lruKey)
    cache.use.remove(cache.use.tail)
  cache.cache[key] = value
  let node = newDoublyLinkedNode[K](key)
  cache.use.prepend(node)
  cache.nodeMap[key] = node
---%<------%<------%<---
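Both versions expose the same API, so usage looks like this (a minimal sketch outside the benchmark; the module name and the key/value types are arbitrary):
---%<------%<------%<---
import lrucache1   # assumed module name for the first version

let cache = newLRUCache[string, int](capacity = 2)
cache.put("a", 1)
cache.put("b", 2)
discard cache.get("a")   # "a" is now the most recently used
cache.put("c", 3)        # evicts "b", the least recently used key
echo cache.get("b")      # (value: 0, found: false)
echo cache.get("a")      # (value: 1, found: true)
---%<------%<------%<---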
lrucache2.nim ---%<------%<------%<---
include tables

type
  LRUCache*[K, V] = ref object
    capacity: int
    cache: OrderedTable[K, V]

proc newLRUCache*[K, V](capacity: int): LRUCache[K, V] =
  new(result)
  result.capacity = capacity
  result.cache = initOrderedTable[K, V]()

proc get*[K, V](cache: LRUCache[K, V], key: K): tuple[value: V, found: bool] =
  if cache.cache.hasKey(key):
    let value = cache.cache[key]
    # Move to most recently used position
    cache.cache.del(key)
    cache.cache[key] = value
    return (value, true)
  else:
    var defaultVal: V
    return (defaultVal, false)

proc put*[K, V](cache: LRUCache[K, V], key: K, value: V) =
  # If key exists, remove it first so it gets added at the end
  if cache.cache.hasKey(key):
    cache.cache.del(key)
  # If at capacity, remove the oldest item, first in ordered table
  if cache.cache.len >= cache.capacity: # and cache.cache.len > 0:
    let oldestKey = cache.cache.data[cache.cache.first].key
    cache.cache.del(oldestKey)
  # (re)add item
  cache.cache[key] = value
---%<------%<------%<---

FWIW both versions look terrible to me. ;-)
Use sink and lent annotations, and use default(T) for clarity. Use object construction instead of new.
But yeah, you probably need a real algorithm, not some hack on top of OrderedTable, which wasn't designed for this: OrderedTable.del is O(n), and lrucache2 pays that cost on every cache hit and on every insert at capacity, which is why it gets slower as the cache grows.
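Roughly, applied to lrucache1 (untested, just an illustration of those suggestions; lent is left out because get returns a copy here anyway):
---%<------%<------%<---
import tables, lists

type
  LRUCache*[K, V] = ref object
    capacity: int
    cache: Table[K, V]
    use: DoublyLinkedList[K]
    nodeMap: Table[K, DoublyLinkedNode[K]]

proc newLRUCache*[K, V](capacity: int): LRUCache[K, V] =
  # Object construction instead of new() plus field-by-field assignment.
  LRUCache[K, V](capacity: capacity,
                 cache: initTable[K, V](),
                 use: initDoublyLinkedList[K](),
                 nodeMap: initTable[K, DoublyLinkedNode[K]]())

proc get*[K, V](cache: LRUCache[K, V], key: K): tuple[value: V, found: bool] =
  if key in cache.cache:
    let node = cache.nodeMap[key]
    cache.use.remove(node)
    cache.use.prepend(node)
    return (cache.cache[key], true)
  else:
    # default(V) instead of an uninitialized var
    return (default(V), false)

proc put*[K, V](cache: LRUCache[K, V], key: K, value: sink V) =
  # sink lets the caller move the value in instead of copying it.
  if key in cache.cache:
    cache.cache[key] = value
    let node = cache.nodeMap[key]
    cache.use.remove(node)
    cache.use.prepend(node)
    return
  if cache.cache.len >= cache.capacity:
    let lruKey = cache.use.tail.value
    cache.cache.del(lruKey)
    cache.nodeMap.del(lruKey)
    cache.use.remove(cache.use.tail)
  cache.cache[key] = value
  let node = newDoublyLinkedNode[K](key)
  cache.use.prepend(node)
  cache.nodeMap[key] = node
---%<------%<------%<---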
Hi ingo,
your algo is the by-the-book version of an LRU cache. It's hard to make concurrent because of the doubly linked list. A more modern approach was presented by Hiroshi Inoue under the name Multi-step LRU. It uses SIMD and should perform much better. I did a Nim version for 64-bit keys and values, but it can be adapted to whatever you want to put into the cache.
greets Bosinski
Use sink [...]
I added sink to the K's and V's and to capacity: int. On the simplest noise, which is influenced more heavily by the cache, it is ~6% slower with sink. Is that to be expected?
@Bosinski I'll have a look at that. Thanks.