Consider this:
import tables, strtabs, lists

type
  KnowledgeGraph = ref object
    nodes: TableRef[string, StringTableRef]
    links: TableRef[string, seq[SinglyLinkedNode[string]]]
Is it necessary to use `TableRef`s and other `ref` usage in the fields of a ref object, or does everything already get passed by reference? I want to learn how to build a graph database/query system that works with a lot of data. From what I read, I should be passing an object like this by reference rather than by value.
Thoughts?
coming across this in the docs
Actually I am not sure if I explained that well enough in my book, at least I tried, see e.g. https://ssalewski.de/nimprogramming.html#_references_to_objects. Have not read the book of Mr. Rumpf yet, maybe he explains it better. Picheta does not.
TableRef and other ref usage in the fields of a ref object
My general advice is to use ref objects only when really needed, e.g. when you build tree-like structures with many-to-one references, or mesh-like structures such as a spatial triangulation (Delaunay). Use of ref is in no way related to the performance of passing parameters to procs, as Nim automatically passes larger objects by hidden pointers. And inside a ref object, you don't have to use refs for the fields. For a Table, you would use a TableRef when you need reference semantics, that is, when you do not want assignment to copy the content. Always think of C pointers when using refs: where you would really have to use pointers in C for a complex data structure, you would use references in Nim. Nim references are managed pointers, so you do not have to care about freeing the memory after use.
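To make the difference between a plain Table field, a Table value and a TableRef concrete, here is a minimal sketch. The type name `Graph` and all variable names are made up for illustration; only `std/tables` is assumed.

```nim
import std/tables

type
  Graph = ref object            # hypothetical type, for illustration only
    nodes: Table[string, int]   # plain value-type Table inside a ref object

let a = Graph(nodes: initTable[string, int]())
let b = a                       # copies only the reference; a and b share one object
b.nodes["x"] = 1
doAssert a.nodes["x"] == 1      # the change is visible through a

var t1 = {"x": 1}.toTable
var t2 = t1                     # Table has value semantics: t2 is an independent copy
t2["y"] = 2
doAssert "y" notin t1

let r1 = newTable[string, int]()
let r2 = r1                     # TableRef has reference semantics: one shared table
r2["z"] = 3
doAssert "z" in r1
```

The field inside `Graph` does not need to be a TableRef for the graph itself to be shared; a TableRef field would only matter if the table had to be shared independently of the object that holds it.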
Have not read the book of Mr. Rumpf yet, maybe he explains it better.
I doubt it, I never think about it.
Note: var parameters are never necessary for efficient parameter passing. Since non-var parameters cannot be modified the compiler is always free to pass arguments by reference if it considers it can speed up execution.
It is a strange thing to be told that to understand something in Nim you first have to go and understand it in another language.
Yes, I agree, and that was the reason that in 2019 I decided to write a book for kids and beginners. I still think that it was a good idea to write it, and I think that Nim is in principle a good language for beginners. On the other hand, some people said, that most Nim users just came from other languages, and that my book explains simple things in too much detail. And most beginners just refuse to read the book, some even refuse to read the official tutorial or the other tutorials, and think beginners can learn Nim programming just by asking in IRC. Maybe the reason is, that for a long time Nim was just sold as a better Python, and that Nim looks, and mostly is, so easy. For C++, Rust or Haskell, no one would doubt that reading books is a good idea.
We avoid ref object unless semantics explicitly require it for the type (ie a linked list or a tree) - if we need ref semantics, we annotate the sites where it's used instead (ref X vs X = ref object).
https://status-im.github.io/nim-style-guide/language.refobject.html
In this specific example, we would not have used TableRef or ref object. Instead, we would have made a regular object with Table inside, then used ref KnowledgeGraph if we wanted to share a graph between multiple owners. Shared mutable ownership is problematic in its own right, so again, unless there's a specific reason, we tend to avoid that too.
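A rough sketch of that layout, assuming simplified field types (plain nested tables and a seq of strings instead of the StringTableRef and linked-list nodes from the question). The procs `addLink` and `linkCount` are hypothetical helpers, not part of any library.

```nim
import std/tables

type
  KnowledgeGraph = object       # plain value type, no ref
    nodes: Table[string, Table[string, string]]
    links: Table[string, seq[string]]

proc addLink(g: var KnowledgeGraph; src, dst: string) =
  # var is needed here only because the proc mutates g
  g.links.mgetOrPut(src, newSeq[string]()).add dst

proc linkCount(g: KnowledgeGraph; src: string): int =
  # read-only parameter; neither ref nor var is needed
  g.links.getOrDefault(src).len

var g = KnowledgeGraph()
g.addLink("a", "b")
g.addLink("a", "c")
doAssert g.linkCount("a") == 2

# Opt in to sharing at the use site with ref KnowledgeGraph
let shared = new(KnowledgeGraph)   # has type ref KnowledgeGraph
shared[].addLink("x", "y")
let alias = shared                 # second owner of the same graph
doAssert alias[].linkCount("x") == 1
```

The type itself stays a value type; only the code that actually wants shared ownership pays for the indirection and spells it out as `ref KnowledgeGraph`.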
The biggest downside of this approach is that you often end up having to use ref more than the code would require - in particular when you need read-only reference semantics - there's no "good" way of getting a read-only reference in Nim (ie ref Xxx is always mutable).
Nim automatically passes by reference when it's faster than passing by copy
Are you sure? One or two years ago that was not the case. And how could Nim do that? You would need to collect and analyse runtime statistics about usage patterns and then decide whether passing by value (copying) should be changed to passing by reference. Can Nim do that now?
My question and objections were to this line:
Nim automatically passes by reference when it's faster than passing by copy, so no you do not need to always use ref.
It seems that while in some cases the compiler may optimise how data is passed around, in other cases it does not. For practical purposes it basically means that you still have to use ref.
use ref even if you don't need its semantic.
Actually, we have the byref and bycopy pragmas to change how the compiler passes proc parameters:
https://nim-lang.org/docs/manual.html#foreign-function-interface-bycopy-pragma
We have to attach these to the object/tuple, so we can not modify the behaviour for individual procs, but at least for a data type.
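As a small illustration of attaching these pragmas to a type, here is a sketch using the `byref` and `bycopy` pragmas documented in the manual section linked above. The types `BigValue` and `SmallValue` and the procs are invented for this example; the pragmas only influence how arguments are passed, not what the code means.

```nim
type
  BigValue {.byref.} = object     # hypothetical type; passed via hidden pointer
    data: array[1024, int]
  SmallValue {.bycopy.} = object  # hypothetical type; passed by copy
    x, y: int

proc sum(b: BigValue): int =
  # b is read-only here regardless of how it is passed
  for v in b.data: result += v

proc total(s: SmallValue): int = s.x + s.y

var big: BigValue
big.data[0] = 40
big.data[1] = 2
doAssert sum(big) == 42
doAssert total(SmallValue(x: 1, y: 2)) == 3
```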
In some cases the compiler optimises how data is passed around, in other cases it does not. For practical purposes it means that you can't just declare data as a value and let the compiler figure out how to pass it around efficiently.
All it means is that you didn't understand it. Others can use the system successfully, in practice.
The only thing that you need to understand is that explicit intermediate results do materialize and they have to for logical reasons:
var x = tab["key"].field["keyB"] # copies because
x += 4 # does not mutate tab["key"].field["keyB"]
This is how numbers work in every commonly used language including Python and JavaScript. It's just that Nim is more consistent about it and treats an object like an int. But I know that nobody really cares about "consistency" when everybody comes from existing languages which are not consistent; the only thing that really matters is familiarity.
There is an optimizer that optimizes out the materialization but it's rather recent and is still getting better at it. The optimizer cannot change the semantics though, so if you mutate afterwards, you will get a copy. And you will always get a copy; it's just that with a ref object only the pointer is copied, so that's why it's rather cheap and mutations do "write through".
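A runnable sketch of the copy versus "write through" behaviour described in this post, using two made-up types, `Counter` (a value object) and `CounterRef` (a ref object), stored in tables:

```nim
import std/tables

type
  Counter = object          # hypothetical value type
    n: int
  CounterRef = ref object   # hypothetical ref type
    n: int

var valTab = {"key": Counter(n: 1)}.toTable
var x = valTab["key"]       # materializes a copy of the Counter
x.n += 4
doAssert valTab["key"].n == 1    # the table entry is unchanged

valTab["key"].n += 4        # mutate in place, without an intermediate variable
doAssert valTab["key"].n == 5

var refTab = {"key": CounterRef(n: 1)}.toTable
var y = refTab["key"]       # copies only the pointer
y.n += 4
doAssert refTab["key"].n == 5    # the mutation writes through
```

To update a value-type entry, mutate it through the table accessor directly rather than through a separately declared variable.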
@Araq, thanks for the explanation.
The optimizer cannot change the semantics though so if you mutate afterwards, you will get a copy.
Yes, but if you don't mutate, a copy-on-write optimisation can do the trick and allow large value objects to be passed around by reference without breaking value semantics.
But I know that nobody really cares about "consistency" when everybody comes from existing languages which are not consistent; the only thing that really matters is familiarity.
I am learning Nim and I'd say I've recognized this consistency goal in some places and that feels awesome. Like, if things are this way, there is a very thoughtful and deliberate design effort here. ❤️
But if you don't mutate, copy-on-write optimisation may help?
Copy-on-write requires a reference count and Nim's objects don't have any (refs do, but they already offer reference semantics). So the optimizer does it all at compile-time; no reference counter is required. On the one hand, this does imply that it is not as powerful as a runtime mechanism. On the other hand, that makes the optimizer so general that it also optimizes away reference count updates for ref.