nimforum mirror - Why newSeq, but initTable, initSet, etc.?

omaranto [2015-03-06T01:14:04+01:00]

Is there any reason for the inconsistency in naming? If not, has there been any thought of changing the names to make them more uniform?

(Of course, this hardly matters at all and is not really a problem, but it would feel nicer if the names were consistent.)

Jehan [2015-03-06T01:17:17+01:00]

newT is for functions that return references, initT for functions that return values.

Note how there's both a newTable() and an initTable().
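A minimal sketch of that difference, assuming only the standard tables module (the variable names are just for illustration):

```nim
import tables

# initTable returns a Table by value: assignment makes an independent copy.
var a = initTable[string, int]()
a["x"] = 1
var b = a
b["x"] = 2        # does not affect a

# newTable returns a TableRef: assignment copies only the reference.
var c = newTable[string, int]()
c["x"] = 1
var d = c
d["x"] = 2        # c sees the change, because c and d share one table

echo a["x"]       # 1
echo c["x"]       # 2
```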

def [2015-03-06T01:20:15+01:00]

But seqs have value semantics, not ref semantics.

Jehan [2015-03-06T01:32:19+01:00]

def: But seqs have value semantics, not ref semantics.

They can have either.

var a = @[1]
shallow a
var b = a
inc a[0]
echo a
echo b

Araq [2015-03-06T01:32:32+01:00]

newString and newSeq predate this naming convention.

filcuc [2015-03-06T01:37:43+01:00]

And what about object constructors that should be called by subclasses? Is there a proper naming convention and signature?

omaranto [2015-03-06T02:14:47+01:00]

Thanks, Jehan. I didn't know about the reference naming convention or about shallow.

Also, thanks Araq, I had figured it was probably historical accident for newSeq, but it's good to hear it straight from the horse's mouth.

didlybom [2015-03-06T11:56:05+01:00]

Araq: newString and newSeq predate this naming convention.

Aren't there any plans to change these (or at least provide initSeq and initString aliases for them and mark the original ones as deprecated)?

I guess that would break pretty much all Nim code out there, but fixing it shouldn't be that hard...

novist [2015-03-06T15:43:40+01:00]

See, I knew this would get confusing, and indeed it does. Maybe it's a good time to reiterate the need for a standardized way to construct/initialize objects?

Araq [2015-03-06T15:59:02+01:00]

@novist

Your macro stuff still distinguishes between ref and object and is just as "confusing".

novist [2015-03-06T16:27:58+01:00]

@Araq: How is it confusing? My macro actually does not distinguish between ref and object types. It uses object types to construct both objects and refs. Yes, there are two macros doing two different things; however, it's immediately obvious what is being created, because each one says explicitly what is constructed while using the same type. I do not understand the purpose of having to define an object type and then a ref type of that object. But hey, it's just a preference. I could argue that having PType or TypePtr or TypeRef is much more confusing than calling an explicit initialization operator that says "you are making dang ref yo". And finally, the real problem is actually the lack of an implicit constructor call that could allocate and set the type up if there is a constructor proc defined. And I agree that a macro is not a solution. Not by a long shot. It's more like a neat workaround to fake something that is missing. I'm not sure why you seem reluctant to give this problem some thought and love. It surely is possible to do this in the least intrusive and most elegant way.

jboy [2015-03-09T08:53:47+01:00]

[Note: This post (and this thread) is about consistent naming schemes for already-existing, explicitly-invoked creation functions in the Nim library. I consider this a separate issue from that of automatic constructors and new constructor-specific keywords, which is discussed in this other thread: http://forum.nim-lang.org/t/703/5 ]

When I look through the system, tables and sets modules, it seems to me that part of the source of confusion is that there are actually 8 distinct examples of functionality, covered by just 2 naming convention prefixes, new and init (and additionally, as noted, newSeq and newString predate the naming convention):

1. Allocate a new object of type T on the heap, zero it, and assign a ref to this object into a var parameter that will modify a pre-declared variable.

2. Allocate a new object of type T (specified by typedesc) on the heap, zero it, and return a ref to this object.

3. Return a container instance "by value" (ie, with value semantics) that has been pre-sized to a certain size (and the entries have been zeroed), eg, newSeq[T](len).

4. Return a container instance "by value" (ie, with value semantics) that has been pre-sized to a certain size (but the entries are uninitialised), eg, creation of strings using newString(len).

5. Allocate a new object on the heap and initialise it with the supplied information and return a ref to it, eg, newException[](...).

6. Allocate a new container instance that is initialised as empty and return a ref to it, eg, newTable.

7. Create a new container instance that is initialised as empty and return it "by value", eg, initTable, initSet.

8. Initialise a new object of type T that has been passed in as a var parameter, eg, init(var HashSet).
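As a rough illustration of a few of these existing forms (Node is a hypothetical type used only for this sketch; the procs called are the ones from the system and sets modules as I understand them):

```nim
import sets

type Node = object        # hypothetical example type
  value: int

var p: ref Node
new(p)                    # ref assigned into a pre-declared var parameter
let q = new(Node)         # type given by typedesc, ref returned

let s = newSeq[int](3)    # pre-sized seq returned by value, entries zeroed

var hs: HashSet[int]
init(hs)                  # initialise a variable passed in as a var parameter
hs.incl(7)

echo p.value, " ", q.value   # 0 0
echo s                       # @[0, 0, 0]
echo hs.len                  # 1
```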

These 8 examples of functionality can be partitioned into 4 general groups:

G1: Allocate a new object on the heap (by type), default-initialise it, and return a ref.

G2: Allocate a new object on the heap, initialise it with the supplied arguments, and return a ref.

G3: Create a new object, initialise it with the supplied arguments and return it by value.

G4: Initialise the object in a pre-declared variable that is passed as the first argument (whether the variable is a ref or not).

I think the confusion might be decreased if there were 4 (rather than 2) prefixes used for these 4 groups of functionality. To pick 4 arbitrary but commonly-used prefixes...

new: Allocate a new object on the heap (by type), default-initialise it and return a ref (G1).
For example: var f: ref Foo = new(Foo)

newX: Allocate a new object on the heap, initialise it with the supplied arguments and return a ref (G2).
For example: var f: ref Foo = newFoo(a, b)

createX: Create a new object, initialise it with the supplied arguments and return it by value (G3).
For example: var f: Foo = createFoo(a, b)

init (or initX if more information is necessary): Initialise the object in a pre-declared variable that is passed as the first argument (whether the variable is a ref or not) (G4).
For example (not a ref): var f: Foo; init(f, a, b) or var f: Foo; f.init(a, b).

Alternately (a ref): var f: ref Foo; init(f, a, b) or var f: ref Foo; f.init(a, b)
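A small sketch of how the four prefixes might look for a user-defined type. Foo, newFoo, createFoo and this init overload are all hypothetical names following the proposed scheme; none of them exist in the standard library:

```nim
type Foo = object          # hypothetical example type
  a, b: int

proc newFoo(a, b: int): ref Foo =
  # G2: allocate on the heap, initialise with the arguments, return a ref
  new(result)
  result.a = a
  result.b = b

proc createFoo(a, b: int): Foo =
  # G3: build the object and return it by value
  Foo(a: a, b: b)

proc init(f: var Foo; a, b: int) =
  # G4: initialise a pre-declared variable passed as the first argument
  f.a = a
  f.b = b

var r0: ref Foo = new(Foo)    # G1: default-initialised, returned as a ref
var r1: ref Foo = newFoo(1, 2)
var v: Foo = createFoo(3, 4)
var w: Foo
w.init(5, 6)

echo r0.a, " ", r1.b, " ", v.a, " ", w.b   # 0 2 3 6
```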

These prefixes should be de-coupled from container-specific allocation behaviour such as "alloc but don't initialise" and "alloc a capacity but initialise as empty" that is provided for strings. These could be controlled by another set of suffixes like NoInit and OfCap. (It might make sense to provide corresponding procs for seqs too.)
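For reference, a short sketch of how the existing string procs differ in the length they report, with newSeq alongside for comparison (variable names are only for illustration):

```nim
var s = newString(3)         # length is already 3
var t = newStringOfCap(16)   # length is 0; only capacity is reserved
t.add("ab")                  # grows into the reserved capacity

var q = newSeq[int](3)       # length 3, entries zeroed

echo s.len    # 3
echo t.len    # 2
echo q        # @[0, 0, 0]
```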

Looking through the docs for the system module, the following name changes would be applied:

1. new[T](a: var ref T)
-> initNew[T](a: var ref T)

2. new[](T: typedesc): ref T:type
-> new[](T: typedesc): ref T:type (No change)

3. new[T](a: var ref T; finalizer: proc (x: ref T))
-> initNew[T](a: var ref T; finalizer: proc (x: ref T))

4. newSeq[T](s: var seq[T]; len: int)
-> init[T](s: var seq[T]; len = 0)

5. newSeq[T](len = 0): seq[T]
-> createSeq[T](len = 0): seq[T]

6. newString(len: int): string
-> createStringNoInit(len: int): string

7. newStringOfCap(cap: int): string
-> createStringOfCap(cap: int): string

8. newException[](exceptn: typedesc; message: string): expr
-> newException[](exceptn: typedesc; message: string): expr (No change)

Likewise, looking through the tables module:

9. newTable[A, B](initialSize = 64): TableRef[A, B]
-> newTable[A, B](initialSize = 64): TableRef[A, B] (No change)

(Note: It remains as newTable rather than createTable, because the type TableRef that is returned "by value" is transparently a ref to a Table.)

10. initTable[A, B](initialSize = 64): Table[A, B]
-> createTable[A, B](initialSize = 64): Table[A, B]

And finally, looking through the sets module:

11. init[A](s: var HashSet[A]; initialSize = 64)
-> init[A](s: var HashSet[A]; initialSize = 64) (No change)

12. initSet[A](initialSize = 64): HashSet[A]
-> createSet[A](initialSize = 64): HashSet[A]

(Whew, that ended up longer than expected...)

Araq [2015-03-09T10:44:44+01:00]

@jboy I have nothing against your proposal, but I cannot see how this makes the language easier to learn in practice and you haven't even covered the open (open file...) category of constructors. And again, after you constructed a type, you have to use it. This means you need to rely on your IDE to explore it or read the docs or you remember how it works. Yes, sometimes you have to learn things.

jboy [2015-03-09T14:50:22+01:00]

I would say that this makes the language easier to learn because it defines standard, consistent "verbs" for the various procs in the standard library, which programmers can rely upon when encountering new modules, and which make it easier to learn the language due to consistencies that can be learned (and that will ultimately become idiomatic, just like len to obtain the length of a collection).

As I see it, this result is a combination of 3 specific effects:

1. There are recognisable idiomatic verbs shared consistently between different types, to operate upon those types in the same way. This is just like using len for lengths (Why len and not sometimes length or size or getLength? For consistency!), or add for appending elements/chars to seqs/strings, or read & write for getting/putting bytes/chars/etc from files/buffers/etc.

2. When you see create vs new, you can guess what it will be returning, even if you're unfamiliar with this particular proc. Likewise, if you want to return a ref rather than a value, you can guess that the proc you want will begin with new rather than create.

3. Finally, it makes the calling conventions unambiguous: When you see create or new, you know it will return the new value to you; when you see init, you know you supply the uninitialised variable as the first argument.

Yes, there's always learning, but there's learning idioms & recognisable patterns vs learning inconsistencies (because "that's just the way it is") and not being able to rely upon your intuition when there's an unknown proc (or you're trying to guess which proc you should use).

As to the open category of constructors, I don't see any problems extending this naming scheme beyond data-structures to include file-constructors.

1. The open/close idiom, to obtain & release I/O resources (files, file descriptors, sockets) is so familiar that I think it makes sense to retain open & close as keyword components in proc names -- much like new is the idiomatic keyword to allocate an object of type T on the heap.

2. I suspect that renaming the familiar open(filename, ...): File to something like openFile would cause more confusion than it would solve. That said, perhaps openFile would make sense as an alias that adheres to the naming scheme, for the form of open that doesn't take a var File parameter.

So open(filename: string; mode: FileMode = fmRead; bufSize: int = -1): File would gain an alias openFile(filename: string; mode: FileMode = fmRead; bufSize: int = -1): File.

This is analogous to newTable that allocates you a new Table.

And to open a socket, it would be openSocket, etc.

3. For the two forms of open that take a var File parameter, I would suggest that their names should begin with init for consistency with all the other init-an-uninitialized-variable procs.

So open(f: var File; filename: string; mode: FileMode = fmRead; bufSize: int = -1): bool should become initOpen(f: var File; filename: string; mode: FileMode = fmRead; bufSize: int = -1): bool.

No need for File in the name, because it's already known unambiguously from the first parameter.

This is similar to my previous suggestion of initNew[T](a: var ref T) from new[T](a: var ref T). init for calling convention + open/new for the constructor operation.

An argument could also be made for just init rather than initOpen, by analogy with init[T](s: var seq[T]; len = 0). But I think that the analogy of new -> initNew is a better analogy for open.
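To make the two suggested names concrete, here is a sketch of them as thin wrappers over the existing open procs. openFile and initOpen are hypothetical (they are not in the system module), and example.txt is just a placeholder file name:

```nim
# Hypothetical alias for the form of open that returns the File.
proc openFile(filename: string; mode: FileMode = fmRead;
              bufSize: int = -1): File =
  open(filename, mode, bufSize)

# Hypothetical alias for the form of open that fills in a var File
# and reports success as a bool.
proc initOpen(f: var File; filename: string; mode: FileMode = fmRead;
              bufSize: int = -1): bool =
  open(f, filename, mode, bufSize)

var f: File
if initOpen(f, "example.txt", fmWrite):   # placeholder file name
  f.writeLine("hello")
  f.close()
  let g = openFile("example.txt")
  echo g.readLine()
  g.close()
```

The calling conventions stay as described above: openFile hands back the new File (and raises on failure, like the open it wraps), while initOpen takes the uninitialised variable as its first argument.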

I searched through the system module docs, for more occurrences of the string : var, but I didn't see any more constructor categories... If there are any that I've missed, please point them out.

I did think that setShallow[T](s: var seq[T]) would make more sense than shallow[T](s: var seq[T]) (because then you have setLen, setShallow, etc.), but that's tangential.

Mirror of forum.nim-lang.org
