nimforum mirror - Pointer types

andrea [2015-02-02T16:40:28+01:00]

I am a little confused about the various types of (non-managed) pointers.

Apparently there are both ptr and ptr A - here I guess that ptr = ptr A (for some A).

Oddly enough, I can also declare ptr[A] and I am not sure whether this is different from ptr A, and why the type ptr has two different ways to declare a type parameter, unlike other types.

Now, when browsing the documentation on memfiles I have found the type pointer, which seems to be incompatible with ptr (I just got an error in a place where I had tried mixing the two).

Can anyone explain the differences, if any, between all these pointers?

Araq [2015-02-02T16:55:54+01:00]

ptr without anything is a typeclass, perhaps it would have been better to use SomePtr for that instead. ptr[T] is the same as ptr T, pointer is just the untyped pointer, like C's void*.
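A minimal sketch of these three cases, assuming a current Nim compiler. The names IntPtrA, IntPtrB and isSet are hypothetical and only serve the illustration:

```nim
type
  IntPtrA = ptr int      # prefix form
  IntPtrB = ptr[int]     # bracket form, names the same type

# Bare `ptr` is a type class: this proc accepts any `ptr T`.
# (isSet is a hypothetical helper, not part of the standard library.)
proc isSet(p: ptr): bool = not p.isNil

var n = 42
let typed: IntPtrA = addr n
let raw: pointer = typed           # ptr T converts implicitly to pointer
let back = cast[IntPtrB](raw)      # pointer -> ptr T needs an explicit cast

doAssert IntPtrA is IntPtrB
doAssert isSet(typed)
doAssert not compiles(isSet(raw))  # the untyped pointer is not a `ptr T`
doAssert back[] == 42
```

The compiles check is the interesting line: passing the untyped pointer where a ptr is expected is rejected, which is presumably the kind of error hit when mixing the two.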

andrea [2015-02-02T17:06:14+01:00]

Ok, thanks, that clears it up. Maybe this should be copied into the section "Reference and pointer types" of the manual.

jboy [2015-02-11T14:41:36+01:00]

Hi, I hope this isn't bikeshedding --

I'm still getting comfortable with type classes vs parameter constraints. (I'm not even looking at the still-in-development user-defined type classes yet.)

A suggestion/proposal/request:

As a beginner, it would be somewhat easier for me to learn & grok the different keywords & their uses if the built-in type classes had names like PtrType, SeqType, ObjectType, TupleType, rather than overloading ptr, seq (which I expect to be generic types with a type parameter) and object, tuple (which I expect to be type-definition keywords). The manual already contains an example that uses this style: type RecordType = tuple or object.

I see there is also a SomeInteger type class. I would find IntegerType (or SomeIntegerType, but it's probably longer than necessary) more helpful as a type class name; SomeInteger makes me think of an integer value rather than an integer type. (And the macros module uses "some" in the same way, eg someProc for a parameter that takes a proc.)

A question:

In the section on type classes, is there any difference between use of or in type RecordType = tuple or object vs the use of | in proc onlyIntOrString[T: int|string](x, y: T) = discard, other than their different usage contexts?

(I can see that or is used in a type class definition, while | is used in a generic parameter type constraint. But the definition of the type class SomeInteger in the system module uses | rather than or.)
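For what it's worth, a small sketch suggesting that or and | are interchangeable in both contexts, assuming a current Nim compiler. The names NumA, NumB and twiceA..twiceD are hypothetical:

```nim
type
  NumA = int or float    # type class written with `or`
  NumB = int | float     # the same type class written with `|`

# Type class used directly as a parameter type.
proc twiceA(x: NumA): auto = x + x
proc twiceB(x: NumB): auto = x + x

# Same alternatives used as a generic parameter constraint.
proc twiceC[T: int or float](x: T): T = x + x
proc twiceD[T: int | float](x: T): T = x + x

doAssert twiceA(2) == 4
doAssert twiceB(1.5) == 3.0
doAssert twiceC(1.5) == 3.0
doAssert twiceD(2) == 4
doAssert not compiles(twiceA("no"))  # string is in neither type class
```

All four procs accept the same argument types here, so the choice between the two spellings looks purely stylistic in this sketch.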

Another question:

Is there any reason why array and seq are listed in the table of type classes, but range is not?

(I assume that all three of them are generics defined in the system module, and thus they all define type classes automatically, by virtue of being generics -- is this understanding correct?)

A final thought / alternative proposal for type classes:

If generics automatically define type classes, then the built-in type classes (hopefully with names like ObjectType, TupleType, IntegerType) will be named in a different style to the type classes for user-defined generic types (Matrix, etc).

For consistency of syntax across all type classes, an alternative approach might be to use a prefix sigil to convert a generic type (like range, ptr, Matrix) or keyword (like object, tuple) to a type class identifier. It could be considered analogous to "escaping" the symbol to become a type class.

Since the programmer might be typing several of these type classes in a row, as they define a new type class or specify a generic parameter type constraint, the sigil should ideally be just a few keystrokes. By analogy with Lisp and Scala, you could use the single-quote, ': type RecordType = 'tuple or 'object.

How would this approach strike people on this forum?

Thanks for your patience with these thoughts & questions...

andrea [2015-02-11T16:03:59+01:00]

It seems to me that in fact it is true that generic types define typeclasses. This seems to arise from auto generics. Let me give a simple example:

type Rational[A] = object
  numer, denom: A

# Build the fraction x/y.
proc `///`[A](x, y: A): Rational[A] =
  Rational[A](numer: x, denom: y)

# Rational without a parameter acts as a type class; a.type is the concrete instantiation.
proc `+`(a, b: Rational): a.type =
  result.numer = a.numer * b.denom + a.denom * b.numer
  result.denom = a.denom * b.denom

let
  x: Rational[int] = 3 /// 5
  y: Rational[int] = 7 /// 8

echo "sum: ", x + y

Notice how I defined the sum. Another possibility would be

proc `+`[A](a, b: Rational[A]): Rational[A] =
  Rational[A](numer: a.numer * b.denom + a.denom * b.numer, denom: a.denom * b.denom)

I think (more experienced people can confirm) that the first version is essentially a shortcut for the second one - that is, generics are inserted automatically.

This makes Rational essentially equivalent to what in Scala would be called Rational[_], which is itself a shortcut for Rational[A] forSome A.

If this is the case, I think that using ptr, object and so on as name for both typeclasses and generic types would be consistent, and in fact would not be a special case, but rather the normal behaviour even for user-defined generic types.

Araq [2015-02-11T16:08:47+01:00]

@andrea: exactly.

OderWat [2015-02-11T16:13:46+01:00]

I am definitely happy that I could escape the Rust 'lifetimes, so I would rather not see a single quote added to the Nim syntax. Let's use § (shift 3) just to illustrate that Nim has strong German(-Keyboard) supporters ;)

jboy [2015-02-11T16:56:38+01:00]

Hi @andrea:

Yes, when I said "by virtue of being generics" in my post, I was referring to this section of the documentation on type classes:

"Furthermore, every generic type automatically creates a type class of the same name that will match any instantiation of the generic type."

My question was whether array, seq and range are fundamentally implemented as generics (and thus fundamentally handled under this rule for generics), or whether they are handled specially as built-in types. (For example, ptr and ref already seem to be handled specially, because you can write ptr T as an alternative to ptr[T].) And either way, why was range not included in the list with array and seq? Was it intentional, or an unintentional omission? Is range different somehow?
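To make the array and seq part concrete, here is a small sketch of the two bare names used as type classes, assuming a current Nim compiler. The procs count and first are hypothetical, and the sketch does not settle how range is treated:

```nim
# `seq` without a parameter matches any seq[T].
proc count(s: seq): int = s.len

# `array` without parameters matches any array, whatever its index range.
proc first(a: array): auto = a[a.low]

doAssert count(@[1, 2, 3]) == 3
doAssert count(@["a"]) == 1
doAssert first([10, 20]) == 10
doAssert first(["x", "y"]) == "x"
```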

And I agree with you that in the current language spec, using the names of generic types as the names of their automatically-defined type classes is consistent with the spec.

My point was that it seems confusing to beginners -- It was confusing to you; it was confusing to me; I suspect it will be confusing to others. I think we can agree that it takes quite a while to read through the language manual and the system module documentation. When a beginner has not yet read & digested the section on type classes, seeing ptr without a type parameter (or any other generic type without a type parameter) will probably be inexplicable & confusing.

Hence, I suggested that it might be more obvious as PtrType or 'ptr. A beginner would see this and understand "Ah, this is something else. I don't need to worry about this yet."

Likewise when type T = object or tuple is encountered, and the beginner is perplexed because neither an object type nor a tuple type is being defined.

I'm just trying to make some helpful suggestions to round off some sharp corners of the language. :)

jboy [2015-02-11T17:01:22+01:00]

@OderWat: I fear you might be too late: Nim already uses single quotes for integer literal type suffixes... :)
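A quick sketch of those literal suffixes, assuming a current Nim compiler (the variable names are only illustrative):

```nim
let small = 5'i8      # int8 literal
let wide = 5'u64      # uint64 literal
let real = 5.0'f32    # float32 literal

doAssert typeof(small) is int8
doAssert typeof(wide) is uint64
doAssert typeof(real) is float32
doAssert sizeof(small) == 1
```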

OderWat [2015-02-11T17:10:02+01:00]

@jboy I was unsure whether I had seen something like that. I guess my brain was struck by pain and erased the memory at once to save me from going mentally ill. But then this happened:

# Various shortcut definitions

proc `-`(s: string) =
    echo s[1..s.high]

proc a(s: string) =
    echo s

proc b(s: string, x: int) =
    echo($ x & ". " & s)

proc c(s: string, a, b: int) =
    echo($ a & "/" & $ b & ". " & s)

-("Test")
-"Test"
- "Test"

a"Test"
a "Test"
a("Test")
"Test".a

"Test".b(1)
"Test".b 2
"Test".b 3 + 1

c "Test", 1, 2
c("Test",1,2)
"Test".c(1,2)

I barely survived, but it made me stronger!

LeuGim [2015-02-11T17:59:41+01:00]

OderWat: It's one of Nim features that attracted me the most :)

jboy [2015-02-11T21:10:43+01:00]

I've been thinking some more about this type class syntax issue. Please allow me to try one more time with another suggestion for type class syntax --

I've been judging my ideas according to the following criteria:

It should be immediately obvious to someone who knows Nim, from the very first token of the type class, that they are reading a type class. Relatedly, it should be clear to a beginner that what they are looking at is not just a generic with the parameters elided.

A single-class type class should use the same general syntax as a multiple-class type class.

Ideally, use familiar Nim syntax in an intuitive fashion.

Ideally, stick with the type class identifiers listed in the manual (identifiers that correspond to type/variable/proc-definition keywords) rather than defining new identifiers like ObjectType.

Since there might be several type classes combined in a multiple-class type class (eg, object or tuple or proc or ref or var), each additional class in a multiple-class type class should incur minimal extra typing beyond the name of the additional class itself.

Avoid single quotes if possible (for @OderWat ;) )

I've come up with an approach that fulfils all the above criteria -- and I believe this approach would improve Nim (slightly) while also making type classes easier to learn:

A type class is defined as a set of typedesc.

(Or, more precisely, a type class is defined as a set of typedesc-matching boolean predicates, with set membership defined as returning true if any of the boolean predicates returns true... but you get the idea.)

Examples:

So, for example, instead of this example from the type classes section:


  type RecordType = tuple or object

you would write:


  type RecordType = {tuple,object}

If you just want to match any object type, the type class would be:


  type AnyObjectType = {object}

Observe that the syntax is clearly different from the start of an object type definition.

Further down in that same section:


  proc onlyIntOrString[T: int|string](x, y: T) = discard

would become:


  proc onlyIntOrString[T: {int,string}](x, y: T) = discard

Aside #1: If a type class is a set, we should be able to use the usual set operations such as +, *, -, in, notin, etc.

In particular, the is operator (T is string) can be defined to be a shortcut for a set operation (T in {string}). This enables a convenient way to express the more-general case of comparing the type against multiple types in one expression: T in {string,int,object}.

Aside #2: The syntax proc onlyIntOrString[T: int|string] (or even now proc onlyIntOrString[T: {int,string}]) could be expressed in terms of Nim set operators as proc onlyIntOrString[T in {int,string}]. I think this syntax is cleaner and more consistent with other Nim syntax, and would be more intuitive for beginners and casual readers.
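For comparison, the multi-type membership test can already be written with the existing is operator and or. This is a minimal sketch assuming a current Nim compiler; Point and classify are hypothetical names:

```nim
type Point = object    # hypothetical example type
  x, y: int

# Compile-time dispatch on which type class T belongs to.
proc classify[T](v: T): string =
  when T is (int or string): "int or string"
  elif T is (object or tuple): "record"
  else: "other"

doAssert classify(1) == "int or string"
doAssert classify("a") == "int or string"
doAssert classify(Point(x: 1, y: 2)) == "record"
doAssert classify((1, 2)) == "record"
doAssert classify(1.5) == "other"
```

So T in {string,int,object} would mostly be a respelling of T is (string or int or object) rather than a new capability.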

Back to the examples...

The next example in that section:


  proc `==`*(x, y: tuple): bool =

would become:


  proc `==`*(x, y: {tuple}): bool =

The generic Matrix example would become:


  proc `[]`(m: {Matrix}, row, col: int): Matrix.T =

And finally, the "When a generic type is instantiated with a type class instead of a concrete type, this results in another more specific type class" example:


  seq[ref object]  # Any sequence storing references to any object type

would become:


  seq[ref {object}]  # Any sequence storing references to any object type

Details:

An int can never be a float or an object or a tuple, so there won't be any overlap amongst the built-in type classes that represent the built-in type -- only one of them can be true at any time. Hence, a set of typedesc-matching boolean predicates, that returns true for membership if any of the predicates returns true, seems like an appropriate model.

However, when you bring user defined type classes into consideration, I can see how you might want multiple type classes to be true at the same time: iter: {object} and {Indexable} and {Incrementable} or iter: {object} & {Indexable} & {Incrementable}. The and or & operator (whichever is preferred) would perform a boolean AND of the individual type class membership tests.

Does this make sense? Is there anything obvious I've missed? Does this proposed syntax for type classes appeal to anyone else?

OderWat [2015-02-11T22:26:18+01:00]

But there are suddenly {} which are not needed currently. Why again should this be changed? And if that must change can I have the ' instead of {} please?

For real: I think that adding glyphs for anything which can be inferred and/or parsed without them does not look like something that has been a goal for Nim so far. I have the impression the language is about "avoiding" glyphs, or at least making them optional.

I am very new to the language but I really like that I hardly need to type glyphs in most cases. That's the beauty, and I agree with LeuGim about this, even if I made some fun of it.

Changing trivial cases to use glyphs, to me, is worse than extending some names like AnythingTypeClass which the IDE or my fingers can create in far less time than putting something in {those} while it's already in [these]!

jboy [2015-02-12T05:42:13+01:00]

Hi @OderWat, what I'm proposing is:

a change to the syntax of a language feature (built-in type classes corresponding to built-in keywords, and automatically-defined type classes corresponding to generic types)

to make the syntax less confusing (for example, the ambiguity of ptr or seq appearing without a type; or the overloaded use of type T = object or type T = tuple when you're not actually defining a new type)

to a new syntax ({ptr}, {seq}, {object}, {tuple})

that is clearly a different token to the original keyword or generic type (and thus, much less ambiguous than the current syntax)

that still retains the only positive of the current syntax (which is that there is a clear connection between X the keyword or generic type, and X the type class -- which I propose would now become {X})

and that is also familiar Nim syntax (the set literal {X,Y})

with an appropriate analogy in this situation (ie, you are testing whether a type is within a set of types).

It occurs to me to make this syntax suggestion because:

This thread began due to confusion between ptr the generic type (which takes a type parameter to be instantiated) and ptr the type class.

@Araq replied: "ptr without anything is a typeclass, perhaps it would have been better to use SomePtr for that instead."

I also found this language feature more confusing than it needed to be when I first encountered it, due to the overloaded syntax -- and I've spent hours digesting the relevant sections of the manual and the system module (generics, type classes, user-defined type classes, the is operator, the type operator, static[T], typedesc and parameter constraints; I still have all the browser tabs open at each section) -- and this is after coding in Nim every day for the last month, and 16 years of Python, C++ & C (and some others) before that.

I tutored C++ & C to university students for a few years, so I have some impressions of the sorts of things in a language that cause confusion to beginners. ("The same syntax doing subtly-different things in very-similar contexts" would be one of them.)

I think the {ptr} syntax is superior to the 'ptr syntax that I initially suggested, because:

It's a re-use of existing Nim set-literal syntax, rather than needing to introduce a new language-standard sigil '.

The meaning is consistent with the meaning of a set (eg, "Is this type within the set of ptr types?").

It enables combination of type classes very easily and naturally, as multiple elements in a set literal, separated by commas.

And finally, I think the {ptr} syntax is superior to the SomePtr or PtrType alternative, because:

It doesn't require new types to be defined in the language spec for all the built-in types -- instead, it simply re-applies an existing syntax in a new (but conceptually-similar) context.

It doesn't require the user to remember whether the new identifier is SomePtr or PtrType or SomePtrType or PtrTypeClass, etc.

It extends naturally & automatically to user-defined generic types (eg, Matrix) without requiring a different syntax pattern for the corresponding type classes (ie, also Matrix) to the built-in type classes (PtrType or whatever).

Anyway, this is all just a suggestion. I'm proposing what I think would be an improvement to the language syntax, to make it easier to learn, without becoming dumbed-down or constraining for experienced users. I appreciate the feedback. :)

LeuGim [2015-02-12T10:42:20+01:00]

Allowing set operations on types is very good, but and is already allowed now (though not written as &), and so is not:

type
  A = int or float
  B = int or bool or char
  C = A and B and not char
proc p(v: C) = echo ord(v)
p 5
# these do not compile
#p true
#p '5'
#p 5.0

Comparing {ptr} and SomePtr, SomePtr may be faster to type (though it needs two Shifts too), but {ptr} is easier to read and does not have to be remembered.

Araq [2015-02-12T11:44:30+01:00]

@jboy I like your idea, but

Typeclasses are not really sets (A and B cannot be modelled as a set)

proc p(m: {Matrix}) is not as sexy as proc p(m: Matrix).

{T}{lit} is even more confusing than T{lit}, so you better also change the syntax for syntactic constraints.

And finally "It should be immediately obvious to someone who knows Nim, from the very first token of the type class, that they are reading a type class" reads for me like "oh, this feature is unfamiliar and potentially DANGEROUS / confusing " (which feature isn't btw?) " and so I like some training wheels here". Training wheels usually backfire after a week ("this is just stupid. cannot the compiler figure it out on its own that I'm using typeclasses here?!")

Rust's ! suffix for macros comes to mind. It's only annoying for those of us who can deal with slightly different substitution rules. ("Yes, it's a macro, not a fn, I get it. How often do I need to remind my readers that try! is a macro?! Don't beginners become experts at one point in their life?")

jboy [2015-02-13T10:48:33+01:00]

@LeuGim & @Araq, thank you for your feedback!

As you both point out, a set is unable to model more than one of the and, or and not boolean operators simultaneously. I can't dispute this point. This may be a deal-breaker for my set-syntax proposal.

[One possible way forward could be to extend the proposed syntax to allow parenthesised boolean combinations of sets (to an arbitrary depth of parenthesis recursion) -- but I think this would spoil the elegance of the set metaphor and syntax -- and of course, it would also be reasonable to wonder why bother introducing the set syntax at all, if boolean operators are still being used.]

Also @Araq, my comment that "It should be immediately obvious to someone who knows Nim, from the very first token of the type class, that they are reading a type class" is more about avoiding ambiguity in syntax as much as possible. As much as possible, I would prefer to be able to read code in a single pass, with minimal backtracking (even if it's just a token or two). I think this preference might be similar to how the Nim compiler would prefer forward declarations for compilation efficiency. :)

Mirror of forum.nim-lang.org
