Hello,
I'm new to Nimrod, so before I ask my question I'd like to congratulate the community on a powerful, approachable, forward-thinking language!
My question: I know about the "--cs:partial" command-line option, but I'm wondering if there are plans to extend it to address any of the following use-cases:
Math: I often write formulas in which λ_n and λ_N are intended to appear similar, yet be semantically distinct.
Python Interoperability: It would be nice to be able to overload . to call things like __new__ or __len__.
JSON Interoperability: It would be nice to be able to overload .= to create JSON strings that conform to style-sensitive protocols (e.g. {"firstName": "Fred", "lastName": "Rick", "lastName_encoded": "Ichray"}).
Python Interoperability: It would be nice to be able to overload . to call things like __new__ or __len__.
You can always use something like py["__new__"] = myNew for these (rare) cases. That said, identifiers are case-preserving. If the name happens to be a valid Nimrod identifier, you are in luck, so your JSON examples should all work.
That said, I'm not that happy with --cs:partial. I'd like to further distinguish FULLCAPS from fullcaps for better C interop. (And I really want --cs:none when hacking debugging code into the compiler or working in a REPL. ;-) )
OK. Is this likely to remain a compiler option, or will there eventually be a per-module case-sensitivity pragma, so I can write a library that uses case-sensitivity internally [1] that you can use from your case-insensitive REPL?
[1] e.g. some algorithm written with "λ_n ∈ {λ_0..λ_N}" notation.
I want to add something to this. I've always enjoyed Nimrod's style-agnostic "code your way" approach to case sensitivity and underscores. However, I had to move my project to --cs:partial a while ago (and I'm very glad that's now an option) due to name conflicts between types and getters. For example:
type
  Shader* = ref object
    ...
  Material* = ref object
    shader: Shader # private member
    ...

proc shader*(m: Material): Shader = m.shader # public getter
Without --cs:partial the getter and Shader type collide. However, while partial case-sensitivity works well to avoid these conflicts, I would actually prefer a better system of avoiding this and not need to rely on it (though I'll always keep my code very case consistent for potential use with it).
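To make the collision concrete, here is a minimal sketch in Python (not Nimrod) of how the two modes might compare identifiers. It assumes that --cs:none ignores case and underscores everywhere, and that --cs:partial does the same except that the first character is compared exactly. The function names key_cs_none and key_cs_partial are hypothetical and are not part of the compiler.

```python
def key_cs_none(ident: str) -> str:
    # Assumed --cs:none rule: case and underscores are ignored everywhere.
    return ident.replace("_", "").lower()


def key_cs_partial(ident: str) -> str:
    # Assumed --cs:partial rule: the first character is compared exactly,
    # the remainder ignores case and underscores.
    return ident[:1] + ident[1:].replace("_", "").lower()


# The getter `shader` and the type `Shader` share a key under --cs:none ...
print(key_cs_none("shader") == key_cs_none("Shader"))
# ... but not under --cs:partial, because the first letters differ.
print(key_cs_partial("shader") == key_cs_partial("Shader"))
```

Under these assumptions the two names only become distinct once the first letter is case-sensitive, which is why the type/getter pair compiles with --cs:partial.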
A simple solution to basic getters is something Araq has mentioned before as a potential future feature: + for "readonly" exposure, e.g.:
type
  Material* = ref object
    id*: GLint # public member
    shader+: Shader # readonly member
    ...
However, that doesn't work well if you want anything more complex behind a getter (although I think the feature would be very convenient, and should be eventually added for that sake alone). What seems like the best solution here is to allow types and procs to share the same name, and choose "logical defaults" for what is chosen when, with more explicit distinction for the potential conflict areas such as object constructors. eg:
type Foo = ...
proc foo: Foo = ...
var a: Foo # uses type
var b: foo() # calls proc (if it returns a typedesc)
var c = foo() # calls proc
var d = Foo(:) # calls type constructor
Although I don't know if that syntax is good, and it might be a bit confusing if there were a lot of symbols with the same names floating around. IDK exactly, these are just thoughts I'm throwing out there. Perhaps instead of allowing types/procs to share names, there could just be a better "getter" syntax which avoids these conflicts?
[EDIT] Err... well, I feel kinda silly since it looks like using the backticks syntax proc `.shader`*(m:Material)... seems to avoid the name conflicts. However, whenever I try to actually use the getter, I get Error: conversion from Material to Shader is invalid, so there's still a conflict somewhere, but perhaps this is a symbol resolution bug?
That's not really the problem. It's not a name conflict between a member and a proc, but between a proc and a type within the same module. I could use Hungarian notation on either the proc or type (getShader or TShader) but at that point I would much rather just use --cs:partial.
Also the error from my [EDIT] was a false report. It doesn't avoid the type/proc name conflict even with backticks.
Opinions?
Well it allows for lambdaN vs lambdan for math people and allows FOOBAR for fewer name conflicts in C wrappers (though this happens rarely). But yeah, the current rule is much simpler.
In fact I think it's pretty much perfect as-is and would even argue for it eventually becoming default with --cs:none being the opt-in.
--cs:partial will be the default soon. Somebody needs to nimrod pretty all the Babel packages... ;-)
it allows for lambdaN vs lambdan for math people
Well on second thought these rules aren't too bad. Underscores would just be considered an alternative to capitalization, which is easy to explain..
In ALL_CAPS identifiers the underscores are ignored ... For multiple upper-cased letters only the first stays...
This is the confusing part I think. Since ALL_CAPS are really only useful for C-wrappers (and, as you say, rare) perhaps if this wasn't part of the change I wouldn't personally mind the change (considering it allows for fooBar/foobar distinction).
--cs:partial will be the default soon
Awesome :)
For multiple upper-cased letters only the first stays
Er... I guess without this change as well it would be almost identical to --cs:full, huh. So maybe leave this in but not the ALLCAPS rule (as it's a bit confusing when caps are handled differently)?
foobar -> foobar
fooBar -> fooBar
foo_bar -> fooBar
fooBAR -> fooBar
FOO_BAR -> FooBar
Though that doesn't seem much easier to learn than your original proposal. IDK if I'm just being over-sensitive to rule complexity here. Maybe others will like your original idea just fine. But I would get a lot of feedback on this idea before changing anything, especially if this is designed to become default. Last thing you want is new users getting hung up on complex symbol rules whose only benefit (at a glance) might appear to be allowing for optional underscores.
Araq: Thinking about this more, I'm wondering, given your plan for --cs:partial as default, how that's going to even work. If a module uses both Foo and foo, that completely breaks --cs:none, and any "style guideline" which prevents this would negate the benefits of --cs:partial as default and just cause you to have to constantly explain to people why their (compiling) pull requests don't match the guidelines.
In all honesty, I'm not sure having two options is really a realistic solution, and I think --cs:none is showing its flaws (seems lots of people, myself included, want to distinguish symbols by capitalization for various reasons). That said, I really like Nim's ability to interchange camelCase and snake_case, and I think there might be a solution which addresses both these concerns with one rule: make underscores an alternative to capitals, e.g.:
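For anyone weighing the rule complexity, the five mappings listed above (foobar through FOO_BAR) can be reproduced by a short function. This is only a sketch in Python, not compiler code. It assumes the rule is "an underscore capitalizes the next letter, and in a run of upper-case letters only the first stays upper-case"; the name normalize_first_cap is hypothetical.

```python
def normalize_first_cap(ident: str) -> str:
    out = []
    prev_upper = False  # was the previous source letter upper-case?
    cap_next = False    # did we just pass an underscore?
    for ch in ident:
        if ch == "_":
            # An underscore is dropped and capitalizes the next letter.
            cap_next = True
            prev_upper = False
            continue
        if cap_next:
            out.append(ch.upper())
            prev_upper = True
            cap_next = False
        elif ch.isupper():
            # Inside a run of capitals, only the first one stays upper-case.
            out.append(ch.lower() if prev_upper else ch)
            prev_upper = True
        else:
            out.append(ch)
            prev_upper = False
    return "".join(out)


for name in ["foobar", "fooBar", "foo_bar", "fooBAR", "FOO_BAR"]:
    print(name, "->", normalize_first_cap(name))
```

Two identifiers would then be considered equal exactly when their normalized forms match, so fooBar, foo_bar and fooBAR are one symbol while foobar is a different one.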
fooBar -> fooBar
foo_bar -> fooBar
FooBar -> FooBar
_foo_bar -> FooBar
fooBAR -> fooBAR
foo_b_a_r -> fooBAR
# the following could work, but might be better to just make 2 or more _'s invalid
foo__bar -> foo_bar
foo__Bar -> foo_Bar
foo___bar -> foo_Bar
Just a thought. I'm a little concerned about this, as I really need --cs:partial now, and I'm afraid of you removing that feature :| so if everyone wants --cs:none then I would vote to leave that as default and keep --cs:partial how it is now (or at least keep it simple to understand, like my suggestion with the underscores).
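The "underscores as an alternative to capitals" table above can also be written as one small function. This is a sketch in Python rather than Nimrod, under the assumption that a single underscore capitalizes the next letter, existing capitals are left untouched, and (for the optional doubled-underscore case) every pair of underscores yields one literal underscore. The name underscore_as_capital is hypothetical, and trailing underscores are simply dropped here.

```python
def underscore_as_capital(ident: str) -> str:
    out = []
    run = 0  # length of the current run of underscores
    for ch in ident:
        if ch == "_":
            run += 1
            continue
        if run:
            # Each pair of underscores becomes one literal underscore.
            out.append("_" * (run // 2))
            # A leftover single underscore capitalizes the next letter.
            if run % 2:
                ch = ch.upper()
            run = 0
        out.append(ch)
    return "".join(out)


for name in ["fooBar", "foo_bar", "FooBar", "_foo_bar", "fooBAR",
             "foo_b_a_r", "foo__bar", "foo__Bar", "foo___bar"]:
    print(name, "->", underscore_as_capital(name))
```

If runs of two or more underscores are invalid anyway, the run counter collapses to a single flag and the rule is just "drop the underscore, upper-case the next letter".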
In all honesty, I'm not sure having two options is really a realistic solution
Well, we can always dream up more complex rules for disambiguation ("in this context it can only be a type anyway") but I'm not a fan of this either. However, I never thought having two options was a realistic solution; it has always been designed to enable a transition period. The default value of --cs defines the language!
which addresses both these concerns with one rule: make underscores an alternative to capitals
Nice idea. Your rule is certainly simpler than mine.
but might be better to just make 2 or more _'s invalid
They already are.