The coding guidelines recommend against using nil as a string value. Is there a recommended way of representing an optional string?
For example, I have a proc that tries to compress a string. If the compressed version is smaller than the original, it returns the compressed string; otherwise it needs to indicate that the string isn't compressible. I've been using nil for this, but it doesn't feel right.
I could implement something like ML's option type with a variant. That would take the space of the variant as well as the pointer to the string, and so be less efficient. I also suspect that the absence of this type from the standard library suggests there is a better way to do this in Nimrod.
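To make the variant idea concrete, here is a minimal sketch. The names `OptString`, `tryCompress` and the toy run-length encoder `rle` are all hypothetical stand-ins, not anything from the standard library:

```nim
type
  OptKind = enum
    okNone, okSome
  OptString = object
    case kind: OptKind
    of okSome:
      value: string
    of okNone:
      discard

proc rle(s: string): string =
  ## Toy run-length encoder standing in for a real compressor.
  var i = 0
  while i < s.len:
    var j = i
    while j < s.len and s[j] == s[i]:
      inc j
    result.add($(j - i))   # run length
    result.add(s[i])       # the repeated character
    i = j

proc tryCompress(s: string): OptString =
  ## Returns okSome only when the encoded form is strictly shorter.
  let cs = rle(s)
  if cs.len < s.len:
    OptString(kind: okSome, value: cs)
  else:
    OptString(kind: okNone)

let r = tryCompress("aaaaaaaa")
if r.kind == okSome:
  echo "compressed: ", r.value
else:
  echo "not compressible"
```

The caller has to inspect `kind` before touching `value`, which is the whole point of the type, at the cost of the extra discriminator field.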
Thanks, David
var value = "my string to compress"
if compress(value):
  # `value` is compressed
  discard
else:
  # `value` is not compressed
  discard
Compare the C idiom: if (fopen(somefile)) etc.
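A minimal sketch of how such a `compress` could be written, assuming a `var` parameter and a toy run-length encoder `rle` as a stand-in for a real compressor (both are illustrative assumptions):

```nim
proc rle(s: string): string =
  ## Toy run-length encoder, for illustration only.
  var i = 0
  while i < s.len:
    var j = i
    while j < s.len and s[j] == s[i]:
      inc j
    result.add($(j - i))
    result.add(s[i])
    i = j

proc compress(value: var string): bool =
  ## Replaces `value` with its compressed form and returns true,
  ## or leaves `value` untouched and returns false.
  let cs = rle(value)
  if cs.len < value.len:
    value = cs
    result = true

var value = "aaaaaaaa"
if compress(value):
  echo "compressed to ", value
else:
  echo "left as is: ", value
```

The trade-off is that the caller's string is overwritten in place, so the original has to be copied first if it is still needed.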
Besides the Option type, I would like to see an Either type in the stdlib.
These two types (Option and Either) make interfaces descriptive, and interfaces shouldn't change after release. So we need these two types at an early stage.
In practice, while implementing a Nim binding for FUSE, I need an Option type for an integer parameter that might be "invalid". We could use 0 or some other sane default to express emptiness, but that is more bewildering than using an Option type.
If there is a more elegant language-level solution to this problem, I would prefer it, but for now defining these types and merging them into the stdlib seems reasonable to me.
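As a sketch of the integer case, assuming a Nim release that ships the `std/options` module (which postdates this thread) and a hypothetical `parseMode` helper that treats negative input as "not provided":

```nim
import std/options

proc parseMode(raw: int): Option[int] =
  ## Hypothetical helper: a negative input means "not provided".
  if raw >= 0: some(raw) else: none(int)

let mode = parseMode(-1)
if mode.isSome:
  echo "mode = ", mode.get
else:
  echo "mode not set"
```

Here 0 stays a perfectly valid value, which is exactly what a sentinel default cannot express.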
Hi,
I know there are some third-party libraries that implement the type (including yours):
https://github.com/superfunc/monad https://github.com/flaviut/optional_t
The reason they all implement the same thing is that these types are actually in demand in Nim code, which is a strong reason for them to go into the stdlib. Again, the stdlib should support standard Option/Either types so that nobody has to reinvent the wheel.
There is another one that implements Either:
https://github.com/fowlmouth/nimlibs/blob/master/fowltek/either_t.nim
Another vote for Option here, but with a caveat: how does it impact performance? I would be worried, say, about introducing overhead due to boxing. There is no problem when one has the choice, but once this gets into the standard library, it might affect other modules and soon become something everything depends on.
The more general problem is this: on the one hand, Nim helps with writing in a slightly functional way (map, filter, options and so on). On the other hand, the very reason I am using Nim is that it allows me to write at a low level and have precise control over memory and execution time. For this reason, I very much prefer abstractions that do not have runtime costs. Compared to C, Nim has a lot of those, such as modules, static polymorphism, macros, compile-time computations, static[T], a good type system, and so on. Nim also has some other features that introduce some distance from the machine: the optional GC, dynamic polymorphism, array bounds checking and so on.
My impression is that there is a tension here. In some situations the second set of abstractions is good enough; in others it is not. If these abstractions start appearing in the standard library, less and less of the standard library will be usable in a C-like (embedded, driver, kernel, HPC) context. On the other hand, it is nice to benefit from these abstractions when possible.
I think a reasonable choice would be to make a subset of the standard library independent of costly features and mark it as safe for C-like use, and to have a larger standard library with more features. For instance, one may want both a set of mutable collections and another set of immutable ones, in the style of Okasaki.
If this were the case, I would be happy to have Option and Either in the standard library.
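One rough way to look at the boxing concern, assuming a Nim release that ships `std/options` (which postdates this thread): as far as I understand that implementation, `Option[int]` is a plain value type holding the payload plus a flag, so nothing is heap-allocated. This is a quick check under that assumption, not a benchmark:

```nim
import std/options

# Compare the inline size of a bare int with its Option wrapper.
echo "int:         ", sizeof(int)
echo "Option[int]: ", sizeof(Option[int])

# Options stored in a seq sit inline in the seq's buffer.
let xs = @[some(1), none(int), some(3)]
var total = 0
for x in xs:
  if x.isSome:
    total += x.get
echo "sum of present values: ", total
```

The remaining cost is the extra flag and the branch on `isSome`, which may or may not matter depending on the context.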
> Another vote for Option here, but with a caveat: how does it impact performance? I would be worried, say, about introducing overhead due to boxing. There is no problem when one has the choice, but once this gets into the standard library, it might affect other modules and soon become something everything depends on.
Yes, this is a real problem. IMHO Option[T] is only annoying: it generally performs worse than exceptions, it often enforces pointless checking (if your program distinguishes between the very existence of an environment variable and whether its contents are empty, your program sucks), it doesn't compose unless you use monads (which hide the inconveniences but not the performance overhead), and it introduces yet another way to do things (exceptions vs returning -1 vs returning Option[int]). And note that even Haskell - the king of monads! - has exceptions and doesn't return Option[T] for e.g. "out of memory".
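For reference, the three styles mentioned above look roughly like this side by side. `find` and `parseInt` are from `strutils`; `findOpt` is a hypothetical wrapper, and `std/options` is assumed to be available (it postdates this thread):

```nim
import std/[strutils, options]

# 1. Sentinel value: strutils.find returns -1 when nothing matches.
let idx = "abc".find('z')
echo "sentinel: ", idx

# 2. Exception: parseInt raises ValueError on bad input.
var parsed = 0
try:
  parsed = parseInt("not a number")
except ValueError:
  parsed = -1
echo "after exception: ", parsed

# 3. Option: hypothetical wrapper that turns the sentinel into an Option[int].
proc findOpt(s: string, c: char): Option[int] =
  let i = s.find(c)
  if i >= 0: some(i) else: none(int)

echo "option: ", findOpt("abc", 'b').isSome
```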
> I think a reasonable choice would be to make a subset of the standard library independent of costly features and mark it as safe for C-like use, and to have a larger standard library with more features. For instance, one may want both a set of mutable collections and another set of immutable ones, in the style of Okasaki.
But the stdlib embraces the GC. There are few modules you can use with --os:standalone or --gc:none. Many argue that makes the language unsuitable for low level programming, but I disagree:
Firstly, by that logic, C would have abysmal string handling performance because you "have to" use strlen everywhere.
And secondly, the default GC provides more control over memory than a C program that uses malloc and free. Take for instance the freeing of a big tree structure: by default in C it is uninterruptible, whereas in Nim, with its GC, it is interruptible!
I am not claiming that the GC is incompatible with low-level programming, but Nim certainly appeals to developers who have to work in environments where deterministic reclamation of memory is mandatory. Often the problem is not performance but predictability (real-time programming), security (cryptography) or memory limitations (embedded programming).
I am fine with using the GC in many cases, but it would be a real plus to have a known subset of the library that can be used with --gc:none. Maybe when Nim is better known, this could start as a third-party project, like the Jane Street library for OCaml.
If your string compression function is an internal one, tuples and multiple assignment should do what you need:
# Dummy implementations
proc compressRawString(s: string): string = s
proc uncompressRawString(s: string): string = s

proc compress(s: string): auto =
  let cs = compressRawString(s)
  if len(cs) < len(s):
    (true, cs)
  else:
    # The string part is ignored when the flag is false,
    # so an empty string avoids returning nil.
    (false, "")

let (compressible, str) = compress("foo")
if compressible:
  echo "Original: ", uncompressRawString(str)
Note that you can also spell out the tuple type above, but for a purely internal function, that may not be necessary.
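Spelled out with a named tuple type, and assuming the same dummy `compressRawString` as above, that might look like the following sketch (the field names `compressible` and `data` are my own choice):

```nim
# Dummy implementation, as in the example above.
proc compressRawString(s: string): string = s

proc compress(s: string): tuple[compressible: bool, data: string] =
  let cs = compressRawString(s)
  if cs.len < s.len:
    result = (compressible: true, data: cs)
  else:
    # `data` is meaningless here; the flag is what callers check.
    result = (compressible: false, data: "")

let r = compress("foo")
if r.compressible:
  echo "compressed: ", r.data
else:
  echo "not compressible"
```

Named fields make call sites a little more readable, and tuple unpacking with `let (a, b) = ...` still works.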
If this is an externally visible API, then I'd recommend against using option types. Option types have a number of purposes, but they do not provide by themselves a suitable level of abstraction or encapsulation for a public API. You cannot, for example, easily add additional functionality to an option[string] (such as checking for the compression ratio) without major rewiring. In this case, I'd recommend a type that properly represents a compressed string.
type CompressedString* =
  ref object
    compressed: bool
    ratio: float
    data: string

proc compressString(s: string): CompressedString =
  let cs = compressRawString(s)
  if len(cs) < len(s):
    CompressedString(compressed: true, ratio: len(cs)/len(s), data: cs)
  else:
    CompressedString(compressed: false, ratio: 1.0, data: s)

proc uncompressString(cs: CompressedString): string =
  if cs.compressed:
    uncompressRawString(cs.data)
  else:
    cs.data

proc compressionRatio(cs: CompressedString): float =
  cs.ratio
Null is the original option type, with its own assembly instruction; you should use it.
You could create a global variable:
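let uncompressable: string = ""

let str = compress("abc")
if str == uncompressable:
  discard  # the input was not compressible

Or just check the length:

let str = compress("abc")
if str.len == 0:
  discard  # the input was not compressible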