I have written a small string formatting/interpolation library. In case anyone is interested:
https://github.com/bluenote10/nim-stringinterpolation
I know that there already is another string interpolation library. However, I wasn't fully happy with it (not printf standard, and there are limitations for the possible expressions in a string). I'm using a slightly different approach (the same as Scala), which is easier to parse, and as far as I can see every possible expression should be allowed. Internally I have wrapped snprintf with lots of safety checks added. I also tried to come up with good compile-time error messages to make it easy to see why a certain format string is invalid. This was my first real use of macros so feedback is welcome!
Are there actually plans to have some kind of printf-like string formatting in the standard library? Unfortunately, the current strutils does not offer a good solution when you have to format a bunch of ints/floats with width/alignment control. As far as I can see, all formatting functions rely on converting the arguments to strings at the call site. As a result, a simple "x = $x%6.3f" requires several function calls; I think the shortest form is "x = $1" % [x.formatFloat(ffDecimal, 3).align(6)]. This is complex enough that I would probably never format my output in practice. And I haven't found other things like left-aligning or sign prefixing yet. In my opinion a language should encourage string formatting (i.e., make it as easy as possible) to make output more meaningful in general. I also remember when Scala introduced string interpolation in 2.10: soon after adopting it, all my log files became much cleaner, and I definitely had cases where I found an issue faster because the log output was "easy on the eyes".
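For reference, a minimal runnable sketch of the strutils-only approach described above. It uses only formatFloat, align and alignLeft from strutils; the withSign helper is hypothetical (it is not part of strutils) and only illustrates how sign prefixing could be bolted on by hand.

```nim
import strutils

proc withSign(f: float, precision: range[-1..32]): string =
  ## Hypothetical helper (not in strutils): always shows the sign,
  ## roughly like printf's "%+.<precision>f".
  result = formatFloat(f, ffDecimal, precision)
  if f >= 0: result = "+" & result

let x = 3.14159

# Rough equivalent of printf("x = %6.3f", x):
# 3 decimals, right-aligned in a field of width 6.
let rightAligned = "x = $1" % [x.formatFloat(ffDecimal, 3).align(6)]

# Rough equivalent of "%-8.3f": left-aligned in a field of width 8.
let leftAligned = "x = $1|" % [x.formatFloat(ffDecimal, 3).alignLeft(8)]

# Sign prefixing via the hypothetical helper.
let signed = "x = $1" % [withSign(x, 2)]

echo rightAligned
echo leftAligned
echo signed
```

Every argument has to be turned into a string and padded before `%` sees it, which is the verbosity the paragraph above complains about.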
To clarify: This is not meant as an advertisement for my library :). The other string formatting library would be perfectly fine as well (it has even more features than printf, like string centering). However, if the reason not to include it is that it is too complex or not printf-conformant, there now is a simpler alternative :).
We have strfmt in nimble for Python-like string interpolation/formatting: https://bitbucket.org/lyro/strfmt
Not the same printf syntax, but close enough maybe. It also tries to parse the formatting at compile time.
var x = 10
echo "x = $x" #>x = 10
@mapdog: I do not fully understand what you mean by "skip parameters at the end". Your example is pretty much the main purpose of this library:
let x = 10
echo ifmt"x = $x"
@renesac: Sorry for the confusion, I was referring to strfmt by "another interpolation library". It is indeed pretty good, but has a few drawbacks:
I mean printf style with multiple parameters after the string:
printf("x = %i", x)
I didn't notice interp before.
Maybe to clarify: My library provides two approaches. You can either use string interpolation, e.g.
let s = ifmt"Vec(x = $x, y = $y, z = $z)"
or the more traditional:
let s = format("Vec(x = %f, y = %f, z = %f)", x, y, z)
In ifmt, you can use both plain identifiers and expressions:
let s = ifmt"the value of someIdent is $someIdent"
let s = ifmt"""x is ${if x > y: "valid" else: "invalid because it exceeds y"}"""
The printf format suffixes can optionally be appended to identifiers/expressions.
What surprised me about formatting in strutils is that it doesn't do compile-time argument checking. Usually when I run across languages with fancy compile-time logic, they examine the format string at compile time to see how many arguments it refers to. Then if you have a $3 in your string but no third argument, you get a compile-time error. It's not a huge advantage or anything, but it's kind of nifty, so if you're looking to do something fancy with string formatting and interpolation I'd start there.
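A rough sketch of what such a compile-time check could look like as a thin macro around strutils' `%`. The names checkedFormat and maxPlaceholder are made up for this example, and only the `$1`, `$2`, ... placeholder form is recognised (`$#`, `${1}` and named placeholders are ignored), so treat it as an illustration rather than a finished design.

```nim
import macros, strutils

proc maxPlaceholder(fmt: string): int =
  ## Returns the highest N among `$N` placeholders in `fmt` (0 if none).
  ## `$$` is skipped because `%` treats it as an escaped dollar sign.
  var i = 0
  while i < fmt.len:
    if fmt[i] == '$' and i + 1 < fmt.len:
      if fmt[i + 1] == '$':
        i += 2
        continue
      var j = i + 1
      var n = 0
      while j < fmt.len and fmt[j] in Digits:
        n = n * 10 + (ord(fmt[j]) - ord('0'))
        inc j
      result = max(result, n)
      i = j
    else:
      inc i

macro checkedFormat(fmt: static[string], args: varargs[untyped]): untyped =
  ## Hypothetical wrapper: rejects the call at compile time if the format
  ## string refers to more arguments than were passed.
  let needed = maxPlaceholder(fmt)
  if needed > args.len:
    error("format string uses $" & $needed & " but only " &
          $args.len & " argument(s) were passed")
  var stringified = newNimNode(nnkBracket)
  for a in args:
    stringified.add newCall(ident("$"), a)  # stringify each argument with `$`
  # Expands to: fmt % [$arg1, $arg2, ...]
  result = infix(newLit(fmt), "%", stringified)

echo checkedFormat("$1 + $2 = $3", 1, 2, 3)
```

With this sketch, a call such as checkedFormat("$3", 1) is intended to fail during compilation instead of raising at runtime.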
Now what would be really intriguing is a compile time optimization that split apart the format string and turned the equivalent printf-ish statement and arguments into nothing but a series of stringification, then string concatenations. You could even give a compile time hint as to the size of the resulting string at runtime, like pre-allocate(len(formatstr)+len(whateverarg1isatruntime)+len(whateverarg2isatruntime)) or something.
Sort of like what C++ does with their iostream syntax, except not, uh, stupid.
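To make the idea concrete, this is roughly what the expanded code could look like if it were written by hand for one specific format string. The proc name formatPoint and the format string are invented for the example; a real implementation would generate this from a macro.

```nim
proc formatPoint(x, y: int): string =
  ## Hand-written expansion of something like format("x = %d, y = %d", x, y).
  const
    part0 = "x = "     # literal pieces of the format string,
    part1 = ", y = "   # split apart ahead of time
  # Stringify the arguments first so their lengths are known.
  let sx = $x
  let sy = $y
  # Pre-allocate once: the literal lengths are compile-time constants,
  # the argument lengths are only known at runtime.
  result = newStringOfCap(part0.len + sx.len + part1.len + sy.len)
  result.add part0
  result.add sx
  result.add part1
  result.add sy

echo formatPoint(10, -3)
```

No format string is parsed at runtime here; all that remains is stringification and appends into a buffer of the right capacity.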
Great job buddy! I like it so much!
I totally do not understand why there is no string formatting function in Nim's core and stdlib; it's a first-class thing from my point of view. The strutils.format is not really a string formatter 'cause it does no real formatting but just string concatenation.
And I love C's printf style, which makes me comfortable. Type safety is good, but not more important for me than convenience. I do not expect my program to run correctly just because it compiles, right?
@Araq Well formatFloat depends on c_sprintf and that uses the application locale "naturally". This is also true on Windows afaik. This is not the same as the shell or system locale by default. Why would you change this to something else (and then need a way to create a locale dependent routine anyway)?
Or let me ask: How to format data according to the locale in Nim?
Probably "you don't care" but even php -r "setlocale('LC_NUMERIC','de_DE'); echo 1.12;" outputs "1,12". I expect that from my language don't you?
On the contrary, it's not that "I don't care", it's that I hacked around this issue in other languages often enough to know it's a bad idea (x.replace(",", ".")). Floating point values can also come from config files etc. which are all in English even if my locale is German. A locale is a global variable that introduces non-deterministic behaviour in a misguided attempt to guess for me what my users might want.
I expect that from my language don't you?
I expect my language to not repeat old mistakes.
@OderWat: In addition to what Araq said, keep in mind that using environment variables to control this behavior will also leak through to subprocesses, with often interesting [1] results. Environment variables are really the worst kind of mutable global state to have core functionality depend on. The proper solution is to have a separate module that provides locale-specific functionality.
Keep also in mind that locales aren't a silver bullet. If you want language-specific code to work, you generally need to write specific code for each language. Fun example: In German, the 'ß' character is to be capitalized as 'SS', except in legal documents, where the capital 'ß' should be used, but only for names. In most locales, there are multiple different ways to write dates or to order strings lexicographically, and which one is to be used depends on the context.
[1] "Interesting" in the Chinese curse sense of the word.
Well... it is Nim's use of sprintf() that causes this behavior, and I know about the implications of using another locale. I also tripped over that often enough. But then the "global effect" is what the locale is for. Declaring this a bug is your right of course, but then fix it and hopefully offer another way to set a locale for Nim's standard library.
I know that setlocale() is supposed to work for the whole process. But not every piece of software has a problem with that. If you use it "globally", that's probably just the feature you want. AFAIK there are other ways to use your own temporary locale settings (newlocale()/uselocale()) with modern POSIX implementations.
About config files: Writing software for German business also means writing config files with German formatting. This is also true for CSV for spreadsheets and so on. Of course you can x.replace(",", ".") everywhere. But you can also just set the locale to what is needed (globally) :)
@OderWat
Nice job! You must be a big ASM fan to feel comfortable writing such complicated code, which finally turns Nim into a far more cumbersome language than C. But what about code reuse? How can I implement a formatting-capable log module with your idea?
@Jehan
The printf style does not conflict with your idea: just write some to_s procs for types that are not directly supported by printf, which is exactly the same thing you did by writing Formatters and helper procs. There is no extension problem at all. And I do not get your point 2, since you still have to write so many tedious Formatters and helper procs.
@OderWat
But you can also just set the locale to what is needed (globally)
No, that is exactly what I cannot do in systems not written solely by me which depend on this feature in unknown ways.
Also, the replace hack is not about parsing floating point numbers but about stringifying them. For example, db.query("insert foo(a) values (?)", $2.0) needs to always produce 2.0 and never 2,0, since the syntax of SQL does not depend on the locale...
@hibernating
It depends on what is needed. I don't think that it is very complicated code, and I don't see a problem with code reuse (to an extent). But if you want sprintf() from the standard lib you could just use that. I like to use pre-made libraries where they exist and fit my needs. Rewriting all of sprintf() or even strftime() in Nim is "a task to be done", and that's probably the reason why strutils uses c_sprintf() internally :)
To implement a formatting capable log I would probably just do that. Somebody has to do the ASM level stuff anyway. It comes down to the fact that not everything one needs is always available or optimal. Reinventing the wheel made it better over thousands of years and now we have bicycles and racing cars :).
@Araq
I see. But what is your proposed solution for it? Having both, "locale" specific and "we agreed on English long ago" functions would be optimal. So formatFloat() needs to be fixed (I just saw nimFloatToStr() and giggled) but how to implement "a locale" for Nim? Doing everything by hand?
About "parsing" or "stringifying": your example replaced a comma with a point, which in my eyes "parses" German and changes it to English. I frequently replace "." with "" and "," with "." when reading data. Doing the same after sprintf()'ing a float to nullify the locale (that's how nimFloatToStr() looks to me) is not something I would call toString(); it's just a (clever?) hack :)
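The "." to "" and "," to "." replacement for reading data can be sketched like this, assuming the input really is German-formatted. The helper name parseGermanFloat is made up for the example.

```nim
import strutils

proc parseGermanFloat(s: string): float =
  ## Hypothetical helper: parses German-formatted numbers such as "1.234,56".
  ## Drops '.' (thousands separator), then turns ',' (decimal separator)
  ## into '.' so that parseFloat accepts the result.
  parseFloat(s.replace(".", "").replace(",", "."))

echo parseGermanFloat("1.234,56")
```

Fed an English-formatted "1.5", the same helper would silently return 15.0, which shows how fragile this kind of string surgery is.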
The SQL example of course shows how fragile it becomes when every stringification depends on the locale. That is to be avoided, no doubt about it!
I think the concept that Nim's echo uses toString() is the "problem", as it defers the conversion to a place where even I don't want a locale setting to influence the result.
Having both, "locale" specific and "we agreed on English long ago" functions would be optimal. So formatFloat() needs to be fixed (I just saw nimFloatToStr() and giggled) but how to implement "a locale" for Nim?
Yeah, we need both. But for the time being, I will add an optional decimalChar = '.' parameter to formatFloat.
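A small sketch of how such a parameter is used. Note the assumption: in the strutils versions I am aware of, the parameter is spelled decimalSep rather than decimalChar, so that is the name used here.

```nim
import strutils

let v = 1.12

# Default separator '.'; no locale is consulted for this choice.
let english = formatFloat(v, ffDecimal, 2)

# Explicit German-style separator, requested per call, not via global state.
let german = formatFloat(v, ffDecimal, 2, decimalSep = ',')

echo english
echo german
```

The separator becomes an explicit argument at each call, so code that writes SQL or config files keeps '.' while code that writes German CSV can ask for ','.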
@Araq That sounds good :) I was missing it, looked into the source, saw that it uses sprintf(), and therefore came up with the setlocale() idea :-).
Too bad that it is not trivial to write a dtoa() and use that instead. But I have seen some decent C code for this floating around in the past.
EDIT: http://git.musl-libc.org/cgit/musl/blob/src/stdio/vfprintf.c?h=v1.1.6 (MIT License, was posted in SO)