I'm a new user of Nim. I'm enjoying programming in it. It's an exceptionally elegant language.
But I'm currently stuck on a problem: I haven't been able to find a way to read numbers (floats, ints, etc.) from an ASCII (text) file.
As a concrete example, suppose we have a text file like
3.14 2 41.00
where numbers are written in a blank-separated ASCII format. So I'm looking for a function that reads the text file, parses it, and converts the text representations of the numbers into the corresponding numbers, like C's scanf, C++'s ">>", Fortran's formatted read, etc. Or something like Ruby's String::to_f method or Haskell's "read" function.
Thank you for your help.
Ryo
3a. do ctrl+F 'scanf', so you will see: "strscans This module contains a scanf macro for convenient parsing of mini languages."
3b. use the search box on the left, and it will give you the direct link to https://nim-lang.org/docs/strscans.html
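For the one-line sample from the first post, a minimal sketch with strscans could look like the following. It assumes the number and types of the fields are known in advance (float, int, float); the variable names are only for illustration.

```nim
import strscans

# One variable per field; the types must match the pattern below.
var a, c: float
var b: int

# $f parses a float, $i an integer, and $s skips optional whitespace.
let ok = scanf("3.14 2 41.00", "$f$s$i$s$f", a, b, c)
if ok:
  echo a, " ", b, " ", c
else:
  echo "input did not match the pattern"
```

Because the pattern is fixed at compile time, this fits fixed-layout lines better than an arbitrarily long run of numbers.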
I use Ragel for parsers that run on microcontrollers, as it gives very clean, debuggable switch code with the -G2 option, and I don't use anything besides two pointers for reading the parse buffer.
Thank you all for the help.
If I understand correctly, you have presented various parsers to convert a character string into numbers.
Then, my next task is to read the text file word by word.
My example above is a short string of blank-separated numbers. In reality, however, my file contains a huge number of numbers separated by blanks, tabs, or newlines, or any mixture of these.
Because my files can be very large, I can't simply separate parsing from reading.
So, I imagine that this kind of code would be what I would write:
var nums = newSeq[float]()
# open file
while f.readWord(buf):
  nums.add(buf.atof) # convert string to float
The system module has readLine but not readWord . . .
I remember writing an equivalent of readWord in pure C decades ago by reading one character at a time. Is this what you have to do in Nim? I hope there is a convenient function like readLine but for words.
The system module has readLine but not readWord
There are dozens of ways to do it and dozens of examples -- some are optimized for performance, some more for ease of use.
For example, for lines of text there is the split iterator, which can split a line into words.
Miram already recommended the strscans module.
Maybe this can be helpful for performance:
https://nim-lang.org/blog/2017/05/25/faster-command-line-tools-in-nim.html
Rosetta-Code should also have examples, and Dom's book too.
If your string and the extracted results fit in memory, you can use the splitWhitespace iterator: https://nim-lang.org/docs/strutils.html#splitWhitespace.i,string,int
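A minimal sketch of the in-memory approach, assuming the whole input is already in a string (here a literal; for a real file it could come from readFile):

```nim
import strutils

# Sample input mixing blanks, a tab, and a newline as separators.
let text = "3.14 2\t41.00\n"

var nums: seq[float]
# splitWhitespace yields each run of non-whitespace characters.
for word in text.splitWhitespace():
  nums.add parseFloat(word)  # raises ValueError on malformed input

echo nums
```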
Otherwise write a simple parser using streams and readFloat32/readFloat64 (https://nim-lang.org/docs/streams.html#readFloat32%2CStream) and discarding the space/tabs/newline in-between.
In pseudocode (I may have missed something).
import streams
proc myReader*(path: string): seq[float32] =
  let stream = newFileStream(path, mode = fmRead)
  defer: stream.close()
  while not stream.atEnd():
    result.add stream.readFloat32()
    discard stream.readChar()
Thank you all for your responses.
If I'm not mistaken, what you are all saying is: read the string and then parse it. But my file may not have newlines at all, and you don't know how large it may be (it may be stdin).
So, I need to read one "word" at a time and convert it to a float.
@mratsim showed a solution using streams, but as far as I can tell, readFloat32 and company read binary data in the file. (I tested the sample code, and got strange floating point numbers from a sample ascii file.)
What I'm looking for, therefore, is the ASCII version of readFloat32 and friends, which automatically skips blanks/newlines/tabs, just like cin >> i in C++ or the formatted read of Fortran.
If there are no ready-made functions like that, I would need to write them, but to do so, would I have to read a character at a time? Or is there a read function that stops at a specified delimiter?
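One way to get this behavior is a small iterator over a Stream that reads one character at a time and yields whitespace-delimited words, so only the current word is held in memory. The sketch below is only an illustration: words and readFloats are hypothetical helper names, not standard library functions, and a StringStream stands in for the real file.

```nim
import streams, strutils

iterator words(s: Stream): string =
  ## Yields whitespace-delimited tokens, reading one character at a time.
  ## (Hypothetical helper, not part of the standard library.)
  var buf = ""
  while not s.atEnd():
    let c = s.readChar()
    if c in Whitespace:
      # A separator ends the current word, if there is one.
      if buf.len > 0:
        yield buf
        buf.setLen(0)
    else:
      buf.add c
  # Emit the last word when the input does not end with whitespace.
  if buf.len > 0:
    yield buf

proc readFloats(s: Stream): seq[float] =
  ## Parses every word in the stream as a float.
  for w in words(s):
    result.add parseFloat(w)

let nums = readFloats(newStringStream("3.14 2\t41.00\n"))
echo nums
```

For a real file, newFileStream(path, fmRead) should work in place of the StringStream, and newFileStream(stdin) for standard input.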
Another related question: Is there an official mechanism to convert between objects and their string representations?
For example, in Ruby, there is the pair to_f and to_s:
"3.14".to_f # gives 3.14 (float)
3.14.to_s # gives "3.14" (string)
"3.14".to_f.to_s # gives "3.14"
In Haskell, show and read form such a pair.
If you have such an official mechanism, you just define to_s or show for your own type (class), and voilà, the standard output functions (like Nim's echo) are able to print values of your own type.
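In Nim, the closest counterpart is the `$` operator for converting to a string, which echo applies to its arguments, with parseFloat and parseInt from strutils going the other way. A minimal sketch, assuming a made-up Point type purely for illustration:

```nim
import strutils

type Point = object  # hypothetical example type
  x, y: float

# Overloading `$` is what lets echo print a Point in a custom format.
proc `$`(p: Point): string =
  "(" & $p.x & ", " & $p.y & ")"

let f = parseFloat("3.14")  # string -> float
let s = $f                  # float -> string
echo s
echo Point(x: 1.0, y: 2.5)
```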