My Nim code:
import strutils
for ln in stdin.lines:
  echo ln.split('\t')[4]
And the Python code:
import sys
for ln in sys.stdin:
    print(ln.split('\t')[4])
Tested on a TSV file of 1M lines, the Nim version (compiled with -d:release) consistently takes more than twice as long as the Python version.
Is it because of split() or something else?
With split in each loop iteration, you are allocating 1 string for the line, then 4 more strings for your split and 1 sequence to hold those strings. Each memory allocation is accompanied by resetting the string or sequence to binary zero. This is a recipe for slowness. Python is faster here because its GC reuses already allocated memory.
Unfortunately, code written in this Python style works for quick scripting but is a performance pitfall. If you want a fast CSV/TSV parser, you can use the tips from this blog post: https://nim-lang.org/blog/2017/05/25/faster-command-line-tools-in-nim.html
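For illustration, here is a minimal sketch in that spirit (not the blog post's exact code; the proc name fifthField is made up for this example): it locates the tab positions with find and copies only the one field that is needed, so the per-line seq and the other field strings are never allocated.

import strutils

proc fifthField(line: string): string =
  # Skip over the first four tab-separated fields without copying them.
  var start = 0
  for _ in 0 ..< 4:
    let tab = line.find('\t', start)
    if tab < 0: return ""            # fewer than five fields on this line
    start = tab + 1
  # Copy only the fifth field.
  let stop = line.find('\t', start)
  if stop < 0:
    result = line[start .. ^1]       # the fifth field is the last one
  else:
    result = line[start ..< stop]

for ln in stdin.lines:
  stdout.writeLine(fifthField(ln))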
I should note that the issue with split holds for all languages with "vanilla" memory management (C and C++ require the same techniques as well).
This might change in the future if strutils is rewritten with more in-place procedures, and with the dup and collect macros offering the functional high-level API without its cost (see the v1.2.0 announcement: https://nim-lang.org/blog/2020/04/03/version-120-released.html)
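For reference, a rough sketch of what those two macros look like, assuming import sugar on Nim 1.2 or later (the values are just illustrative):

import sugar, algorithm

let a = @[3, 1, 2]
let b = a.dup(sort)             # dup copies a and sorts the copy, a stays untouched
doAssert b == @[1, 2, 3]

let squares = collect(newSeq):  # collect builds a seq directly from the loop
  for x in 1 .. 5: x * x        # squares == @[1, 4, 9, 16, 25]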
With -d:danger it improves a bit, but it is still much slower than the Python version:
Nim: 2.65s, Python: 1.26s
I just found out something people haven't mentioned yet (or did I miss it?). According to https://nim-lang.org/docs/system.html, echo is equivalent to a writeLine followed by a flushFile, so each call to echo flushes the output, which is not necessary here. After I replaced echo x in my code with writeLine(stdout, x), it helped quite a bit.
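For comparison, here is roughly what the loop from the question looks like after that one change (everything else unchanged):

import strutils

for ln in stdin.lines:
  # writeLine only appends to stdout's buffer; echo would also flush it
  # after every single line.
  stdout.writeLine(ln.split('\t')[4])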
Using the split iterator instead of assigning split()'s return value to a variable helps too.
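A small sketch of that variant, using the split iterator from strutils so no intermediate seq is built and the scan can stop after the fifth field (the index handling is just one way to do it):

import strutils

for ln in stdin.lines:
  var i = 0
  for field in ln.split('\t'):   # iterator form: yields one field at a time
    if i == 4:
      stdout.writeLine(field)    # fifth field found, stop scanning this line
      break
    inc i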
If you are not redirecting to a file but letting the output display in a terminal, then a big variable is which GUI terminal emulator you are using - XTerm, rxvt-unicode, st, etc. I think that could create a lot of variation in the numbers; it may even depend on whether you use TrueType fonts or fonts that are cheaper to render.
Not sure what the goals are here, but if you really want to figure out what's going on at the core Nim IO layer, then you want a clean separation of the sources of time spent (such as input/split IO versus output IO). E.g., forget about locking stdout buffers - just don't output at all. And on the output side, forget about reading input - just generate the data (like your generator does) and output it, and realize that if you are writing to a terminal then you are benchmarking how that terminal handles extremely high update rates.
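As a concrete illustration of that separation, a hedged sketch (the proc names and the 1M count are placeholders, not from this thread): one proc measures only reading plus split, the other measures only writing, so each side can be timed on its own.

import strutils

proc inputOnly() =
  # read + split only: no per-line output at all
  var total = 0
  for ln in stdin.lines:
    total += ln.split('\t')[4].len
  stdout.writeLine($total)         # a single write, so output cost stays negligible

proc outputOnly(n: int) =
  # output only: synthetic lines, no reading or splitting involved
  for i in 0 ..< n:
    stdout.writeLine("field" & $i)

when isMainModule:
  inputOnly()
  # outputOnly(1_000_000)          # run one of the two at a time when timing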