My Nim code:
import strutils
for ln in stdin.lines:
  echo ln.split('\t')[4]
And the Python code:
import sys
for ln in sys.stdin:
    print(ln.split('\t')[4])
Tested on a TSV file of 1M lines, the Nim version (compiled with -d:release) consistently takes more than twice as long as the Python version.
Is it because of split() or something else?
With split in each loop iteration, you are allocating 1 string for the line, then 4 more strings for your split and 1 sequence to hold those strings. Each memory allocation is accompanied by resetting the string or sequence to binary zero. This is a recipe for slowness. Python is faster here because its GC reuses already allocated memory.
Unfortunately, code written in this Python style works for quick scripting but is a performance pitfall. If you want a fast CSV/TSV parser, you can use the tips from this blog post: https://nim-lang.org/blog/2017/05/25/faster-command-line-tools-in-nim.html
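For illustration, here is a minimal sketch in that spirit (not the blog post's exact code; the proc name fifthField is made up for this example): it locates the tab positions with find and copies only the one field that is needed, so the per-line seq and the other field strings are never allocated.

import strutils

proc fifthField(line: string): string =
  # Skip over the first four tab-separated fields without copying them.
  var start = 0
  for _ in 0 ..< 4:
    let tab = line.find('\t', start)
    if tab < 0: return ""            # fewer than five fields on this line
    start = tab + 1
  # Copy only the fifth field.
  let stop = line.find('\t', start)
  if stop < 0:
    result = line[start .. ^1]       # the fifth field is the last one
  else:
    result = line[start ..< stop]

for ln in stdin.lines:
  stdout.writeLine(fifthField(ln))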
I should note that the issue with split holds for all languages with "vanilla" memory management (C and C++ require the same techniques as well).
This might change in the future if strutils is rewritten with more in-place procedures, and with the dup and collect macros offering the functional high-level API without its cost (see the v1.2.0 announcement: https://nim-lang.org/blog/2020/04/03/version-120-released.html)
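For reference, a rough sketch of what those two macros look like, assuming import sugar on Nim 1.2 or later (the values are just illustrative):

import sugar, algorithm

let a = @[3, 1, 2]
let b = a.dup(sort)             # dup copies a and sorts the copy, a stays untouched
doAssert b == @[1, 2, 3]

let squares = collect(newSeq):  # collect builds a seq directly from the loop
  for x in 1 .. 5: x * x        # squares == @[1, 4, 9, 16, 25]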
With -d:danger it improves a bit, but it is still much slower than the Python version:
Nim: 2.65s, Python: 1.26s
I just found out something people haven't mentioned yet (or did I miss it?). According to https://nim-lang.org/docs/system.html, echo is equivalent to a writeLine followed by a flushFile, so each call to echo flushes the output, which is not necessary here. After I replaced echo x in my code with writeLine(stdout, x), it helped quite a bit.
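For comparison, here is roughly what the loop from the question looks like after that one change (everything else unchanged):

import strutils

for ln in stdin.lines:
  # writeLine only appends to stdout's buffer; echo would also flush it
  # after every single line.
  stdout.writeLine(ln.split('\t')[4])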
Using the split iterator instead of assigning split()'s return value to a variable helps too.
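A small sketch of that variant, using the split iterator from strutils so no intermediate seq is built and the scan can stop after the fifth field (the index handling is just one way to do it):

import strutils

for ln in stdin.lines:
  var i = 0
  for field in ln.split('\t'):   # iterator form: yields one field at a time
    if i == 4:
      stdout.writeLine(field)    # fifth field found, stop scanning this line
      break
    inc i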
If you are not redirecting to a file but letting the output display in a terminal, then a big variable is which GUI terminal emulator you are using - XTerm, rxvt-unicode, st, etc. I think that could create a lot of variation in the numbers; it may even depend on whether you use TrueType fonts or fonts that are cheaper to render.
Not sure what the goals are here, but if you really want to figure out what's going on at the core Nim IO layer, then you want a clean separation of the sources of time spent (such as input/split IO versus output IO). E.g., forget about locking stdout buffers - just don't output at all. And on the output side, forget about reading input - just generate the data (like your generator does) and output it, and realize that if you are writing to a terminal then you are benchmarking how that terminal handles extremely high update rates.
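As a concrete illustration of that separation, a hedged sketch (the proc names and the 1M count are placeholders, not from this thread): one proc measures only reading plus split, the other measures only writing, so each side can be timed on its own.

import strutils

proc inputOnly() =
  # read + split only: no per-line output at all
  var total = 0
  for ln in stdin.lines:
    total += ln.split('\t')[4].len
  stdout.writeLine($total)         # a single write, so output cost stays negligible

proc outputOnly(n: int) =
  # output only: synthetic lines, no reading or splitting involved
  for i in 0 ..< n:
    stdout.writeLine("field" & $i)

when isMainModule:
  inputOnly()
  # outputOnly(1_000_000)          # run one of the two at a time when timing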