In a long-running application I periodically call (Linux) executables and collect their return code and output. Sometimes waitForExit raises an exception with the message "Invalid argument", but stdout is still collected.
I tried to reproduce the crash. The application below crashes after a few seconds. With fewer threads, it takes longer to crash.
Any tips? Did I find a bug?
Nim Compiler Version 2.0.0 [Linux: amd64]
Compiled at 2023-08-01
Copyright (c) 2006-2023 by Andreas Rumpf
git hash: a488067a4130f029000be4550a0fb1b39e0e9e7c
active boot switches: -d:release
sleep202308173453.nim
import os, strutils
let p = paramStr(1).parseInt
# echo paramStr(1)
sleep(p)
quit 0
waitForExiterror202308172930.nim
import osproc, random, os
randomize()
proc th(foo: bool) {.thread.} =
  while true:
    var ra = rand(1000)
    var rb = rand(1000)
    var rc = rand(100)
    let cmd = getAppDir() / "sleep202308173453 " & $ra
    var pr = startProcess(
      command = cmd,
      options = {poEvalCommand}
    )
    sleep(rc)
    try:
      let exitCode = pr.waitForExit(rb)
      var output = ""
      for line in pr.lines():
        output.add line & "\n"
    except:
      echo "CRASH"
      echo getCurrentExceptionMsg()
      echo "ra:", ra
      echo "rb:", rb
      echo "rc:", rc
      quit()
    pr.close()
var threads: array[127, Thread[bool]]
for idx in 0 .. threads.len-1:
  createThread(threads[idx], th, true)
while true:
  sleep(1000)
I'd guess you're hitting some kind of system limit. I have this status line script: https://pastebin.com/4Sz0wr21. Sometimes IO operations just fail, though in my case the error is less ambiguous: the exception message is "OS error: Bad file descriptor". It appears in the log file about once every 24 hours, but it can fail after a couple of seconds or minutes if I lower the sleep amount.
So I catch the exception and ignore it.
proc exec_cmd(c: Command) {.raises: [IOError, OSError], thread.} =
  while true:
    try:
      let
        (stdout, _) = execCmdEx(c.cmd)
        parsed = stdout.strip().split('\n', 1)
      c.display = parsed[0]
      if parsed.len > 1 and parsed[1].isColor(): c.color = Color(parsed[1])
    except IOError, OSError:
      dumplog($getTime() & " " & c.cmd & " Failed! MSG: " & getCurrentException().msg)
    sleep(c.interval)
In your case, you should probably restart the process when the exception occurs.
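Restarting on failure could look roughly like the sketch below. It is only an illustration: `runWithRetry` and its `attempts` parameter are hypothetical names, not something from my script, and it assumes the command is safe to run more than once.

```nim
import std/osproc

# Hypothetical helper (an assumption for illustration): run `cmd` and start it
# again if an IO/OS error is raised, giving up after `attempts` tries.
proc runWithRetry(cmd: string, attempts = 3): tuple[output: string, exitCode: int] =
  for i in 1 .. attempts:
    try:
      return execCmdEx(cmd)
    except IOError, OSError:
      # Re-raise the last failure so the caller still sees the error.
      if i == attempts:
        raise

when isMainModule:
  let (output, exitCode) = runWithRetry("echo hi")
  echo "exit code: ", exitCode
  echo output
```

This only hides the symptom, so it is worth logging each failed attempt as well.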
pr.peekExitCode does indeed return the correct exit code (if the process was not killed by a signal).
Then maybe I could even go without restarting the application.
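A minimal sketch of that idea, assuming the exception from waitForExit is an OSError: `exitCodeOrPeek` is a hypothetical helper name, and note that peekExitCode returns -1 while the process is still running.

```nim
import std/osproc

# Hypothetical helper (an assumption for illustration): wait for the process
# with a timeout, and if waitForExit raises, fall back to peekExitCode.
proc exitCodeOrPeek(p: Process, timeoutMs: int): int =
  try:
    result = p.waitForExit(timeoutMs)
  except OSError:
    # peekExitCode returns -1 if the process has not exited yet.
    result = p.peekExitCode()

when isMainModule:
  let p = startProcess("exit 3", options = {poEvalCommand})
  let code = exitCodeOrPeek(p, 5000)
  p.close()
  echo "exit code: ", code
```

Whether the fallback value can be trusted in every failure case is something I have not verified beyond my own test runs.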