I have this piece of code:
...
var filesize: int
try:
  filesize = cast[int](getFileSize(path))
except OSError:
  echo "Error when checking the filesize"
  return
...
But when I open a lot of files, I get this message and the program crashes:
Exception message: Too many open files
Exception type: [OSError]
How do I keep the program running?
Operating systems limit the number of open files per process to prevent denial of service attacks.
You need to either close files or raise the maximum number of open files, for example with ulimit on POSIX or by editing sysctl.conf.
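For example, in a Bash shell (a sketch; pick a value suited to your workload, noting that the soft limit can only be raised up to the hard limit without privileges):
terminal$ ulimit -n         # show the current limit on open files
terminal$ ulimit -n 4096    # raise it for this shell and its children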
Minor correction: the original reason for limiting open files was to trap leaks (missing closes), not to prevent DoS attacks { though, like many things, one man's buggy program is another man's DoS attack program :-) }.
@mrhdias very likely has just such a leak, which he may be able to track down manually by finding the point in his logic where he is not close()-ing when he should, using some defer near the open, etc., though @spip's idea is also elegant.
Also, if he does happen to just need many open files, then while sysctl.conf or shell ulimit are viable, they may also create undue "deployment sensitivity". It is also usually possible for a process to raise its own limits quite a lot (for stack memory and other things, too). In Nim this is provided as posix.setrlimit. So, @mrhdias could also raise those limits at the start of his program.
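For instance, here is a minimal sketch of raising the soft fd limit to the hard ceiling at startup (POSIX-only; the same posix module names appear in the programs further down):
import posix

var fdLim: RLimit
if getrlimit(RLIMIT_NOFILE, fdLim) == 0:
  fdLim.rlim_cur = fdLim.rlim_max   # soft limit up to the hard maximum
  discard setrlimit(RLIMIT_NOFILE, fdLim)
echo "fd soft limit now: ", fdLim.rlim_cur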
Another possibility, depending upon @mrhdias' needs, may be to use Nim's memfiles, which closes the file handle/descriptor right after mapping { unless allowRemap==true }. Default per-process limits on the number of file memory maps are usually much more generous, and he may (or may not) speed up his IO at the same time.
Since you are serving static files, these are read-only files. So just switching to memfiles.open should do it. On Linux the default value of /proc/sys/vm/max_map_count, the per-process limit on memory maps, is 65530.
Nim's memfiles module closes the descriptor as soon as it opens the map, as already mentioned. I do not know the limits for other OSes offhand, but they are probably similarly more generous than open "file handle/descriptor" limits. You will need to adapt the code that accesses the file data to use MemFile.mem, of course. E.g., a let cstr = cast[cstring](mf.mem) with care never to index past mf.size is not so hard.
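To illustrate, a minimal sketch (the path "static/index.html" is just a hypothetical example):
import memfiles

let path = "static/index.html"    # hypothetical example path
var mf = memfiles.open(path)      # maps the file; the fd is closed right away
let cstr = cast[cstring](mf.mem)  # raw byte view of the mapping
for i in 0 ..< mf.size:           # index strictly below mf.size
  stdout.write cstr[i]
mf.close()                        # just munmap; no fd left to leak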
My question is: how do I prevent the program from crashing when the error happens?
If the error happens, the program should continue to run and simply stop accepting new connections until the number of open files decreases.
In many programs an error occurs and raises an exception, but the program continues to run; it doesn't crash!
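The pattern I would expect, as a minimal sketch (safeFileSize is a hypothetical helper; getFileSize raises OSError on failure, which is absorbed so the caller keeps running):
import os

proc safeFileSize(path: string): int =
  try:
    result = int(getFileSize(path))
  except OSError as e:
    echo "error checking the filesize: ", e.msg
    result = -1   # sentinel; the caller decides how to degrade gracefully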
Strange, because if I hardcode the filesize it doesn't crash!
# let filesize = cast[int](getFileSize(path))
let filesize = 104
let file = openAsync(path, fmRead)
file.close()
So the problem has to do with the getFileSize function.

Well, ok. Your ulimit said 2048 fds, but Bash (I assume?) only reports hard limits. So you should do that cat /proc/$$/limits to see both. In this case, were someone to set their limit to 2048 and run your wrk example, they would not reproduce the failure.. it would just work. :) But there is another method to my madness, as the saying goes, for mentioning setrlimit.. one can maybe eliminate the need for wrk to reproduce and get better help.
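For example (the grep pattern matches the row name /proc actually uses):
terminal$ grep 'Max open files' /proc/$$/limits
shows both the soft and hard limit columns for the current shell.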
Here is a slightly simplified program that should fail the same way on any (Linux) machine with wrk -t 1 -c 1 http://localhost:8080/someFile or even just curl http://localhost:8080:
import asynchttpserver, asyncdispatch, asyncnet, os, strutils, posix, memfiles

var fdLim: RLimit
discard getrlimit(RLIMIT_NOFILE, fdLim)
fdLim.rlim_cur = 5 # fdLim.rlim_max; cap the soft limit to force fd exhaustion
echo "max-max fd = ", fdLim.rlim_max, " soft = ", fdLim.rlim_cur
discard setrlimit(RLIMIT_NOFILE, fdLim)

proc fileServer(req: Request, staticDir="") {.async.} =
  var url_path = if req.url.path.len > 1 and req.url.path[0] == '/': req.url.path[1 .. ^1] else: "index.html"
  var path = staticDir / url_path
  if dirExists(path):
    path = path / "index.html"
  var mf = memfiles.open(path) # raises if path missing/etc.
  await req.client.send(
    "HTTP/1.1 200\c\Lcontent-type: text/html\c\Lcontent-length: $1\c\L\c\L" % [
      $mf.size ])
  var off = 0
  while off < mf.size:                 # send the mapping in 8192-byte chunks
    let data = cast[pointer](cast[int](mf.mem) + off)
    await req.client.send(data, min(8192, mf.size - off))
    off += 8192
  mf.close # just munmap; fd was already closed

var server = newAsyncHttpServer()
proc cb(req: Request) {.async.} =
  try:
    await req.fileServer("static")
  except OSError:
    echo "Happen a Error!"
    await req.respond(Http200, "Happen a Error!")

waitFor server.serve(Port(8080), cb)
You just need to mkdir static; ln someFile static first. I reproduced your failure on a very recent nim-devel build from this morning. That limit of 5 comes from stdin, stdout, stderr, the epoll fd, and the listening socket, leaving zero fds for an accepted connection.
Also, I can confirm that strace shows this is not a file descriptor leak. It's just natural exhaustion from too many simultaneous clients. The Nim exit message even says [OSError]. So I don't know why your except OSError: catch fails only sometimes, in some kind of race-condition way. It also happens if I make it just except: without the OSError part (like your first posted program).
Meanwhile, if you change 5 to 6 it works (as in prints "Happen a Error!") with wrk -t 1 -c 1 URI for 1000s of connections; memfiles.open needs at least 1 more fd. With 7 it works with no error at all for 1000s of connections.
It may well be possible to create a simpler bug reproduction eliminating all the async/networking activity, but I probably cannot spend more time on this right now. You should probably file a github issue on this (maybe with my simpler setup: a limit of 5, where any connection at all skips the exception handler).
Actually, if it helps anyone, even this fails in the same way:
import asynchttpserver, asyncdispatch, asyncnet, os, strutils, posix

var fdLim: RLimit
discard getrlimit(RLIMIT_NOFILE, fdLim)
fdLim.rlim_cur = 5 # fdLim.rlim_max
echo "max-max fd = ", fdLim.rlim_max, " soft = ", fdLim.rlim_cur
discard setrlimit(RLIMIT_NOFILE, fdLim)

proc fileServer(req: Request, staticDir="") {.async.} =
  var url_path = if req.url.path.len > 1 and req.url.path[0] == '/': req.url.path[1 .. ^1] else: "index.html"
  var path = staticDir / url_path
  if dirExists(path): path = path / "index.html"
  await req.client.send(
    "HTTP/1.1 200\c\Lcontent-type: text/html\c\Lcontent-length: $1\c\L\c\L" % [
      "18" ])
  await req.client.send("that's all, folks\n")

var server = newAsyncHttpServer()
proc cb(req: Request) {.async.} =
  try:
    await req.fileServer("static")
  except:
    echo "Happen a Error!"
    await req.respond(Http200, "Happen a Error!")

waitFor server.serve(Port(8080), cb)
and then
terminal1$ ./bug4
terminal2$ curl http://localhost:8080/
terminal1 output:
max-max fd = 4096 soft = 5
<waiting; then after curl immediate program exit with:>
bug4.nim(26) bug4
/usr/lib/nim/lib/pure/asyncdispatch.nim(1934) waitFor
/usr/lib/nim/lib/pure/asyncdispatch.nim(1626) poll
/usr/lib/nim/lib/pure/asyncdispatch.nim(1367) runOnce
/usr/lib/nim/lib/pure/asyncdispatch.nim(208) processPendingCallbacks
/usr/lib/nim/lib/pure/asyncmacro.nim(22) serveNimAsyncContinue
/usr/lib/nim/lib/pure/asyncmacro.nim(139) serveIter
/usr/lib/nim/lib/pure/asyncfutures.nim(372) read
[[reraised from:
bug4.nim(26) bug4
/usr/lib/nim/lib/pure/asyncdispatch.nim(1936) waitFor
/usr/lib/nim/lib/pure/asyncfutures.nim(372) read
]]
Error: unhandled exception: Too many open files
Async traceback:
bug4.nim(26) bug4
/usr/lib/nim/lib/pure/asyncdispatch.nim(1934) waitFor
/usr/lib/nim/lib/pure/asyncdispatch.nim(1626) poll
/usr/lib/nim/lib/pure/asyncdispatch.nim(1367) runOnce
/usr/lib/nim/lib/pure/asyncdispatch.nim(208) processPendingCallbacks
/usr/lib/nim/lib/pure/asyncmacro.nim(22) serveNimAsyncContinue
/usr/lib/nim/lib/pure/asyncmacro.nim(139) serveIter
/usr/lib/nim/lib/pure/asyncfutures.nim(372) read
Exception message: Too many open files
Exception type: [OSError]
and it works (for 1 parallel connection) if you raise the fd limit to 6.
This makes me wonder if OS errors other than EMFILE also get mis-handled...
The PR is here: https://github.com/nim-lang/Nim/pull/15957
No idea if it's a good design, but at least it makes sense to me.